FOREWORD


In bygone centuries, our physical world appeared to be filled to the brim with mysteries. Divine powers
could provide for genuine miracles; water and sunlight could turn arid land into fertile pastures, but the 
same powers could lead to miseries and disasters. The force of life, the vis vitalis, was assumed to be the 
special agent responsible for all living things. The heavens, whatever they were for, contained stars and other 
heavenly bodies that were the exclusive domain of the Gods. 

Mathematics did exist, of course. Indeed, there was one aspect of our physical world that was recognised to 
be controlled by precise, mathematical logic: the geometric structure of space, elaborated to become a genuine 
form of art by the ancient Greeks. From my perspective, the Greeks were the first practitioners of ‘mathematical 
physics’, when they discovered that all geometric features of space could be reduced to a small number of 
axioms. Today, these would be called ‘fundamental laws of physics’. The fact that the flow of time could be 
addressed with similar exactitude, and that it could be handled geometrically together with space, was only 
recognised much later. And, yes, there were a few crazy people who were interested in the magic of numbers, 
but the real world around us seemed to contain so much more that was way beyond our capacities of analysis. 

Gradually, all this changed. The Moon and the planets appeared to follow geometrical laws. Galilei and 
Newton managed to identify their logical rules of motion, and by noting that the concept of mass could be 
applied to things in the sky just like apples and cannon balls on Earth, they made the sky a little bit more 
accessible to us. Electricity, magnetism, light and sound were also found to behave in complete accordance 
with mathematical equations. 

Yet all of this was just a beginning. The real changes came with the twentieth century. A completely new 
way of thinking, by emphasizing mathematical, logical analysis rather than empirical evidence, was pioneered 
by Albert Einstein. Applying advanced mathematical concepts, only known to a few pure mathematicians, to 
notions as mundane as space and time, was new to the physicists of his time. Einstein himself had a hard 
time struggling through the logic of connections and curvatures, notions that were totally new to him, but are 
only too familiar to students of mathematical physics today. Indeed, there is no better testimony of Einstein’s 
deep insights at that time, than the fact that we now teach these things regularly in our university classrooms. 

Special and general relativity are only small corners of the realm of modern physics that is presently being 
studied using advanced mathematical methods. We have notoriously complex subjects such as phase transitions in 
condensed matter physics, superconductivity, Bose-Einstein condensation, the quantum Hall effect, particularly 
the fractional quantum Hall effect, and numerous topics from elementary particle physics, ranging from fibre 
bundles and renormalization groups to supergravity, algebraic topology, superstring theory, Calabi-Yau spaces 
and what not, all of which require the utmost of our mental skills to comprehend them. 

The most bewildering observation that we make today is that it seems that our entire physical world 
appears to be controlled by mathematical equations, and these are not just sloppy and debatable models, but 
precisely documented properties of materials, of systems, and of phenomena in all echelons of our universe. 

Does this really apply to our entire world, or only to parts of it? Do features, notions, entities exist that are 
emphatically not mathematical? What about intuition, or dreams, and what about consciousness? What 
about religion? Here, most of us would say, one should not even try to apply mathematical analysis, although 
even here, some brave social scientists are making attempts at coordinating rational approaches. 

No, there are clear and important differences between the physical world and the mathematical world.
Where the physical world stands out is the fact that it refers to ‘reality’, whatever ‘reality’ is. Mathematics is 
the world of pure logic and pure reasoning. In physics, it is the experimental evidence that ultimately decides 
whether a theory is acceptable or not. Also, the methodology in physics is different. 

A beautiful example is the serendipitous discovery of superconductivity. In 1911, the Dutch physicist Heike 
Kamerlingh Onnes was the first to achieve the liquefaction of helium, for which a temperature below 4.25 K 
had to be realized. Heike decided to measure the specific conductivity of mercury, a metal that is frozen solid 
at such low temperatures. But something appeared to go wrong during the measurements, since the
voltmeter did not show any voltage at all. All experienced physicists in the team assumed that they were dealing
with a malfunction. It would not have been the first time for a short circuit to occur in the electrical 
equipment, but, this time, in spite of several efforts, they failed to locate it. One of the assistants was 
responsible for keeping the temperature of the sample well within that of liquid helium, a dull job, requiring 
nothing else than continuously watching some dials. During one of the many tests, however, he dozed off. 
The temperature rose, and suddenly the measurements showed the normal values again. It then occurred to 
the investigators that the effect and its temperature dependence were completely reproducible. Below
4.19 K the conductivity of mercury appeared to be strictly infinite. Above that temperature, it is
finite, and the transition is a very sudden one. Superconductivity was discovered (D. van Delft, “Heike
Kamerlingh Onnes”, Uitgeverij Bert Bakker, Amsterdam, 2005 (in Dutch)).

This is not the way mathematical discoveries are made. Theorems are not produced by assistants falling 
asleep, even if examples do exist of incidents involving some miraculous fortune. 

The hybrid science of mathematical physics is a very curious one. Some of the topics in this Encyclopedia 
are undoubtedly physical. High-Tc superconductivity, breaking water waves, and magneto-hydrodynamics
are definitely topics of physics where experimental data are considered more decisive than any high-brow
theory. Cohomology theory, Donaldson-Witten theory, and AdS/CFT correspondence, however, are examples
of purely mathematical exercises, even if these subjects, like all of the others in this compilation, are strongly 
inspired by, and related to, questions posed in physics. 

It is inevitable, in a compilation of a large number of short articles with many different authors, to see quite a 
bit of variation in style and level. In this Encyclopedia, theoretical physicists as well as mathematicians together 
made a huge effort to present in a concise and understandable manner their vision on numerous important 
issues in advanced mathematical physics. All include references for further reading. We hope and expect that 
these efforts will serve a good purpose. 


Gerard 't Hooft,
Spinoza Institute,
Utrecht University,
The Netherlands.


PREFACE 


Mathematical Physics as a distinct discipline is relatively new. The International Association of
Mathematical Physics was founded only in 1976. The interaction between physics and mathematics
has, of course, existed since ancient times, but the recent decades, perhaps partly because we are living 
through them, appear to have witnessed tremendous progress, yielding new results and insights at a dizzying 
pace, so much so that an encyclopedia seems now needed to collate the gathered knowledge. 

Mathematical Physics brings together the two great disciplines of Mathematics and Physics to the benefit of 
both, the relationship between them being symbiotic. On the one hand, it uses mathematics as a tool to 
organize physical ideas of increasing precision and complexity, and on the other it draws on the questions 
that physicists pose as a source of inspiration to mathematicians. A classical example of this relationship 
exists in Einstein’s theory of relativity, where differential geometry played an essential role in the formulation 
of the physical theory while the problems raised by the ensuing physics have in turn boosted the development 
of differential geometry. It is indeed a happy coincidence that we are now writing a preface to an
encyclopedia of mathematical physics in the centenary of Einstein’s annus mirabilis.

The project of putting together an encyclopedia of mathematical physics looked, and still looks, to us a 
formidable enterprise. We would never have had the courage to undertake such a task if we did not believe, 
first, that it is worthwhile and of benefit to the community, and second, that we would get the much-needed 
support from our colleagues. And this support we did get, in the form of advice, encouragement, and 
practical help too, from members of our Editorial Advisory Board, from our authors, and from others as well, 
who have given unstintingly so much of their time to help us shape this Encyclopedia. 

Mathematical Physics being a relatively new subject, it is not yet clearly delineated and could mean 
different things to different people. In our choice of topics, we were guided in part by the programs of recent 
International Congresses on Mathematical Physics, but mainly by the advice from our Editorial Advisory 
Board and from our authors. The limitations of space and time, as well as our own limitations, necessitated 
the omission of certain topics, but we have tried to include all that we believe to be core subjects and to cover 
as much as possible the most active areas. 

Our subject being interdisciplinary, we think it appropriate that the Encyclopedia should have certain 
special features. Applications of the same mathematical theory, for instance, to different problems in physics 
will have different emphasis and treatment. By the same token, the same problem in physics can draw upon 
resources from different mathematical fields. This is why we divide the Encyclopedia into two broad sections: 
physics subjects and related mathematical subjects. Articles in either section are deliberately allowed a fair 
amount of overlap with one another and many articles will appear under more than one heading, but all are 
linked together by elaborate cross referencing. We think this gives a better picture of the subject as a whole 
and will serve better a community of researchers from widely scattered yet related fields. 

The Encyclopedia is intended primarily for experienced researchers but should be of use also to beginning 
graduate students. For the latter category of readers, we have included eight elementary introductory articles for easy 
reference, with those on mathematics aimed at physics graduates and those on physics aimed at mathematics 
graduates, so that these articles can serve as their first port of call to enable them to embark on any of the main 
articles without the need to consult other material beforehand. In fact, we think these articles may even form the 
foundation of advanced undergraduate courses, as we know that some authors have already made such use of them.

In addition to the printed version, an on-line version of the Encyclopedia is planned, which will allow both 
the contents and the articles themselves to be updated if and when the occasion arises. This is probably a 
necessary provision in such a rapidly advancing field. 

This project was some four years in the making. Our foremost thanks at its completion go to the members 
of our Editorial Advisory Board, who have advised, helped and encouraged us all along, and to all our 
authors who have so generously devoted so much of their time to writing these articles and given us much 
useful advice as well. We ourselves have learnt a lot from these colleagues, and made some wonderful 
contacts with some among them. Special thanks are due also to Arthur Greenspoon whose technical expertise 
was indispensable. 

The project was started with Academic Press, which was later taken over by Elsevier. We thank warmly 
members of their staff who have made this transition admirably seamless and gone on to assist us greatly in 
our task: both Carey Chapman and Anne Guillaume, who were in charge of the whole project and have been 
with us since the beginning, and Edward Taylor responsible for the copy-editing. And Martin Ruck, who 
manages to keep an overwhelming amount of details constantly at his fingertips, and who is never known to 
have lost a single email, deserves a very special mention. 

As a postscript, we would like to express our gratitude to the very large number of authors who generously 
agreed to donate their honorariums to support the Committee for Developing Countries of the European 
Mathematical Society in their work to help our less fortunate colleagues in the developing world. 


Jean-Pierre Françoise
Gregory L. Naber 
Tsou Sheung Tsun 


GUIDE TO USE OF THE ENCYCLOPEDIA 


Structure of the Encyclopedia 


The material in this Encyclopedia is organised into two sections. At the start of Volume 1 are eight Introductory Articles. 
The introductory articles on mathematics are aimed at physics graduates; those on physics are aimed at mathematics 
graduates. It is intended that these articles should serve as the first port of call for graduate students, to enable them to 
embark on any of the main entries without the need to consult other material beforehand. 

Following the Introductory Articles, the main body of the Encyclopedia is arranged as a series of entries in alphabetical 
order. These entries fill the remainder of Volume 1 and all of the subsequent volumes (2-5). 

To help you realize the full potential of the material in the Encyclopedia we have provided four features to help you find 
the topic of your choice: a contents list by subject, an alphabetical contents list, cross-references, and a full subject index. 


1. Contents List by Subject 


Your first point of reference will probably be the contents list by subject. This list appears at the front of each volume, 
and groups the entries under subject headings describing the broad themes of mathematical physics. This will enable the 
reader to make quick connections between entries and to locate the entry of interest. The contents list by subject is divided 
into two main sections: Physics Subjects and Related Mathematics Subjects. Under each main section heading, you will 
find several subject areas (such as GENERAL RELATIVITY in Physics Subjects or NONCOMMUTATIVE GEOMETRY 
in Related Mathematics Subjects). Under each subject area is a list of those entries that cover aspects of that subject, 
together with the volume and page numbers on which these entries may be found. 

Because mathematical physics is so highly interconnected, individual entries may appear under more than one subject 
area. For example, the entry GAUGE THEORY: MATHEMATICAL APPLICATIONS is listed under the Physics Subject 
GAUGE THEORY as well as in a broad range of Related Mathematics Subjects. 


2. Alphabetical Contents List 


The alphabetical contents list, which also appears at the front of each volume, lists the entries in the order in which they 
appear in the Encyclopedia. This list provides both the volume number and the page number of the entry. 

You will find “dummy entries” where obvious synonyms exist for entries or where we have grouped together related 
topics. Dummy entries appear in both the contents list and the body of the text. 


Example 
If you were attempting to locate material on path integral methods via the alphabetical contents list: 


PATH INTEGRAL METHODS see Functional Integration in Quantum Physics; Feynman Path Integrals 


The dummy entry directs you to two other entries in which path integral methods are covered. At the appropriate 
locations in the contents list, the volume and page numbers for these entries are given. 

If you were trying to locate the material by browsing through the text and you had looked up Path Integral Methods, 
then the following information would be provided in the dummy entry: 


Path Integral Methods see Functional Integration in Quantum Physics; Feynman Path Integrals 


3. Cross-References


All of the articles in the Encyclopedia have been extensively cross-referenced. The cross-references, which appear at the 
end of an entry, serve three different functions: 


i. To indicate if a topic is discussed in greater detail elsewhere.
ii. To draw the reader’s attention to parallel discussions in other entries.
iii. To indicate material that broadens the discussion.


Example 
The following list of cross-references appears at the end of the entry STOCHASTIC HYDRODYNAMICS 


See also: Cauchy Problem for Burgers-Type Equations; Hamiltonian 
Fluid Dynamics; Incompressible Euler Equations: Mathematical Theory; 
Malliavin Calculus; Non-Newtonian Fluids; Partial Differential Equations: 
Some Examples; Stochastic Differential Equations; Turbulence Theories; 
Viscous Incompressible Fluids: Mathematical Theory; Vortex Dynamics 


Here you will find examples of all three functions of the cross-reference list: a topic discussed in greater detail elsewhere 
(e.g. Incompressible Euler Equations: Mathematical Theory), parallel discussion in other entries (e.g. Stochastic
Differential Equations) and reference to entries that broaden the discussion (e.g. Turbulence Theories).

The eight Introductory Articles are not cross-referenced from any of the main entries, as it is expected that introductory 
articles will be of general interest. As mentioned above, the Introductory Articles may be found at the start of Volume 1. 


4. Index 


The index will provide you with the volume and page number where the material is located. The index entries 
differentiate between material that is a whole entry, is part of an entry, or is data presented in a figure or table. Detailed 
notes are provided on the opening page of the index. 

A full list of contributors appears at the beginning of each volume.


Introduction


Deformation quantization is an alternative way 
of looking at quantum mechanics. Some of its 
techniques were introduced by the pioneers of 
quantum mechanics, but it was first proposed as 
an autonomous theory in a paper in Annals of 
Physics (Bayen et al. 1978). More recent reviews 
treat modern developments (HH I 2001, Dito and 
Sternheimer 2002, Zachos 2002). 

Deformation quantization concentrates on the central
physical concepts of quantum theory: the algebra
of observables and their dynamical evolution. Because
it deals exclusively with functions of phase-space
variables, its conceptual break with classical mechanics
is less severe than in other approaches. It gives a very
precise formulation of the correspondence principle, which
played such an important role in the historical development.

Although this article deals mainly with nonrelativistic
bosonic systems, deformation quantization is
much more general. For inclusion of fermions and
the Dirac equation, see Hirshfeld et al. (2002b). The
fermionic degrees of freedom may, in special cases, be
obtained from the bosonic ones by supersymmetric
extension (Hirshfeld et al. 2004). For applications to
field theory, see Hirshfeld et al. (2002). For the
relation to Hopf algebras, see Hirshfeld et al. (2003),
and to geometric algebra, see Hirshfeld et al. (2005).

The observables of a physical system, such as the
Hamilton function, are smooth real-valued functions
on phase space. Physical quantities of the system at
some time, such as the energy, are calculated by
evaluating the Hamilton function at the point
$x_0 = (q_0, p_0)$ in phase space that characterizes the
state of the system at this time (we assume, for the
moment, a one-particle system). The mathematical
expression for this operation is


$$E = \int H(q, p)\,\delta^{(2)}(q - q_0,\, p - p_0)\,dq\,dp$$ [1]


where $\delta^{(2)}$ is the two-dimensional Dirac delta
function. The observables of the dynamical system
are functions on the phase space, the states of the
system are positive functionals on the observables
(here the Dirac delta functions), and we obtain the
value of the observable in a definite state by the
operation shown in eqn [1].

In general, functions on a manifold are multiplied 
by each other in a pointwise manner, that is, given 
two functions f and g, their product fg is the 
function 


(fg) (x) = f (x)g(x) [2] 


In the context of classical mechanics, the observables
form a commutative algebra, called the
“classical algebra of observables.”

In Hamiltonian mechanics there is another way to 
combine two functions on phase space in such a way 
that the result is again a function on the phase space, 
namely by using the Poisson bracket 


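$$\{f, g\}(q, p) = \sum_{i=1}^{n} \left.\left( \frac{\partial f}{\partial q_i}\frac{\partial g}{\partial p_i} - \frac{\partial f}{\partial p_i}\frac{\partial g}{\partial q_i} \right)\right|_{q,p} = f\left( \overleftarrow{\partial}_q \overrightarrow{\partial}_p - \overleftarrow{\partial}_p \overrightarrow{\partial}_q \right) g$$ [3]


in an abbreviated notation.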

The notation can be further abbreviated by using $x$
to represent points of the phase-space manifold,
$x = (x_1, \ldots, x_{2n})$, and introducing the Poisson tensor
$\alpha^{ij}$, where the indices $i, j$ run from 1 to $2n$. In
canonical coordinates $\alpha^{ij}$ is represented by the matrix


$$\alpha = \begin{pmatrix} 0 & I_n \\ -I_n & 0 \end{pmatrix}$$ [4]


where $I_n$ is the $n \times n$ identity matrix. Then eqn [3]
becomes


$$\{f, g\}(x) = \alpha^{ij}\,\partial_i f(x)\,\partial_j g(x)$$ [5]


where $\partial_i = \partial/\partial x_i$.
For a general observable,


$$\dot{f} = \{f, H\}$$ [6]




Because $\alpha^{ij}$ transforms like a tensor with respect
to coordinate transformations, eqn [5] may also be
written in noncanonical coordinates. In this case
the components of $\alpha^{ij}$ need not be constants, and
may depend on the point of the manifold at which
they are evaluated. But in Hamiltonian mechanics,
$\alpha^{ij}$ is still required to be invertible. A manifold
equipped with a Poisson tensor of this kind is
called a symplectic manifold. In general, the tensor
$\alpha^{ij}$ is no longer required to be invertible, but it
nevertheless suffices to define Poisson brackets via
eqn [5], and these brackets are required to have
the properties


1. $\{f, g\} = -\{g, f\}$,
2. $\{f, gh\} = \{f, g\}h + g\{f, h\}$, and
3. $\{f, \{g, h\}\} + \{g, \{h, f\}\} + \{h, \{f, g\}\} = 0$.


Property (1) implies that the Poisson bracket is
antisymmetric, property (2) is referred to as the Leibniz
rule, and property (3) is called the Jacobi identity. The
Poisson bracket used in Hamiltonian mechanics satisfies
all these properties, but we now abstract these
properties from the concrete prescription of eqn [3], and
a Poisson manifold $(M, \alpha)$ is defined as a smooth
manifold $M$ equipped with a Poisson tensor $\alpha$, whose
components are no longer necessarily constant, such
that the bracket defined by eqn [5] has the above
properties. It turns out that such manifolds provide a
better context for treating dynamical systems with
symmetries. In fact, they are essential for treating
gauge-field theories, which govern the fundamental
interactions of elementary particles.
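
The three bracket properties can be checked mechanically for the canonical bracket of one degree of freedom. The following is a minimal sketch, assuming polynomial observables stored as dictionaries {(a, b): c} that stand for c q^a p^b; the helper names (add, mul, dq, dp, poisson) are illustrative choices made here and do not come from the article.

```python
# Polynomials in (q, p) are dicts {(a, b): coeff}, meaning coeff * q**a * p**b.
# All helper names are hypothetical; this is an illustration only.

def clean(poly):
    """Drop terms whose coefficient is zero."""
    return {k: c for k, c in poly.items() if c != 0}

def add(f, g, sign=1):
    """Return f + sign * g."""
    out = dict(f)
    for k, c in g.items():
        out[k] = out.get(k, 0) + sign * c
    return clean(out)

def mul(f, g):
    """Pointwise product fg."""
    out = {}
    for (a, b), c in f.items():
        for (d, e), k in g.items():
            key = (a + d, b + e)
            out[key] = out.get(key, 0) + c * k
    return clean(out)

def dq(f):
    """Partial derivative with respect to q."""
    return clean({(a - 1, b): a * c for (a, b), c in f.items() if a > 0})

def dp(f):
    """Partial derivative with respect to p."""
    return clean({(a, b - 1): b * c for (a, b), c in f.items() if b > 0})

def poisson(f, g):
    """Canonical bracket {f, g} = (df/dq)(dg/dp) - (df/dp)(dg/dq)."""
    return add(mul(dq(f), dp(g)), mul(dp(f), dq(g)), sign=-1)

q = {(1, 0): 1}
p = {(0, 1): 1}
f = add(mul(q, q), mul(q, p))            # q^2 + q p
g = add(mul(p, p), {(3, 0): 2})          # p^2 + 2 q^3
h = add(mul(mul(q, p), p), {(0, 0): 5})  # q p^2 + 5

# Each of these is the zero polynomial (an empty dict) if the property holds.
antisymmetry = add(poisson(f, g), poisson(g, f))
leibniz = add(poisson(f, mul(g, h)),
              add(mul(poisson(f, g), h), mul(g, poisson(f, h))), sign=-1)
jacobi = add(add(poisson(f, poisson(g, h)), poisson(g, poisson(h, f))),
             poisson(h, poisson(f, g)))
```

The check uses exact integer arithmetic, so an empty dictionary means the identity holds exactly for these sample polynomials; it is a spot check, not a proof.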


Quantum Mechanics and Star Products 


The essential difference between classical and 
quantum mechanics is Heisenberg’s uncertainty 
relation, which implies that in the latter, states can 
no longer be represented as points in phase space. 
The uncertainty is a consequence of the noncommu- 
tativity of the quantum mechanical observables. 
That is, the commutative classical algebra of 
observables must be replaced by a noncommutative 
quantum algebra of observables. 

In the conventional approach to quantum 
mechanics, this noncommutativity is implemented 
by representing the quantum mechanical observables 
by linear operators in Hilbert space. Physical 
quantities are then represented by eigenvalues of 
these operators, and physical states are related to the 
operator eigenfunctions. Although these entities are 
somehow related to their classical counterparts, to 
which they are supposed to reduce in an appropriate 
limit, the precise relationship has remained obscure,
one hundred years after the beginnings of quantum
mechanics. Textbooks refer to the correspondence
principle, which guided the pioneers of the subject.
Attempts to give this idea a precise formulation by
postulating a specific relation between the classical
Poisson brackets of observables and the commutators
of the corresponding quantum mechanical
operators, as undertaken, for example, by Dirac and
von Neumann, encountered insurmountable difficulties,
as pointed out by Groenewold in 1946 in an
unjustly neglected paper (Groenewold 1948). In the
same paper Groenewold also wrote down the first
explicit representation of a “star product” (see eqn
[11]), without however realizing the potential of this
concept for overcoming the difficulties that he
wanted to resolve.

In the deformation quantization approach, there 
is no such break when going from the classical 
system to the corresponding quantum system; we 
describe the quantum system by using the same 
entities that are used to describe the classical 
system. The observables of the system are described 
by the same functions on phase space as their 
classical counterparts. Uncertainty is realized by 
describing physical states as distributions on phase 
space that are not sharply localized, in contrast to 
the Dirac delta functions which occur in the 
classical case. When we evaluate an observable in 
some definite state according to the quantum 
analog of eqn [1] (see eqn [24]), values of the 
observable in a whole region contribute to the 
number that is obtained, which is thus an average 
value of the observable in the given state. Non- 
commutativity is incorporated by introducing a 
noncommutative product for functions on phase 
space, so that we get a new noncommutative 
quantum algebra of observables. The systematic 
work on deformation quantization stems from 
Gerstenhaber's seminal paper, where he introduced 
the concept of a star product of smooth functions 
on a manifold (Gerstenhaber 1964). 

For applications to quantum mechanics, we 
consider smooth complex-valued functions on a 
Poisson manifold. A star product f * g of two such 
functions is a new smooth function, which, in 
general, is described by an infinite power series: 


$$f * g = fg + (i\hbar)\,C_1(f, g) + O(\hbar^2) = \sum_{n=0}^{\infty} (i\hbar)^n\, C_n(f, g)$$ [7]


The first term in the series is the pointwise product
given in eqn [2], and $(i\hbar)$ is the deformation
parameter, which is assumed to be varying continuously.
If $\hbar$ is identified with Planck's constant,
then what varies is really the magnitude of the
action of the dynamical system considered in units
of $\hbar$: the classical limit holds for systems with large
action. In this limit, which we express here as $\hbar \to 0$,
the star product reduces to the usual product. In
general, the coefficients $C_n$ will be such that the new
product is noncommutative, and we consider the
noncommutative algebra formed from the functions
with this new multiplication law as a deformation of
the original commutative algebra, which uses pointwise
multiplication of the functions.

The expressions $C_n(f, g)$ denote functions made
up of the derivatives of the functions $f$ and $g$. It is
obvious that without further restrictions of these
coefficients, the star product is too arbitrary to be of
any use. Gerstenhaber's discovery was that the
simple requirement that the new product be associative
imposes such strong requirements on the
coefficients $C_n$ that they are essentially unique in
the most important cases (up to an equivalence
relation, as discussed below). Formally, Gerstenhaber
required that the coefficients satisfy the following
properties:


1. $\sum_{j+k=n} C_j(C_k(f, g), h) = \sum_{j+k=n} C_j(f, C_k(g, h))$,
2. $C_0(f, g) = fg$, and
3. $C_1(f, g) - C_1(g, f) = \{f, g\}$.


Property (1) guarantees that the star product is
associative: $(f * g) * h = f * (g * h)$. Property (2) means
that in the limit $\hbar \to 0$, the star product $f * g$ agrees
with the pointwise product $fg$. Property (3) has at least
two aspects: (i) mathematically, it anchors the new
product to the given structure of the Poisson manifold
and (ii) physically, it provides the connection between
the classical and quantum behavior of the dynamical
system. Define a commutator by using the new
product:


$$[f, g]_* = f * g - g * f$$ [8]


Property (3) may then be written as


$$\lim_{\hbar \to 0} \frac{1}{i\hbar}\,[f, g]_* = \{f, g\}$$ [9]

Equation [9] is the correct form of the correspondence
principle. In general, the quantity on the left-hand
side of eqn [9] reduces to the Poisson bracket
only in the classical limit. The source of the
mathematical difficulties that previous attempts to
formulate the correspondence principle encountered
was related to trying to enforce equality
between the Poisson bracket and the corresponding
expression involving the quantum mechanical commutator.
Equation [9] shows that such a relation in
general only holds up to corrections of higher order
in $\hbar$.
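
As a brief check of how property (3) leads to eqn [9], one can insert the series of eqn [7] into the commutator of eqn [8]. The lines below are a sketch of that intermediate step, written out here for convenience; they use only the coefficients $C_0$ and $C_1$ and the properties (2) and (3) listed above.

```latex
\begin{aligned}
[f, g]_* &= f * g - g * f \\
         &= \bigl(fg - gf\bigr)
            + i\hbar\,\bigl(C_1(f, g) - C_1(g, f)\bigr) + O(\hbar^2) \\
         &= i\hbar\,\{f, g\} + O(\hbar^2),
\qquad\text{so that}\qquad
\frac{1}{i\hbar}\,[f, g]_* = \{f, g\} + O(\hbar).
\end{aligned}
```

The zeroth-order term cancels because the pointwise product is commutative, which is why the Poisson bracket appears as the leading term and the corrections start at order $\hbar$.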



For physical applications we usually require the
star product to be Hermitean: $\overline{f * g} = \bar{g} * \bar{f}$, where $\bar{f}$
denotes the complex conjugate of $f$. The star
products considered in this article have this
property.

For a given Poisson manifold, it is not clear a
priori if a star product for the smooth functions on
the manifold actually exists, that is, whether it is at
all possible to find coefficients $C_n$ that satisfy the
above list of properties. Even if we find such
coefficients, it is still not clear that the series they
define through eqn [7] yields a smooth function.
Mathematicians have worked hard to answer these
questions in the general case. For flat Euclidean
spaces, $M = \mathbb{R}^{2n}$, a specific star product has long
been known. In this case, the components of the
Poisson tensor $\alpha^{ij}$ can be taken to be constants. The
coefficient $C_1$ can then be chosen antisymmetric,
so that


$$C_1(f, g) = \tfrac{1}{2}\,\alpha^{ij}\,(\partial_i f)(\partial_j g) = \tfrac{1}{2}\,\{f, g\}$$ [10]


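by property (3) above. The higher-order coefficients
may be obtained by exponentiation of $C_1$. This
procedure yields the Moyal star product (Moyal
1949):


$$f *_M g = f \exp\left( \frac{i\hbar}{2}\,\alpha^{ij}\,\overleftarrow{\partial}_i \overrightarrow{\partial}_j \right) g$$ [11]


In canonical coordinates, eqn [11] becomes


$$(f *_M g)(q, p) = f(q, p)\,\exp\left( \frac{i\hbar}{2}\left( \overleftarrow{\partial}_q \overrightarrow{\partial}_p - \overleftarrow{\partial}_p \overrightarrow{\partial}_q \right) \right) g(q, p)$$ [12]


$$= \sum_{m,n=0}^{\infty} \left( \frac{i\hbar}{2} \right)^{m+n} \frac{(-1)^m}{m!\,n!}\,\left( \partial_p^m \partial_q^n f \right)\left( \partial_p^n \partial_q^m g \right)$$ [13]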

We now come to the question of uniqueness of the
star product on a given Poisson manifold. Two star
products $*$ and $*'$ are said to be “c-equivalent” if
there exists an invertible transition operator


$$T = 1 + \hbar T_1 + \cdots = \sum_{n=0}^{\infty} \hbar^n T_n$$ [14]


where the $T_n$ are differential operators that satisfy


$$f *' g = T^{-1}\bigl( (Tf) * (Tg) \bigr)$$ [15]


It is known that for $M = \mathbb{R}^{2n}$ all admissible star
products are c-equivalent to the Moyal product. The
concept of c-equivalence is a mathematical one
(c stands for cohomology (Gerstenhaber 1964)); it
does not by itself imply any kind of physical
equivalence, as shown below.

Another expression for the Moyal product is a
kind of Fourier representation:


$$(f *_M g)(q, p) = \frac{1}{\hbar^2 \pi^2} \int dq_1\,dq_2\,dp_1\,dp_2\; f(q_1, p_1)\,g(q_2, p_2)\,\exp\left[ -\frac{2i}{\hbar}\Bigl( p(q_1 - q_2) + q(p_2 - p_1) + (q_2 p_1 - q_1 p_2) \Bigr) \right]$$ [16]


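Equation [16] has an interesting geometrical interpretation.
Denote points in phase space by vectors,
for example, in two dimensions:


$$r = \begin{pmatrix} q \\ p \end{pmatrix}, \quad r_1 = \begin{pmatrix} q_1 \\ p_1 \end{pmatrix}, \quad r_2 = \begin{pmatrix} q_2 \\ p_2 \end{pmatrix}$$ [17]


Now, consider the triangle in phase space spanned
by the vectors $r - r_1$ and $r - r_2$. Its area (symplectic
volume) is


$$A(r, r_1, r_2) = \tfrac{1}{2}\,(r - r_1) \wedge (r - r_2) = \tfrac{1}{2}\bigl( p(q_2 - q_1) + q(p_1 - p_2) + (q_1 p_2 - q_2 p_1) \bigr)$$ [18]


which is proportional to the exponent in eqn [16].
Hence, we may rewrite eqn [16] as


$$(f * g)(r) = \frac{1}{\hbar^2 \pi^2} \int dr_1\,dr_2\; f(r_1)\,g(r_2)\,\exp\left[ \frac{4i}{\hbar}\,A(r, r_1, r_2) \right]$$ [19]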
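
The algebra behind the triangle interpretation is easy to check numerically. The following is a small sketch, assuming phase-space points are given as (q, p) tuples with exact rational coordinates; the function names are illustrative and the wedge product is taken as a ∧ b = a_q b_p - a_p b_q.

```python
from fractions import Fraction

def wedge(a, b):
    """Two-dimensional wedge product a ^ b = a_q * b_p - a_p * b_q."""
    return a[0] * b[1] - a[1] * b[0]

def area(r, r1, r2):
    """Signed area of the triangle spanned by r - r1 and r - r2."""
    d1 = (r[0] - r1[0], r[1] - r1[1])
    d2 = (r[0] - r2[0], r[1] - r2[1])
    return Fraction(1, 2) * wedge(d1, d2)

def expanded_area(r, r1, r2):
    """The same area written out in coordinates, as in eqn [18]."""
    (q, p), (q1, p1), (q2, p2) = r, r1, r2
    return Fraction(1, 2) * (p * (q2 - q1) + q * (p1 - p2) + (q1 * p2 - q2 * p1))

def fourier_bracket(r, r1, r2):
    """The combination p(q1 - q2) + q(p2 - p1) + (q2 p1 - q1 p2)."""
    (q, p), (q1, p1), (q2, p2) = r, r1, r2
    return p * (q1 - q2) + q * (p2 - p1) + (q2 * p1 - q1 * p2)

r, r1, r2 = (1, 2), (3, 5), (-2, 4)
a = area(r, r1, r2)
```

For any choice of points, the combination computed by fourier_bracket equals minus twice the signed area, which is the proportionality used to pass from the Fourier form of the product to the triangle form.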
The properties of the star product are well adapted
for describing the noncommutative quantum algebra 
of observables. We have already discussed the 
associativity and the incorporation of the classical 
and semiclassical limits. Note that the characteristic 
nonlocality feature of quantum mechanics is also 
explicit. In the expression for the Moyal product 
given in eqn [13], the star product of the functions f 
and g at the point $x = (q, p)$ involves not only the
values of the functions f and g at this point, but also 
all higher derivatives of these functions at x. But for 
a smooth function, knowledge of all the derivatives 
at a given point is equivalent to the knowledge of 
the function on the entire space. In the integral 
expression of eqn [16], we also see that knowledge 
of the functions f and g on the whole phase space is 
necessary to determine the value of the star product 
at the point x. 


The c-equivalent star products correspond to differ- 
ent quantization schemes. Having chosen a quantiza- 
tion scheme, the quantities of interest for the quantum 
system may be calculated. It turns out that different 
quantization schemes lead to different spectra for the 
observables. The choice of a specific quantization 
scheme can only be motivated by further physical 
requirements. In the simple example we discuss below, 
the classical system is completely specified by its 
Hamilton function. In more general cases, one may 
have to decide what constitutes a sufficiently large set 
of good observables for a complete specification of the 
system (Bayen et al. 1978). 

A state is characterized by its energy E; the set 
of all possible values for the energy is called the 
spectrum of the system. The states are described 
by distributions on phase space called projectors. 
The state corresponding to the energy E is 
denoted by $\pi_E(q,p)$. These distributions are
normalized:


\frac{1}{2\pi\hbar} \int \pi_E(q,p)\, dq\, dp = 1    [20]


and idempotent:


(\pi_E * \pi_{E'})(q,p) = \delta_{E,E'}\, \pi_E(q,p)    [21]


The fact that the Hamilton function takes the value
E when the system is in the state corresponding to
this energy is expressed by the equation


(H * \pi_E)(q,p) = E\, \pi_E(q,p)    [22]


Equation [22] corresponds to the time-independent
Schrödinger equation, and is sometimes called the
"*-genvalue equation." The spectral decomposition
of the Hamilton function is given by


H(q,p) = \sum_E E\, \pi_E(q,p)    [23]


where the summation sign may indicate an integration
if the spectrum is continuous. The quantum
mechanical version of eqn [1] is


E = \frac{1}{2\pi\hbar} \int (H * \pi_E)(q,p)\, dq\, dp
  = \frac{1}{2\pi\hbar} \int H(q,p)\, \pi_E(q,p)\, dq\, dp    [24]


where the last expression may be obtained by using 
eqn [16] for the star product. 

The time-evolution function for a time-indepen- 
dent Hamilton function is denoted by Exp(Ht), and 
the fact that the Hamilton function is the generator 
of the time evolution of the system is expressed by 


i\hbar \frac{d}{dt} \mathrm{Exp}(Ht) = H * \mathrm{Exp}(Ht)    [25]


This equation corresponds to the time-dependent
Schrödinger equation. It is solved by the star
exponential:


\mathrm{Exp}(Ht) = \sum_{n=0}^{\infty} \frac{1}{n!}\left(\frac{-it}{\hbar}\right)^n (H*)^n    [26]


where $(H*)^n = H * H * \cdots * H$ ($n$ times). Because each state
of definite energy E has a time evolution $\exp(-iEt/\hbar)$,
the complete time-evolution function may be written
in the form


\mathrm{Exp}(Ht) = \sum_E \pi_E\, e^{-iEt/\hbar}    [27]


This expression is called the "Fourier-Dirichlet
expansion" for the time-evolution function.

Questions concerning the existence and uniqueness
of the star exponential as a $C^\infty$ function and the
nature of the spectrum and the projectors again 
require careful mathematical analysis. The problem 
of finding general conditions on the Hamilton 
function H which ensure a reasonable physical 
spectrum is analogous to the problem of showing, 
in the conventional approach, that the symmetric 
operator H is self-adjoint and finding its spectral 
projections. 


The Simple Harmonic Oscillator 


As an example of the above procedure, we treat the 
simple one-dimensional harmonic oscillator charac- 
terized by the classical Hamilton function 


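H(q,p) = \frac{p^2}{2m} + \frac{m\omega^2}{2} q^2    [28]


In terms of the holomorphic variables


a = \sqrt{\frac{m\omega}{2}}\left(q + i\frac{p}{m\omega}\right), \quad
\bar{a} = \sqrt{\frac{m\omega}{2}}\left(q - i\frac{p}{m\omega}\right)    [29]


the Hamilton function becomes


H = \omega a \bar{a}    [30]


Our aim is to calculate the time-evolution function.
We first choose a quantization scheme characterized
by the normal star product


f *_N g = f\, e^{\hbar\, \overleftarrow{\partial_a}\, \overrightarrow{\partial_{\bar{a}}}}\, g    [31]


We then have


\bar{a} *_N a = a\bar{a}, \quad a *_N \bar{a} = a\bar{a} + \hbar    [32]


so that


[a, \bar{a}]_{*_N} = \hbar    [33]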


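Equation [25] for this case is


i\hbar \frac{d}{dt} \mathrm{Exp}_N(Ht) = \left(H + \hbar\omega\, \bar{a}\, \partial_{\bar{a}}\right) \mathrm{Exp}_N(Ht)    [34]


with the solution


\mathrm{Exp}_N(Ht) = e^{-a\bar{a}/\hbar} \exp\left(e^{-i\omega t} a\bar{a}/\hbar\right)    [35]


By expanding the last exponential in eqn [35], we
obtain the Fourier-Dirichlet expansion


\mathrm{Exp}_N(Ht) = e^{-a\bar{a}/\hbar} \sum_{n=0}^{\infty} \frac{1}{\hbar^n n!}\, \bar{a}^n a^n\, e^{-in\omega t}    [36]


From here, we can read off the energy eigenvalues
and the projectors describing the states by comparing
coefficients in eqns [27] and [36]:


\pi_0^{(N)} = e^{-a\bar{a}/\hbar}    [37]

\pi_n^{(N)} = \frac{1}{\hbar^n n!}\, \pi_0^{(N)}\, \bar{a}^n a^n = \frac{1}{\hbar^n n!}\, \bar{a}^n *_N \pi_0^{(N)} *_N a^n    [38]

E_n = n\hbar\omega    [39]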


Note that the spectrum obtained in eqn [39] does
not include the zero-point energy. The projector
onto the ground state $\pi_0^{(N)}$ satisfies


a *_N \pi_0^{(N)} = 0    [40]


The spectral decomposition of the Hamilton function
(eqn [23]) is in this case


H = \sum_{n=0}^{\infty} n\hbar\omega \left(\frac{1}{\hbar^n n!}\, e^{-a\bar{a}/\hbar}\, \bar{a}^n a^n\right) = \omega a \bar{a}    [41]
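The sum in eqn [41] is a Poisson-type series in $a\bar{a}/\hbar$ and can be checked numerically at any phase-space point. The following sketch is illustrative only; the function name and the sample values of ħ, ω and x = aā are arbitrary assumptions. As a by-product it also shows that the projectors of eqns [37] and [38] add up to 1.

```python
import math

def normal_projector_values(x, hbar, n_max):
    """Values pi_n^(N) = (x/hbar)**n / n! * exp(-x/hbar) for n = 0..n_max, with x = a*abar."""
    u = x / hbar
    values = [math.exp(-u)]
    for n in range(1, n_max + 1):
        values.append(values[-1] * u / n)   # recurrence avoids large factorials
    return values

hbar, omega, x = 0.7, 1.3, 2.1              # arbitrary illustrative numbers
projectors = normal_projector_values(x, hbar, n_max=80)

completeness = sum(projectors)                                               # should be 1
energy_sum = sum(n * hbar * omega * value for n, value in enumerate(projectors))

print(math.isclose(completeness, 1.0, rel_tol=1e-12))
print(math.isclose(energy_sum, omega * x, rel_tol=1e-12))   # eqn [41]: the sum equals omega*a*abar
```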


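We now consider the Moyal quantization scheme.
If we write eqn [12] in terms of holomorphic
coordinates, we obtain


f *_M g = f \exp\left[\frac{\hbar}{2}\left(\overleftarrow{\partial_a}\,\overrightarrow{\partial_{\bar{a}}} - \overleftarrow{\partial_{\bar{a}}}\,\overrightarrow{\partial_a}\right)\right] g    [42]


Here, we have


a *_M \bar{a} = a\bar{a} + \frac{\hbar}{2}, \quad \bar{a} *_M a = a\bar{a} - \frac{\hbar}{2}    [43]


and again


[a, \bar{a}]_{*_M} = \hbar    [44]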


The value of the commutator of two phase-space 
variables is fixed by property (3) of the star product, 
and cannot change when one goes to a c-equivalent
star product. The Moyal star product is c-equivalent to
the normal star product with the transition operator


T = e^{-\frac{\hbar}{2}\, \partial_a \partial_{\bar{a}}}    [45]


We can use this operator to transform the normal
product version of the *-genvalue equation, eqn [22],
into the corresponding Moyal product version
according to eqn [15]. The result is


H *_M \pi_n^{(M)} = \omega\left(\bar{a} *_M a + \frac{\hbar}{2}\right) *_M \pi_n^{(M)} = \hbar\omega\left(n + \frac{1}{2}\right) \pi_n^{(M)}    [46]


with


\pi_0^{(M)} = T\pi_0^{(N)} = 2 e^{-2a\bar{a}/\hbar}    [47]

\pi_n^{(M)} = T\pi_n^{(N)} = \frac{1}{\hbar^n n!}\, \bar{a}^n *_M \pi_0^{(M)} *_M a^n    [48]


The projector onto the ground state $\pi_0^{(M)}$ satisfies


a *_M \pi_0^{(M)} = 0    [49]


We now have, for the spectrum,


E_n = \left(n + \frac{1}{2}\right) \hbar\omega    [50]


which is the textbook result. We conclude that for 
this problem, the Moyal quantization scheme is the 
correct one. 
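As a consistency check of eqns [46] and [50] for n = 0, one can apply the Moyal product of eqn [42] directly to the Gaussian of eqn [47]. The short computation below is a sketch added for illustration. It uses only $H = \omega a\bar{a}$; because H is quadratic, the expansion of the exponential stops at second order, and the first-order term vanishes by symmetry.

```latex
\begin{aligned}
F &:= \pi_0^{(M)} = 2e^{-2a\bar a/\hbar}, \qquad
\partial_a F = -\tfrac{2\bar a}{\hbar}F, \quad
\partial_{\bar a} F = -\tfrac{2a}{\hbar}F, \quad
\partial_a\partial_{\bar a} F = \left(-\tfrac{2}{\hbar} + \tfrac{4a\bar a}{\hbar^2}\right)F, \\
H *_M F &= \omega a\bar a\,F
 + \tfrac{\hbar}{2}\left(\partial_a H\,\partial_{\bar a}F - \partial_{\bar a}H\,\partial_a F\right)
 + \tfrac{\hbar^2}{8}\left(-2\,\partial_a\partial_{\bar a}H\;\partial_a\partial_{\bar a}F\right) \\
&= \omega a\bar a\,F + 0 - \tfrac{\hbar^2\omega}{4}\left(-\tfrac{2}{\hbar} + \tfrac{4a\bar a}{\hbar^2}\right)F
 = \tfrac{1}{2}\hbar\omega\,F .
\end{aligned}
```

The ground-state projector is therefore a *-eigenfunction of H with eigenvalue ħω/2, in agreement with eqn [50].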

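The use of the Moyal product in eqn [25] for the
star exponential of the harmonic oscillator leads to
the following differential equation for the time-evolution
function:


i\hbar \frac{d}{dt}\mathrm{Exp}_M(Ht) = \left(H - \frac{(\hbar\omega)^2}{4}\partial_H - \frac{(\hbar\omega)^2}{4} H \partial_H^2\right)\mathrm{Exp}_M(Ht)    [51]


The solution is


\mathrm{Exp}_M(Ht) = \frac{1}{\cos(\omega t/2)} \exp\left[\frac{2H}{i\hbar\omega}\tan\frac{\omega t}{2}\right]    [52]


This expression can be brought into the form of the
Fourier-Dirichlet expansion of eqn [27] by using
the generating function for the Laguerre polynomials:


\frac{1}{1+s}\exp\left[\frac{zs}{1+s}\right] = \sum_{n=0}^{\infty} s^n (-1)^n L_n(z)    [53]


with $s = e^{-i\omega t}$. The projectors then become


\pi_n^{(M)} = 2(-1)^n e^{-2H/\hbar\omega} L_n\left(\frac{4H}{\hbar\omega}\right)    [54]


which is equivalent to the expression already found
in eqn [48].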
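The step from the closed form of eqn [52] to the projectors of eqn [54] can be spelled out as follows. This is a sketch of the intermediate algebra, added for illustration, with $s = e^{-i\omega t}$ and $z = 4H/\hbar\omega$ in the generating function of eqn [53].

```latex
\begin{aligned}
\frac{1}{\cos(\omega t/2)} &= \frac{2e^{-i\omega t/2}}{1 + e^{-i\omega t}} = \frac{2 s^{1/2}}{1+s}, \qquad
\frac{1}{i}\tan\frac{\omega t}{2} = -\frac{1-s}{1+s} = -1 + \frac{2s}{1+s}, \\
\mathrm{Exp}_M(Ht) &= 2 s^{1/2}\, e^{-2H/\hbar\omega}\,\frac{1}{1+s}\exp\left[\frac{4H}{\hbar\omega}\,\frac{s}{1+s}\right]
 = \sum_{n=0}^{\infty} 2(-1)^n e^{-2H/\hbar\omega} L_n\!\left(\frac{4H}{\hbar\omega}\right) e^{-i\left(n+\frac{1}{2}\right)\omega t}.
\end{aligned}
```

Comparing with eqn [27] gives both the projectors of eqn [54] and the energies $E_n = (n + \frac{1}{2})\hbar\omega$ of eqn [50].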


Conventional Quantization 


One usually finds the observables characterizing
some quantum mechanical system by starting from
the corresponding classical system and then, either
by guessing or by using some more or less systematic
method, finding the corresponding representations
of the classical quantities in the quantum
system. The guiding principle is the correspondence
principle: the quantum mechanical relations are
supposed to reduce somehow to the classical
relations in an appropriate limit. Early attempts to
systematize this procedure involved finding an
assignment rule $\Theta$ that associates to each phase-space
function f a linear operator in Hilbert space,
$f \mapsto \Theta(f)$, in such a way that in the limit $\hbar \to 0$ the
quantum mechanical equations of motion go over to
the classical equations. Such an assignment cannot
be unique, because even though an operator that is a
function of the basic operators $\hat{Q}$ and $\hat{P}$ reduces to a
unique phase-space function in the limit $\hbar \to 0$,
there are many ways to assign an operator to a given
phase-space function, due to the different orderings
of the operators $\hat{Q}$ and $\hat{P}$ that all reduce to the
original phase-space function. Different ordering
procedures correspond to different quantization
schemes. It turns out that there is no quantization
scheme for systems with observables that depend on
the coordinates or the momenta to a higher power
than quadratic which leads to a correspondence
between the quantum mechanical and the classical
equations of motion, and which simultaneously
strictly maintains the Dirac-von Neumann requirement
that $(1/i\hbar)[\hat{f}, \hat{g}] \leftrightarrow \{f, g\}$. Only within the
framework of deformation quantization does the
correspondence principle acquire a precise meaning.

A general scheme for associating phase-space 
functions and Hilbert space operators, which 
includes all of the usual orderings, is given as 
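follows: the operator $\Theta_\lambda(f)$ corresponding to a
given phase-space function f is


\Theta_\lambda(f) = \int \tilde{f}(\xi,\eta)\, e^{-i(\xi\hat{Q} + \eta\hat{P})}\, e^{\lambda(\xi,\eta)}\, d\xi\, d\eta    [55]


where $\tilde{f}$ is the Fourier transform of f, and $(\hat{Q}, \hat{P})$ are the
Schrödinger operators that correspond to the phase-space
variables (q, p); $\lambda(\xi,\eta)$ is a quadratic form:


\lambda(\xi,\eta) = \frac{\hbar}{4}\left(\alpha\eta^2 + \beta\xi^2 + 2i\gamma\xi\eta\right)    [56]


Different choices for the constants $(\alpha, \beta, \gamma)$ yield
different operator ordering schemes.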


The relation between operator algebras and star 
products is given by 


\Theta(f)\,\Theta(g) = \Theta(f * g)    [57]


where $\Theta$ is a linear assignment of the kind discussed
above. Different assignments, which correspond to
different operator orderings, correspond to c-equivalent
star products. Equation [57] demonstrates that the quantum
mechanical algebra of observables is a representa- 
tion of the star product algebra. Because in the 
algebraic approach to quantum theory all the 
information concerning the quantum system may 
be extracted from the algebra of observables, 
specifying the star product completely determines 
the quantum system. 

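The inverse procedure of finding the phase-space
function that corresponds to a given operator $\hat{f}$ is,
for the special case of Weyl ordering, given by


f(q,p) = \int \langle q + \tfrac{1}{2}\xi |\, \hat{f}\, | q - \tfrac{1}{2}\xi \rangle\, e^{-i\xi p/\hbar}\, d\xi    [58]


When using holomorphic coordinates, it is convenient
to work with the coherent states


\hat{a}|a\rangle = a|a\rangle, \quad \langle\bar{a}|\hat{a}^\dagger = \langle\bar{a}|\bar{a}    [59]


These states are related to the energy eigenstates of
the harmonic oscillator


|n\rangle = \frac{1}{\sqrt{\hbar^n n!}}\, (\hat{a}^\dagger)^n |0\rangle    [60]


by


|a\rangle = e^{-a\bar{a}/2\hbar} \sum_{n=0}^{\infty} \frac{a^n}{\sqrt{\hbar^n n!}}\, |n\rangle, \quad
\langle\bar{a}| = e^{-a\bar{a}/2\hbar} \sum_{n=0}^{\infty} \frac{\bar{a}^n}{\sqrt{\hbar^n n!}}\, \langle n|    [61]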


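In normal ordering, we obtain the phase-space function
$f(a, \bar{a})$ corresponding to the operator $\hat{f}$ by just taking
the matrix element between coherent states:


f(a,\bar{a}) = \langle\bar{a}|\, \hat{f}(\hat{a}, \hat{a}^\dagger)\, |a\rangle    [62]


For holomorphic coordinates, it is easy to show


\pi_n^{(N)}(a,\bar{a}) = \langle\bar{a}|n\rangle\langle n|a\rangle = \frac{1}{\hbar^n n!}\, (a\bar{a})^n e^{-a\bar{a}/\hbar}    [63]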


in agreement with eqn [38] for the normal star 
product projectors. 

The star exponential Exp(Ht) and the projectors 
$\pi_n$ are the phase-space representations of the time-evolution
operator $\exp(-i\hat{H}t/\hbar)$ and the projection
operators $\hat{\rho}_n = |n\rangle\langle n|$, respectively. Weyl ordering
corresponds to the use of the Moyal star product for
quantization and normal ordering to the use of the
normal star product. In the density matrix formalism,
we say that the projection operator is that of a
pure state, which is characterized by the property of
being idempotent: $\hat{\rho}_n^2 = \hat{\rho}_n$ (compare eqn [21]). The
integral of the projector over the momentum gives
the probability distribution in position space:


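\frac{1}{2\pi\hbar}\int \pi_n^{(M)}(q,p)\, dp
  = \frac{1}{2\pi\hbar}\int \langle q + \xi/2 | n\rangle\langle n | q - \xi/2\rangle\, e^{-i\xi p/\hbar}\, d\xi\, dp
  = \langle q|n\rangle\langle n|q\rangle = |\psi_n(q)|^2    [64]


and the integral over the position gives the probability
distribution in momentum space:


\frac{1}{2\pi\hbar}\int \pi_n^{(M)}(q,p)\, dq = \langle p|n\rangle\langle n|p\rangle = |\tilde{\psi}_n(p)|^2    [65]


The normalization is


\frac{1}{2\pi\hbar}\int \pi_n^{(M)}(q,p)\, dq\, dp = 1    [66]


which is the same as eqn [20]. Applying these
relations to the ground-state projector of the
harmonic oscillator, eqn [47], shows that this is a
minimum-uncertainty state. In the classical limit
$\hbar \to 0$, it goes to a Dirac $\delta$-function. The expectation
value of the Hamiltonian operator is


E_n = \frac{1}{2\pi\hbar}\int (H *_M \pi_n^{(M)})(q,p)\, dq\, dp = \int \langle q|\hat{H}\hat{\rho}_n|q\rangle\, dq = \mathrm{tr}(\hat{H}\hat{\rho}_n)    [67]


which should be compared to eqn [24].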
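The marginal and normalization relations [64] and [66] can be checked numerically for the ground-state projector of eqn [47]. The sketch below is an illustration under stated assumptions: units with m = ω = ħ = 1, a simple midpoint rule, and a cut-off box that are all choices made here, and the familiar Gaussian $|\psi_0(q)|^2 = e^{-q^2}/\sqrt{\pi}$ as the reference ground-state density in these units.

```python
import math

def ground_projector(q, p, m=1.0, omega=1.0, hbar=1.0):
    """Moyal ground-state projector 2*exp(-2H/(hbar*omega)) of eqn [47], written in (q, p)."""
    energy = p * p / (2.0 * m) + 0.5 * m * omega ** 2 * q * q
    return 2.0 * math.exp(-2.0 * energy / (hbar * omega))

def midpoint(func, lo, hi, steps=400):
    """Composite midpoint rule; adequate for a rapidly decaying smooth integrand."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (k + 0.5) * width) for k in range(steps))

HBAR = 1.0
BOX = 8.0   # integration cut-off; the Gaussian is negligible beyond it

def position_density(q):
    """Left-hand side of eqn [64] for n = 0: momentum integral of the projector."""
    return midpoint(lambda p: ground_projector(q, p), -BOX, BOX) / (2.0 * math.pi * HBAR)

norm = midpoint(position_density, -BOX, BOX)                   # eqn [66]
expected_density = math.exp(-0.7 ** 2) / math.sqrt(math.pi)    # |psi_0(0.7)|^2 in these units

print(math.isclose(norm, 1.0, rel_tol=1e-9))
print(math.isclose(position_density(0.7), expected_density, rel_tol=1e-9))
```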


Quantum Field Theory 


A real scalar field is given in terms of the coefficients 


$a(k)$, $\bar{a}(k)$ by


\phi(x) = \int \frac{d^3k}{(2\pi)^{3/2}\sqrt{2\omega_k}} \left[a(k)\,e^{-ikx} + \bar{a}(k)\,e^{ikx}\right]    [68]


where $\hbar\omega_k = \sqrt{\hbar^2 k^2 + m^2}$ is the energy of a single
quantum of the field. The corresponding quantum
field operator is


\Phi(x) = \int \frac{d^3k}{(2\pi)^{3/2}\sqrt{2\omega_k}} \left[\hat{a}(k)\,e^{-ikx} + \hat{a}^\dagger(k)\,e^{ikx}\right]    [69]


where $\hat{a}(k)$, $\hat{a}^\dagger(k)$ are the annihilation and creation
operators for a quantum of the field with momentum
$\hbar k$. The Hamiltonian is


H = \int d^3k\, \hbar\omega_k\, \hat{a}^\dagger(k)\,\hat{a}(k)    [70]


$N(k) = \hat{a}^\dagger(k)\hat{a}(k)$ is interpreted as the number operator,
and eqn [70] is then just the generalization of
eqn [39], the expression for the energy of the harmonic 
oscillator in the normal ordering scheme, for an infinite 
number of degrees of freedom. Had we chosen the 
Weyl-ordering scheme, it would have resulted in (by 
the generalization of eqn [50]) an infinite vacuum 
energy. Hence, requiring the vacuum energy to vanish 
implies the choice of the normal ordering scheme in 
free field theory. In the framework of deformation 
quantization, this requirement leads to the choice of 
the normal star product for treating free scalar fields: 
only for this choice is the star product well defined. 

Currently, in realistic physical field theories 
involving interacting relativistic fields we are limited 
to perturbative calculations. The objects of interest 
are products of the fields. The analog of the Moyal 
product of eqn [11] for systems with an infinite 
number of degrees of freedom is 


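\phi(x_1) * \phi(x_2) * \cdots * \phi(x_n)
  = \exp\left[\frac{1}{2}\sum_{i<j}\int d^4x\, d^4y\, \frac{\delta}{\delta\phi_i(x)}\, \Delta(x-y)\, \frac{\delta}{\delta\phi_j(y)}\right]
    \times \phi_1(x_1), \ldots, \phi_n(x_n)\Big|_{\phi_i = \phi}    [71]


where the expressions $\delta/\delta\phi(x)$ indicate functional
derivatives. Here, we have used the antisymmetric
Schwinger function:


\Delta(x - y) = [\Phi(x), \Phi(y)]    [72]


The Schwinger function is uniquely determined by
relativistic invariance and causality from the equal-time
commutator


[\Phi(x), \dot{\Phi}(y)]\big|_{x^0 = y^0} = i\hbar\, \delta^3(x - y)    [73]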


which is the characterization of the canonical 
structure in the field theoretic framework. 

The Moyal product is, however, not the suitable 
star product to use in this context. In relativistic 
quantum field theory, it is necessary to incorporate 
causality in the form advocated by Feynman: 
positive frequencies propagate forward in time, 
whereas negative frequencies propagate backwards 
in time. This property is achieved by using the 
Feynman propagator: 


\Delta_F(x) = \begin{cases} \Delta^+(x) & \text{for } x^0 > 0 \\ -\Delta^-(x) & \text{for } x^0 < 0 \end{cases}    [74]


where $\Delta^+(x)$, $\Delta^-(x)$ are the propagators for the
positive and negative frequency components of the
field, respectively. In operator language


\Delta_F(x - y) = T(\Phi(x)\Phi(y)) - N(\Phi(x)\Phi(y))    [75]


where T indicates the time-ordered product of the
fields and N the normal-ordered product. Because the
second term in eqn [75] is a normal-ordered product
with vanishing vacuum expectation value, the Feynman
propagator may be simply characterized as the
vacuum expectation value of the time-ordered product
of the fields. The antisymmetric part of the positive
frequency propagator is the Schwinger function:


\Delta^+(x) - \Delta^+(-x) = \Delta^+(x) + \Delta^-(x) = \Delta(x)    [76]


The fact that going over to a c-equivalent product 
leaves the antisymmetric part of the differential 
operator in the exponent of eqn [71] invariant suggests
that the use of the positive frequency propagator
instead of the Schwinger function merely involves the
passage to a c-equivalent star product. This is indeed
easy to verify. The time-ordered product of the
operators is obtained by replacing the Schwinger
function $\Delta(x - y)$ in eqn [72] by the c-equivalent
positive frequency propagator $\Delta^+(x - y)$, restricting
the time integration to $x^0 > y^0$, as in eqn [74], and
symmetrizing the integral in the variables x and y,
which brings in the negative frequency propagator
$\Delta^-(x - y)$ for times $x^0 < y^0$. Then eqn [71] becomes
Wick's theorem, which is the basic tool of relativistic
perturbation theory. In operator language


T(\Phi(x_1), \ldots, \Phi(x_n))
  = \exp\left[\frac{1}{2}\sum_{i \neq j}\int d^4x\, d^4y\, \frac{\delta}{\delta\Phi_i(x)}\, \Delta_F(x-y)\, \frac{\delta}{\delta\Phi_j(y)}\right]
    \times N(\Phi(x_1), \ldots, \Phi(x_n))    [77]


Another interesting relation between deformation 
quantization and quantum field theory has been 
uncovered by studies of the Poisson-Sigma model. 
This model involves a set of scalar fields $X^i$ which map a
two-dimensional manifold $\Sigma_2$ onto a Poisson space M,
as well as generalized gauge fields $A_i$, which are 1-forms
on $\Sigma_2$ mapping to 1-forms on M. The action is given by


S_{PS} = \int_{\Sigma_2} \left(A_i\, dX^i + \alpha^{ij} A_i A_j\right)    [78]


where $\alpha^{ij}$ is the Poisson structure of M. A remarkable
formula was found (Cattaneo and Felder 2000):


(f * g)(x) = \int DX\, DA\, f(X(1))\, g(X(2))\, e^{iS_{PS}/\hbar}    [79]


where f, g are functions on M, $*$ is Kontsevich's star
product (Kontsevich 1997), and the functional integration
is over all fields X that satisfy the boundary
condition $X(\infty) = x$. Here $\Sigma_2$ is taken to be a disk in $\mathbb{R}^2$;
1, 2, and $\infty$ are three points on its circumference. By
expanding the functional integral in eqn [79] according
to the usual rules of perturbation theory, one finds that
the coefficients of the powers of $\hbar$ reproduce the graphs
and weights that characterize Kontsevich's star product.
For the case in which the Poisson tensor is
invertible, we can perform the Gaussian integration in
eqn [79] involving the fields $A_i$. The result is


(f * g)(x) \approx \int DX\, f(X(1))\, g(X(2)) \exp\left[\frac{i}{\hbar}\int \Omega_{ij}\, dX^i\, dX^j\right]    [80]


Equation [80] is formally similar to eqn [16] for the
Moyal product, to which the Kontsevich product
reduces in the symplectic case. Here $\Omega_{ij} = (\alpha^{ij})^{-1}$ is the
symplectic 2-form, and $\int \Omega_{ij}\, dX^i\, dX^j$ is the symplectic
volume of the manifold M. To make this relationship
exact, one must integrate out the gauge degrees of
freedom in the functional integral in eqn [79]. Since the
Poisson-sigma model represents a topological field
theory, there remains only a finite-dimensional integral,
which coincides with the integral in eqn [80].


See also: Deformations of the Poisson Bracket on a 
Symplectic Manifold; Deformation Quantization and 
Representation Theory; Deformation Theory; Fedosov 
Quantization; Noncommutative Geometry from Strings; 
Operads; Quantum Field Theory: A Brief Introduction; 
Schrödinger Operators.
The Quantization Problem


Though quantum theory for the classical phase space
$\mathbb{R}^{2n}$ is well established by means of what is usually
called canonical quantization, physics demands that one go
beyond $\mathbb{R}^{2n}$. On the one hand, systems with constraints
lead by phase-space reduction to classical phase spaces
different from $\mathbb{R}^{2n}$; in general one ends up with a
symplectic or even Poisson manifold. Thus, one needs
to quantize geometrically nontrivial phase spaces. On
the other hand, field theories and thermodynamical
systems require passing from $\mathbb{R}^{2n}$ to infinitely many
degrees of freedom, where one faces additional
analytical difficulties. Both types of difficulties combine
for gauge field theories and gravity, whence it is clear
that quantization is still one of the most important
issues in mathematical physics.

One possibility (among many others) is to use the 
structural similarity between the classical and 
quantum observable algebras. In both cases the 
observables constitute a complex *-algebra: in the
classical case it is commutative with the additional 
structure of a Poisson bracket, whereas in the 
quantum case the algebra is noncommutative. In 
deformation quantization, one tries to pass from the 
classical observables to the quantum observables by 
a deformation of the algebraic structures. 


From Canonical Quantization to Star 
Products 


Let us briefly recall canonical quantization and the 
ordering problem. In order to "quantize" classical
observables like the polynomials on $\mathbb{R}^{2n}$, to $q^k$ and $p_l$
one assigns the operators


q^k \mapsto \varrho(q^k) = Q^k = \left(q \mapsto q^k \psi(q)\right)    [1]


p_l \mapsto \varrho(p_l) = P_l = \left(q \mapsto -i\hbar \frac{\partial\psi}{\partial q^l}(q)\right)    [2]


for $k, l = 1, \ldots, n$, defined on a suitable domain in
$L^2(\mathbb{R}^n, d^n q)$. For simplicity, we choose $C_0^\infty(\mathbb{R}^n)$ as
domain. The well-known ordering problem is
encountered if one wants to also quantize higher
polynomials. One convenient (although not the only)
possibility is Weyl's total symmetrization rule, that is,
for a monomial like $q^2 p$ we take the quantization


\varrho_{Weyl}(q^2 p) = \frac{1}{3}\left(Q^2 P + QPQ + PQ^2\right) = -i\hbar q^2 \frac{\partial}{\partial q} - i\hbar q    [3]
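The equality in eqn [3] can be checked by letting both sides act on a polynomial wave function. The following is a minimal sketch, not part of the original article: wave functions are stored as coefficient lists, the operator names `Q` and `P` follow eqns [1] and [2] for n = 1, and the helper `combine` as well as the numerical value of ħ are illustrative assumptions.

```python
HBAR = 1.0   # numerical value used only for the check

def Q(psi):
    """Position operator: multiply the polynomial psi(q) = c0 + c1*q + ... by q."""
    return [0j] + list(psi)

def P(psi):
    """Momentum operator -i*hbar d/dq acting on a coefficient list [c0, c1, ...]."""
    return [-1j * HBAR * k * c for k, c in enumerate(psi)][1:]

def combine(*terms):
    """Sum of (weight, polynomial) pairs, padding all polynomials to a common length."""
    length = max(len(poly) for _, poly in terms)
    total = [0j] * length
    for weight, poly in terms:
        for k, c in enumerate(poly):
            total[k] += weight * c
    return total

def weyl_q2p(psi):
    """Totally symmetrized quantization of q^2 p: the middle expression of eqn [3]."""
    return combine((1 / 3, Q(Q(P(psi)))), (1 / 3, Q(P(Q(psi)))), (1 / 3, P(Q(Q(psi)))))

def explicit_q2p(psi):
    """Right-hand side of eqn [3]: -i*hbar*q^2 d/dq - i*hbar*q."""
    return combine((1, Q(Q(P(psi)))), (-1j * HBAR, Q(psi)))

psi = [1, 2, 0, -1]          # psi(q) = 1 + 2q - q^3, an arbitrary test polynomial
difference = combine((1, weyl_q2p(psi)), (-1, explicit_q2p(psi)))
print(max(abs(c) for c in difference) < 1e-12)   # both sides agree up to rounding
```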
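This can be written in the more explicit form:


\varrho_{Weyl}(f) = \sum_{r=0}^{\infty} \frac{1}{r!}\left(\frac{\hbar}{i}\right)^r
  \frac{\partial^r (Nf)}{\partial p_{i_1}\cdots\partial p_{i_r}}\bigg|_{p=0}
  \frac{\partial^r}{\partial q^{i_1}\cdots\partial q^{i_r}}    [4]


with


N = \exp\left(\frac{\hbar}{2i}\Delta\right) \quad \text{and} \quad \Delta = \frac{\partial^2}{\partial p_k\, \partial q^k}


Using [4] one can easily extend $\varrho_{Weyl}$ to all functions
$f \in C^\infty(\mathbb{R}^{2n})$ which are polynomial in the momentum
variables only and have an arbitrary smooth dependence
on the position variables. This Poisson subalgebra
of $C^\infty(\mathbb{R}^{2n})$ certainly covers all classical
observables of physical interest. Denoting these observables
by $\mathrm{Pol}(T^*\mathbb{R}^n)$, one obtains a linear isomorphism


\varrho_{Weyl} : \mathrm{Pol}(T^*\mathbb{R}^n) \to \mathrm{DiffOp}(\mathbb{R}^n)    [5]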


into the differential operators with smooth coeffi- 
cients, called Weyl symbol calculus. Other orderings 
would result in a different linear isomorphism like 
[5], for example, the standard ordering is obtained 
by simply omitting the operator N in [4]. 

Using [5], one can pull back the operator product
of $\mathrm{DiffOp}(\mathbb{R}^n)$ to obtain a new product $\star_{Weyl}$ for
$\mathrm{Pol}(T^*\mathbb{R}^n)$, that is


f \star_{Weyl} g = \varrho_{Weyl}^{-1}\left(\varrho_{Weyl}(f)\, \varrho_{Weyl}(g)\right)    [6]


which is called the Weyl-Moyal star product.
Explicitly, one has


f \star_{Weyl} g = \mu \circ \exp\left[\frac{i\hbar}{2}\left(\frac{\partial}{\partial q^k}\otimes\frac{\partial}{\partial p_k} - \frac{\partial}{\partial p_k}\otimes\frac{\partial}{\partial q^k}\right)\right](f \otimes g)    [7]


where $\mu(f \otimes g) = fg$ is the commutative product.
Clearly, for $f, g \in \mathrm{Pol}(T^*\mathbb{R}^n)$ the exponential series
terminates after finitely many terms. If one now
wants to extend further to all smooth functions,
then [7] is only a formal power series in $\hbar$. Since on
a manifold one does not have a priori a nice
distinguished class of functions like $\mathrm{Pol}(T^*\mathbb{R}^n)$, one
indeed has to generalize in this direction if a
geometric framework is desired. This observation
and the simple fact that $\star_{Weyl}$ satisfies all the
following properties lead to the definition of a
formal star product by Bayen et al. (1978):


Definition 1 A formal star product on a Poisson
manifold $(M, \pi)$ is an associative $\mathbb{C}[[\lambda]]$-bilinear
product


f \star g = \sum_{r=0}^{\infty} \lambda^r C_r(f, g)    [8]


for $f, g \in C^\infty(M)[[\lambda]]$ such that


1. $C_0(f,g) = fg$ and $C_1(f,g) - C_1(g,f) = i\{f,g\}$,
2. $1 \star f = f = f \star 1$, and
3. $C_r$ is a bidifferential operator.


If in addition $\overline{f \star g} = \bar{g} \star \bar{f}$, then $\star$ is called
Hermitian.


Clearly, $\star_{Weyl}$ defines a Hermitian star product for
$\mathbb{R}^{2n}$. The first condition is called the correspondence
principle in deformation quantization and the formal
parameter $\lambda$ corresponds to Planck's constant
$\hbar$ once a convergence scheme is established.

If $S = \mathrm{id} + \sum_{r=1}^{\infty} \lambda^r S_r$ is a formal series of differential
operators with $S_r 1 = 0$ for $r \geq 1$, then it is easy
to see that


f \star' g = S^{-1}(Sf \star Sg)    [9]


defines again a star product which is Hermitian if $\star$ is
Hermitian and if in addition $S\bar{f} = \overline{Sf}$. In particular, the
operator N, as before, serves for the transition from
$\star_{Weyl}$ to the standard-ordered star product $\star_{Std}$ obtained
in the same way from the standard-ordered quantization.
Thus, [9] can be seen as the abstract notion of changing 
the ordering prescription, even if no operator repre- 
sentation has been specified. Two star products related 
by such an equivalence transformation are called 
equivalent and *-equivalent in the Hermitian case. 

One main advantage of formal deformation 
quantization is that one has very strong existence 
and classification results: 


Theorem 2 On every Poisson manifold there exists 
a star product. 


The above theorem was first shown by deWilde 
and Lecomte (1983) for the symplectic case and 
independently by Fedosov (1985) and Omori,
Maeda, and Yoshioka (1991). In 1997, Kontsevich 
was able to prove the general Poisson case by 
showing his profound formality theorem. The full 
classification of star products up to equivalence was 
first obtained for the symplectic case by Nest and 
Tsygan (1995) and independently by Deligne
(1995), Bertelson, Cahen, and Gutt (1997), and
Weinstein and Xu (1997). The general Poisson case
again follows from Kontsevich's formality. In
particular, in the symplectic case, star products are
classified by their characteristic class


c : \star \mapsto c(\star) \in \frac{[\omega]}{i\lambda} + H^2_{deRham}(M, \mathbb{C})[[\lambda]]    [10]


In conclusion, one can state that for the price of
formal power series in $\hbar$ one obtains in formal
deformation quantization a very general and well-understood
picture of the observable algebra for the
quantum version of any classical system described
kinematically by a Poisson manifold. It turns out
that already in this framework one can discuss
dynamics as well by use of a Heisenberg equation
formulated with $\star$. Moreover, the quantization of
symmetries described by Hamiltonian Lie group or 
Lie algebra actions has been extensively studied. 

For a physical theory of quantization, however, 
there are still at least two ingredients missing. On 
the one hand, one has to overcome the formal 
power series expansion in $\hbar$. This problem is, in
principle, on the same footing as any perturbative 
approach to quantum theory and thus no easy 
answer can be expected to hold in general. In 
particular examples, however, such as the Weyl- 
Moyal star product, it can easily be solved. These 
issues together with the corresponding questions 
about a spectral calculus are best studied in the 
framework of Rieffel's strict deformation quantiza- 
tion based on a more C*-algebraic formulation of 
the deformation problem. On the other hand, the 
observable algebra is not enough to describe a 
quantum system: one also needs to have a notion 
for the states. It turns out that already in the formal 
framework one has a physically reasonable notion 
of states as discussed by Bordemann and Wald- 
mann (1998). 


States and Representations 


The notion of states in deformation quantization 
is adapted from the C*-algebraic world and based 
on the notion of positive functionals. Recall that 
for a *-algebra $\mathcal{A}$ over $\mathbb{C}$ a linear functional
$\omega : \mathcal{A} \to \mathbb{C}$ is called positive if $\omega(a^*a) \geq 0$. For
formal deformation quantization, things are
slightly more subtle as now one has to consider
$\mathbb{C}[[\lambda]]$-linear functionals


\omega : (C^\infty(M)[[\lambda]], \star) \to \mathbb{C}[[\lambda]]    [11]


where $\star$ is assumed to be a Hermitian star product
in the following. Then the positivity is understood in
the sense of formal power series where $a \in \mathbb{R}[[\lambda]]$ is
called positive if $a = \sum_{r=r_0}^{\infty} \lambda^r a_r$ with $a_{r_0} > 0$. Thus,
we can make sense out of the following
requirement:


Definition 3 Let $\star$ be a Hermitian star product on
M. A $\mathbb{C}[[\lambda]]$-linear functional $\omega : C^\infty(M)[[\lambda]] \to \mathbb{C}[[\lambda]]$
is called positive with respect to $\star$ if


\omega(\bar{f} \star f) \geq 0    [12]


and it is called a state if, in addition, $\omega(1) = 1$.


In fact, $\omega(f)$ is interpreted as the expectation value
of the observable f in the state $\omega$. The positivity [12]
ensures that the usual uncertainty relations between
expectation values hold.

Sometimes it is convenient to consider positive
functionals only defined on a (proper) *-ideal in
$C^\infty(M)[[\lambda]]$, for instance, $C_0^\infty(M)[[\lambda]]$.

Since in some situations one wants more general
formal series than just power series, it is convenient
to embed the above definition of states into a
larger and more algebraic context: consider an
ordered ring $\mathsf{R}$, that is, a commutative, associative,
unital ring $\mathsf{R}$ together with a distinguished subset
$\mathsf{P} \subset \mathsf{R}$ (the positive elements) such that $\mathsf{R}$ is the
disjoint union $-\mathsf{P} \cup \{0\} \cup \mathsf{P}$, and we have $\mathsf{P} \cdot \mathsf{P} \subset \mathsf{P}$
and $\mathsf{P} + \mathsf{P} \subset \mathsf{P}$. Then $\mathsf{C} = \mathsf{R}(i)$ denotes the ring
extension by a square root i of $-1$, and we consider
*-algebras $\mathcal{A}$ over $\mathsf{C}$. Clearly, this generalizes the
cases $\mathsf{R} = \mathbb{R}$, where $\mathsf{C} = \mathbb{C}$, as well as $\mathsf{R} = \mathbb{R}[[\lambda]]$,
where $\mathsf{C} = \mathbb{C}[[\lambda]]$. In this way, one provides a
framework where C*-algebras, *-algebras over $\mathbb{C}$,
and formal Hermitian star products can be treated
on the same footing. It is clear that the definition
of a positive functional immediately extends to
$\omega : \mathcal{A} \to \mathsf{C}$ for such a ring $\mathsf{C}$.
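The ordering on formal power series can be made concrete with a few lines of code. The sketch below is an illustration, not part of the original text: a series is represented by a finite coefficient list [a_0, a_1, ...], so all omitted higher coefficients are assumed to vanish, and the function names are chosen here for convenience. A series is positive exactly when its lowest nonvanishing coefficient is positive.

```python
from fractions import Fraction

def sign(series):
    """Sign of a truncated formal power series [a_0, a_1, ...] in R[[lambda]]."""
    for coefficient in series:
        if coefficient != 0:
            return 1 if coefficient > 0 else -1
    return 0

def add(a, b):
    """Coefficientwise sum of two truncated series."""
    length = max(len(a), len(b))
    a = list(a) + [0] * (length - len(a))
    b = list(b) + [0] * (length - len(b))
    return [x + y for x, y in zip(a, b)]

def multiply(a, b):
    """Cauchy product of two nonempty truncated series."""
    product = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            product[i + j] += x * y
    return product

lam = [0, 1]                              # the formal parameter lambda itself
small_real = [Fraction(1, 1000)]          # a positive real constant

print(sign(lam))                          # lambda is a positive element ...
print(sign(add(small_real, [0, -1])))     # ... yet 1/1000 - lambda is still positive
print(sign(multiply(lam, [3, -5])))       # a product of two positive elements
```

The second line illustrates that the ordering is non-Archimedean: λ is positive but smaller than every positive real number.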


Example 4 


(i) For the Wick star product on $\mathbb{R}^{2n} \cong \mathbb{C}^n$, defined
by


f \star_{Wick} g = \sum_{r=0}^{\infty} \frac{(2\lambda)^r}{r!}\,
  \frac{\partial^r f}{\partial z^{i_1}\cdots\partial z^{i_r}}\,
  \frac{\partial^r g}{\partial\bar{z}^{i_1}\cdots\partial\bar{z}^{i_r}}    [13]


the $\delta$-functional $\delta : f \mapsto f(0)$ is positive. Note,
however, that $\delta$ is not positive for $\star_{Weyl}$.

(ii) For the Weyl-Moyal star product $\star_{Weyl}$ the
Schrödinger functional


\omega(f) = \int f(q, p = 0)\, d^n q    [14]


defined on the *-ideal $C_0^\infty(\mathbb{R}^{2n})[[\lambda]]$, is positive.

(iii) For any connected symplectic manifold $(M, \omega)$
and any Hermitian star product $\star$, there exists a
unique normalized trace functional


\mathrm{tr} : C_0^\infty(M)[[\lambda]] \to \mathbb{C}[[\lambda]], \quad \mathrm{tr}(f \star g) = \mathrm{tr}(g \star f)    [15]


with zeroth order equal to the integration over
M with respect to the Liouville measure $\Omega \sim \omega^n$.
Then this trace is positive as well, $\mathrm{tr}(\bar{f} \star f) \geq 0$.
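The failure of positivity of the $\delta$-functional for the Weyl-Moyal product can be seen in the lowest nontrivial case. The following sketch is added for illustration: it takes n = 1, uses eqn [7] with $\hbar$ replaced by the formal parameter $\lambda$, and assumes the complex coordinate $z = q + ip$ (this normalization of z is an assumption made here; it does not affect the sign of the result).

```latex
\begin{aligned}
q \star_{Weyl} p &= qp + \tfrac{i\lambda}{2}, \qquad
p \star_{Weyl} q = qp - \tfrac{i\lambda}{2}, \qquad
q \star_{Weyl} q = q^2, \quad p \star_{Weyl} p = p^2, \\
\bar z \star_{Weyl} z &= (q - ip) \star_{Weyl} (q + ip)
 = q^2 + p^2 + i\left(q \star_{Weyl} p - p \star_{Weyl} q\right) = q^2 + p^2 - \lambda, \\
\delta(\bar z \star_{Weyl} z) &= -\lambda < 0 .
\end{aligned}
```

By contrast, eqn [13] gives $\delta(\bar{f} \star_{Wick} f) = \sum_r \frac{(2\lambda)^r}{r!} \left|\frac{\partial^r f}{\partial\bar{z}^{i_1}\cdots\partial\bar{z}^{i_r}}(0)\right|^2$, a sum of nonnegative terms.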


Having a notion for states as expectation-value 
functionals is still not enough to formulate quantum 
theory. One main feature of quantum states, the 
superposition principle, is not yet implemented. In 
particular, forming convex combinations like 
$\omega = c_1\omega_1 + c_2\omega_2$, with $c_1, c_2 \geq 0$ and $c_1 + c_2 = 1$,
does not give a superposition of $\omega_1$ and $\omega_2$ but
a mixed state. Hence, one needs an additional
linear structure on the states, whence we look for a
*-representation $\pi$ of the observable algebra $\mathcal{A}$ on a
pre-Hilbert space $\mathcal{H}$ over $\mathsf{C}$ such that the states
$\omega_1, \omega_2$ can be written as vector states
$\omega_i(a) = \langle\phi_i, \pi(a)\phi_i\rangle$ for some unit vectors $\phi_1, \phi_2 \in \mathcal{H}$. Then
one can build superpositions of the vectors $\phi_1, \phi_2$ in
the usual way. While this is the well-known
argument in any quantum theory based on the
observable algebras, for deformation quantization
one first has to make sense out of the above notions,
since now $\mathsf{R} = \mathbb{R}[[\lambda]]$ is only an ordered ring. This
can actually be done in a consistent way as 
demonstrated and exemplified by Bordemann, 
Bursztyn, Waldmann, and others. 

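We recall the basic results: A pre-Hilbert space $\mathcal{H}$
over $\mathsf{C}$ is a $\mathsf{C}$-module with a $\mathsf{C}$-sesquilinear inner
product $\langle\cdot,\cdot\rangle : \mathcal{H} \times \mathcal{H} \to \mathsf{C}$ such that $\langle\phi,\psi\rangle = \overline{\langle\psi,\phi\rangle}$
and $\langle\phi,\phi\rangle > 0$ for $\phi \neq 0$. This makes sense since $\mathsf{R}$ is
ordered. An operator $A : \mathcal{H}_1 \to \mathcal{H}_2$ is called adjointable
if there exists an operator $A^* : \mathcal{H}_2 \to \mathcal{H}_1$ such that
$\langle A\phi, \psi\rangle_2 = \langle\phi, A^*\psi\rangle_1$ for all $\phi \in \mathcal{H}_1$, $\psi \in \mathcal{H}_2$. The set
of adjointable operators is denoted by $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2)$, and
$\mathcal{B}(\mathcal{H}) = \mathcal{B}(\mathcal{H}, \mathcal{H})$ turns out to be a *-algebra over $\mathsf{C}$.
This allows one to define a *-representation $\pi$ of $\mathcal{A}$ on
$\mathcal{H}$ to be a *-homomorphism $\pi : \mathcal{A} \to \mathcal{B}(\mathcal{H})$. An
intertwiner T between two *-representations $(\mathcal{H}_1, \pi_1)$
and $(\mathcal{H}_2, \pi_2)$ is an operator $T \in \mathcal{B}(\mathcal{H}_1, \mathcal{H}_2)$ with
$T\pi_1(a) = \pi_2(a)T$ for all $a \in \mathcal{A}$. This defines the
category *-Rep($\mathcal{A}$) of *-representations of $\mathcal{A}$.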

Let us now recall that a positive linear functional
$\omega$ can be written as an expectation value for a vector
state in some representation. This is the well-known
Gel'fand-Naimark-Segal (GNS) construction from
operator algebra theory which can be transferred to
this purely algebraic context (Bordemann and
Waldmann 1998). First recall that any positive
linear functional $\omega : \mathcal{A} \to \mathsf{C}$ satisfies the Cauchy-Schwarz
inequality


\overline{\omega(a^*b)}\, \omega(a^*b) \leq \omega(a^*a)\, \omega(b^*b)    [16]


and $\omega(a^*b) = \overline{\omega(b^*a)}$. If $\mathcal{A}$ is unital, which will always
be assumed for simplicity, then $\omega(a^*) = \overline{\omega(a)}$ follows.
Then


\mathcal{J}_\omega = \{a \in \mathcal{A} \mid \omega(a^*a) = 0\}    [17]


is a left ideal in $\mathcal{A}$, the so-called Gel'fand ideal, and
hence $\mathcal{H}_\omega = \mathcal{A}/\mathcal{J}_\omega$ is a left $\mathcal{A}$-module with module
structure denoted by $\pi_\omega(a)\psi_b = \psi_{ab}$, where $\psi_b \in \mathcal{H}_\omega$
denotes the equivalence class of $b \in \mathcal{A}$. Finally,
$\langle\psi_b, \psi_c\rangle = \omega(b^*c)$ turns $\mathcal{H}_\omega$ into a pre-Hilbert space
and $\pi_\omega$ becomes a *-representation, the GNS representation
with respect to $\omega$. Moreover, $\psi_1 \in \mathcal{H}_\omega$ is a
cyclic vector, $\psi_b = \pi_\omega(b)\psi_1$, with the property


\omega(a) = \langle\psi_1, \pi_\omega(a)\psi_1\rangle    [18]


These properties characterize the GNS representation
$(\mathcal{H}_\omega, \pi_\omega, \psi_1)$ up to unitary equivalence.
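The GNS construction can be followed step by step in a finite-dimensional toy case. The sketch below is an illustration chosen here, not an example from the article: the algebra is the 2x2 complex matrices over the ordinary complex numbers, and the positive functional is $\omega(a) = a_{00}$. Since $\omega(c^*c) = \sum_k |c_{k0}|^2$, the Gel'fand ideal consists of the matrices with vanishing first column, so the class of a matrix b is simply its first column. All function names are hypothetical.

```python
def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def adjoint(a):
    """Conjugate transpose, the *-involution of the matrix algebra."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))] for i in range(len(a[0]))]

def omega(a):
    """Positive functional: expectation value in the first basis vector."""
    return complex(a[0][0])

def gns_vector(b):
    """Class of b modulo the Gel'fand ideal; here simply the first column of b."""
    return [complex(row[0]) for row in b]

def inner(u, v):
    """Inner product <psi_b, psi_c> = omega(b* c) expressed on the classes."""
    return sum(x.conjugate() * y for x, y in zip(u, v))

def rep(a, vec):
    """GNS representation pi_omega(a) acting on a class: matrix times column."""
    return [sum(a[i][k] * vec[k] for k in range(len(vec))) for i in range(len(a))]

identity = [[1, 0], [0, 1]]
a = [[2, 1j], [3, -1]]
b = [[1, 4], [1 - 2j, 0]]
psi_1 = gns_vector(identity)     # the cyclic vector

print(rep(a, gns_vector(b)) == gns_vector(matmul(a, b)))   # module structure pi(a) psi_b = psi_ab
print(omega(a) == inner(psi_1, rep(a, psi_1)))             # eqn [18]
```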


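Example 5 We can now apply this construction to the three basic examples and obtain the following well-known representations as GNS representations:


(i) The GNS representation corresponding to the $\delta$-functional and the Wick star product is (unitarily equivalent to) the formal Bargmann-Fock representation. Here $\mathcal{H}_\delta = \mathbb{C}[[\bar{y}^1, \ldots, \bar{y}^n]][[\lambda]]$ with inner product

$$\langle\phi,\psi\rangle = \sum_{r=0}^{\infty} \frac{(2\lambda)^r}{r!} \sum_{i_1,\ldots,i_r=1}^{n} \overline{\frac{\partial^r \phi}{\partial\bar{y}^{i_1}\cdots\partial\bar{y}^{i_r}}(0)}\; \frac{\partial^r \psi}{\partial\bar{y}^{i_1}\cdots\partial\bar{y}^{i_r}}(0)$$ [19]

and $\pi_\delta$ is explicitly given by (summation over repeated indices understood)

$$\pi_\delta(f) = \sum_{r,s=0}^{\infty} \frac{(2\lambda)^r}{r!\,s!}\, \frac{\partial^{r+s} f}{\partial z^{i_1}\cdots\partial z^{i_r}\,\partial\bar{z}^{j_1}\cdots\partial\bar{z}^{j_s}}(0)\; \bar{y}^{j_1}\cdots\bar{y}^{j_s}\, \frac{\partial^r}{\partial\bar{y}^{i_1}\cdots\partial\bar{y}^{i_r}}$$ [20]

In particular, $\pi_\delta(z^i) = 2\lambda\,\partial/\partial\bar{y}^i$ and $\pi_\delta(\bar{z}^i) = \bar{y}^i$ are the annihilation and creation operators and [20] gives the Wick (or normal) ordering. This basic example has been extended to arbitrary Kähler manifolds by Bordemann and Waldmann (1998).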
(ii) The Weyl-Moyal star product $\star_{\mathrm{Weyl}}$ and the Schrödinger functional $\omega$ as in [14] give the usual Schrödinger representation as GNS representation. We obtain $\mathcal{H}_\omega = C_0^\infty(\mathbb{R}^n)[[\lambda]]$ with inner product

$$\langle\phi,\psi\rangle = \int_{\mathbb{R}^n} \overline{\phi(q)}\,\psi(q)\, d^n q$$ [21]

and $\pi_\omega(f) = \varrho_{\mathrm{Weyl}}(f)$ as in [4] with $\hbar$ replaced by $\lambda$. The Schrödinger representation as a particular case of a GNS representation has been generalized to arbitrary cotangent bundles including representations on sections of line bundles over the configuration space (Dirac's representation for magnetic monopoles) by Bordemann, Neumaier, Pflaum, and Waldmann (1999, 2003). In this context, the WKB expansion can also be formulated.

(iii) For the positive trace tr, the GNS pre-Hilbert space is simply the space $\mathcal{H}_{\mathrm{tr}} = C_0^\infty(M)[[\lambda]]$ with inner product $\langle f, g\rangle = \mathrm{tr}(\bar{f} \star g)$. The corresponding GNS representation is the left regular representation $\pi_{\mathrm{tr}}(f)g = f \star g$. Note that in this case the commutant of the representation is (anti-)isomorphic to the observable algebra and given by all the right multiplications. Thus, $\pi_{\mathrm{tr}}$ is highly reducible and the size of the commutant indicates a "thermodynamical" interpretation of this representation. Indeed, one can take this GNS representation, and more generally those for arbitrary KMS functionals, as a starting point of a preliminary version of a Tomita-Takesaki theory for deformation quantization as shown by Waldmann (1999).

After these fundamental examples, we now reconsider the question of superpositions: in general, two (pure) states $\omega_1, \omega_2$ cannot be realized as vector states inside a single irreducible representation. One encounters superselection rules. Usually, for instance, in algebraic quantum field theory, the existence of superselection rules indicates the presence of charges. In particular, it is not sufficient to consider one single representation of the observable algebra $\mathcal{A}$. Instead, one has to investigate (as well as possible) all superselection sectors of the representation theory $*\text{-Rep}(\mathcal{A})$ of $\mathcal{A}$ and find physically motivated criteria to select distinguished representations. In usual quantum mechanics on $\mathbb{R}^{2n}$, this turns out to be rather simple, thanks to the (nontrivial) uniqueness theorem of von Neumann: one has a unique irreducible representation of the Weyl algebra up to unitary equivalence. In infinite dimensions or in topologically nontrivial situations, however, von Neumann's theorem does not apply and one indeed has superselection rules.


In deformation quantization, some parts of these superselection rules have been understood well: again, for cotangent bundles $T^*Q$, one can classify the unitary equivalence classes of Schrödinger-like representations on $C_0^\infty(Q)[[\lambda]]$ by topological classes of nontrivial vector potentials. Thus, one arrives at the interpretation of the Aharonov-Bohm effect as a superselection rule, where the classification is essentially given by $H^1_{\mathrm{dR}}(Q, \mathbb{C}) / 2\pi i\, H^1_{\mathrm{dR}}(Q, \mathbb{Z})$.


General Representation Theory 


Although it is very much desirable to determine the structure and the superselection sectors in $*\text{-Rep}(\mathcal{A})$ completely, this is only achievable in the very simplest examples. Moreover, for formal star products, many artifacts due to the purely algebraic nature have to be expected: the Bargmann-Fock and Schrödinger representations in Example 5 are unitarily inequivalent and thus define a superselection rule; even the pre-Hilbert spaces are nonisomorphic. However, these artifacts vanish immediately when one imposes suitable convergence conditions together with appropriate topological completions (von Neumann's theorem). Given such problems, it is very difficult to find "hard" superselection rules which indeed have physical significance already at the formal level. Nevertheless, the example of the Aharonov-Bohm effect shows that this is possible. In any case, new techniques for investigating $*\text{-Rep}(\mathcal{A})$ have to be developed. It turns out that comparing $*\text{-Rep}(\mathcal{A})$ with some other $*\text{-Rep}(\mathcal{B})$ is much simpler but still gives some nontrivial insight into the structure of the representation theory. Here the Morita theory provides a highly sophisticated tool.

The classical notion of Morita equivalence as well as Rieffel's more specialized strong Morita equivalence for $C^*$-algebras have been transferred to deformation quantization and, more generally, to *-algebras $\mathcal{A}$ over $\mathsf{C} = \mathsf{R}(i)$ by Bursztyn and Waldmann (2001). The aim is to construct functors

$$\mathsf{F}: *\text{-Rep}(\mathcal{A}) \to *\text{-Rep}(\mathcal{B})$$ [22]

which allow us to compare these categories and determine whether they are equivalent. But even if they are not equivalent, functors such as [22] are interesting. As an example, one considers the situation of classical phase space reduction $M \rightsquigarrow M_{\mathrm{red}}$ as it is present in every constraint system or gauge theory. Suppose one has succeeded with the (highly nontrivial) problem of quantizing both classical phase spaces in a reasonable way, so that one has quantum observable algebras $\mathcal{A}$ and $\mathcal{A}_{\mathrm{red}}$. Then, of course, a relation between $*\text{-Rep}(\mathcal{A})$ and $*\text{-Rep}(\mathcal{A}_{\mathrm{red}})$ is of particular physical interest although one cannot expect both representation theories to be equivalent: $\mathcal{A}$ contains additional but physically irrelevant structure leading to possibly "more" representations.

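To get a clear picture of the Morita theory, one has to extend the notion of *-representations to the following framework: for an auxiliary *-algebra $\mathcal{D}$ over $\mathsf{C}$, one defines a pre-Hilbert right $\mathcal{D}$-module to be a right $\mathcal{D}$-module $\mathcal{H}$ together with a $\mathsf{C}$-sesquilinear $\mathcal{D}$-valued inner product $\langle\cdot,\cdot\rangle: \mathcal{H} \times \mathcal{H} \to \mathcal{D}$ such that $\langle\phi,\psi\rangle = \langle\psi,\phi\rangle^*$ and $\langle\phi,\psi\cdot d\rangle = \langle\phi,\psi\rangle d$ for $d \in \mathcal{D}$ and such that $\langle\cdot,\cdot\rangle$ is completely positive. This means $(\langle\phi_i,\phi_j\rangle) \in M_n(\mathcal{D})^+$ for all $\phi_1, \ldots, \phi_n \in \mathcal{H}$, where, in general, an algebra element $a \in \mathcal{A}$ is called positive, $a \in \mathcal{A}^+$, if $\omega(a) \ge 0$ for all positive linear functionals $\omega: \mathcal{A} \to \mathsf{C}$.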

Then one defines $\mathfrak{B}(\mathcal{H})$ analogously as for pre-Hilbert spaces, leading to a definition of a *-representation $\pi$ of $\mathcal{A}$ on a pre-Hilbert right $\mathcal{D}$-module $\mathcal{H}$. The corresponding category of *-representations is denoted by $*\text{-Rep}_{\mathcal{D}}(\mathcal{A})$. Clearly, elements in $*\text{-Rep}_{\mathcal{D}}(\mathcal{A})$ are in particular $(\mathcal{A},\mathcal{D})$-bimodules.

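The advantage is that now one has a tensor product $\hat\otimes$ taking care of the inner products as well. For *-algebras $\mathcal{A}, \mathcal{B}, \mathcal{C}$, one has a functor

$$\hat\otimes: *\text{-Rep}_{\mathcal{B}}(\mathcal{C}) \times *\text{-Rep}_{\mathcal{A}}(\mathcal{B}) \to *\text{-Rep}_{\mathcal{A}}(\mathcal{C})$$ [23]

which, on objects, is essentially given by $\otimes_{\mathcal{B}}$. In fact, for $\mathcal{F} \in *\text{-Rep}_{\mathcal{B}}(\mathcal{C})$ and $\mathcal{E} \in *\text{-Rep}_{\mathcal{A}}(\mathcal{B})$, one defines on the $(\mathcal{C},\mathcal{A})$-bimodule $\mathcal{F} \otimes_{\mathcal{B}} \mathcal{E}$ an $\mathcal{A}$-valued inner product by $\langle x \otimes \phi, y \otimes \psi\rangle = \langle\phi, \langle x, y\rangle \cdot \psi\rangle$, which turns out to be well defined and completely positive again. Then $\mathcal{F} \hat\otimes \mathcal{E}$ is $\mathcal{F} \otimes_{\mathcal{B}} \mathcal{E}$ equipped with this inner product modulo its possibly nontrivial degeneracy space.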

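By fixing one of the arguments of $\hat\otimes$, one obtains the functor of Rieffel induction of *-representations

$$\mathsf{R}_{\mathcal{E}}: *\text{-Rep}_{\mathcal{D}}(\mathcal{A}) \to *\text{-Rep}_{\mathcal{D}}(\mathcal{B})$$ [24]

where $\mathcal{E} \in *\text{-Rep}_{\mathcal{A}}(\mathcal{B})$ is fixed and $\mathsf{R}_{\mathcal{E}}(\mathcal{H}) = \mathcal{E} \hat\otimes \mathcal{H}$ for $\mathcal{H} \in *\text{-Rep}_{\mathcal{D}}(\mathcal{A})$.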

The idea of strong Morita equivalence is then to search for such bimodules $\mathcal{E}$ where $\mathsf{R}_{\mathcal{E}}$ gives an equivalence of categories. In detail, this is accomplished by the following definition, where, for simplicity, only unital *-algebras are considered.


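Definition 6 A $(\mathcal{B},\mathcal{A})$-bimodule $\mathcal{E}$ is called a strong Morita equivalence bimodule if it is equipped with completely positive inner products $\langle\cdot,\cdot\rangle_{\mathcal{A}}$ and $\langle\cdot,\cdot\rangle_{\mathcal{B}}$ such that both inner products are full, in the sense that

$$\mathsf{C}\text{-span}\{\langle x, y\rangle_{\mathcal{A}} \mid x, y \in \mathcal{E}\} = \mathcal{A}$$ [25]

and analogously for $\langle\cdot,\cdot\rangle_{\mathcal{B}}$, and compatible, in the sense that

$$\langle b \cdot x, y\rangle_{\mathcal{A}} = \langle x, b^* \cdot y\rangle_{\mathcal{A}}, \qquad \langle x \cdot a, y\rangle_{\mathcal{B}} = \langle x, y \cdot a^*\rangle_{\mathcal{B}}$$ [26]

$$\langle x, y\rangle_{\mathcal{B}} \cdot z = x \cdot \langle y, z\rangle_{\mathcal{A}}$$ [27]

In this case, $\mathcal{A}$ and $\mathcal{B}$ are called strongly Morita equivalent.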


It turns out that this is indeed an equivalence 
relation and that strong Morita equivalence implies 
the equivalence of the representation theories: 


Theorem 7 For unital *-algebras over C, strong 
Morita equivalence is an equivalence relation. 


Theorem 8 If $\mathcal{E}$ is a strong Morita equivalence bimodule, then $\mathsf{R}_{\mathcal{E}}$ as in [24] is an equivalence of categories.


Example 9 The fundamental example in Morita theory is that a unital *-algebra $\mathcal{A}$ is strongly Morita equivalent to the matrices $M_n(\mathcal{A})$ via the $(M_n(\mathcal{A}), \mathcal{A})$-bimodule $\mathcal{A}^n$ where the inner product is $\langle x, y\rangle_{\mathcal{A}} = \sum_{i=1}^{n} x_i^* y_i$ and $\langle\cdot,\cdot\rangle_{M_n(\mathcal{A})}$ is uniquely determined by the compatibility condition [27].
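The compatibility conditions [26] and [27] can be checked numerically in the simplest instance of Example 9. The sketch below assumes $\mathcal{A} = \mathbb{C}$ and $n = 3$, so that the bimodule is $\mathbb{C}^3$ and the matrix-valued inner product forced by [27] is the rank-one matrix with entries $x_i \overline{y_j}$. The helper names and sample values are illustrative assumptions only.

```python
# Example 9 for A = C: the bimodule C^n between M_n(C) (acting from the left)
# and C (acting from the right). Names and values are illustrative assumptions.

def inner_A(x, y):
    """A-valued inner product <x, y>_A = sum_i x_i^* y_i."""
    return sum(xi.conjugate() * yi for xi, yi in zip(x, y))

def inner_B(x, y):
    """M_n(A)-valued inner product: the matrix with entries x_i y_j^*."""
    return [[xi * yj.conjugate() for yj in y] for xi in x]

def act_left(b, x):
    """Left action of a matrix b on a vector x."""
    return [sum(bij * xj for bij, xj in zip(row, x)) for row in b]

def act_right(x, a):
    """Right action of a scalar a on a vector x."""
    return [xi * a for xi in x]

def adjoint(b):
    """Conjugate transpose of a square matrix."""
    n = len(b)
    return [[b[j][i].conjugate() for j in range(n)] for i in range(n)]

def close(u, v, tol=1e-12):
    """Entrywise comparison of scalars, vectors, or matrices."""
    if isinstance(u, list):
        return len(u) == len(v) and all(close(s, t, tol) for s, t in zip(u, v))
    return abs(u - v) < tol

x = [1 + 1j, 2, -1j]
y = [3, 1 - 2j, 1j]
z = [2j, 1, 1 + 1j]
b = [[1, 2j, 0], [1 - 1j, 0, 3], [0, 1, 1j]]
a = 2 - 3j

# [27]: <x, y>_B . z = x . <y, z>_A
compat_27 = close(act_left(inner_B(x, y), z), act_right(x, inner_A(y, z)))
# [26], first part: <b.x, y>_A = <x, b^*.y>_A
compat_26_left = close(inner_A(act_left(b, x), y),
                       inner_A(x, act_left(adjoint(b), y)))
# [26], second part: <x.a, y>_B = <x, y.a^*>_B
compat_26_right = close(inner_B(act_right(x, a), y),
                        inner_B(x, act_right(y, a.conjugate())))
```

Fullness is also visible here: the inner product of the first standard basis vector with itself is the unit of $\mathbb{C}$, and the rank-one matrices built from basis vectors span all of $M_3(\mathbb{C})$.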


An efficient way to encode the whole Morita theory of unital *-algebras over $\mathsf{C}$ is to collect all strong Morita equivalence bimodules modulo isometric isomorphisms of bimodules. Then the tensor product $\hat\otimes$ makes this into a "large" groupoid whose units are the *-algebras themselves. This so-called Picard groupoid Pic then encodes everything one can say about strong Morita equivalence. In particular, the orbits of this groupoid are precisely the strong Morita equivalence classes of *-algebras. The isotropy groups are the Picard groups $\mathrm{Pic}(\mathcal{A})$ which generalize the (outer) automorphism groups.


Strong Morita Equivalence of Star Products


This section considers star products from the viewpoint of Morita equivalence. Here one can show that for $\mathcal{A} = (C^\infty(M)[[\lambda]], \star)$, the possible candidates for equivalence bimodules are formal power series of sections $\Gamma^\infty(E)[[\lambda]]$ of vector bundles $E \to M$. This follows since, on the one hand, strong Morita equivalence is compatible with the classical limit $\lambda \to 0$ in the sense that it implies strong Morita equivalence of the classical limits. On the other hand, any (classical or quantum) equivalence bimodule is finitely generated and projective as a right $\mathcal{A}$-module. Thus, by the Serre-Swan theorem one obtains the sections of a vector bundle in the classical limit. Now one can show that every vector bundle can uniquely (up to equivalence) be deformed such that $\Gamma^\infty(E)[[\lambda]]$ becomes a right $\mathcal{A}$-module. Thus, the only thing to be computed is which deformation $\star'$ is induced by this deformation of $E$ for the endomorphisms $\Gamma^\infty(\mathrm{End}(E))[[\lambda]]$, since one can show that then the result will always be a strong Morita equivalence bimodule. The inner products come from deformations of a Hermitian fiber metric on $E$.

Since every vector bundle $E \to M$ can be deformed in this manner in an essentially unique way, we arrive at a general global construction of a noncommutative field theory where the fields are sections of $E$ endowed with a deformed bimodule structure. In the case where $M$ is even a symplectic manifold, a simple extension of Fedosov's construction of a star product $\star$ gives a rather explicit formula for the deformed bimodule structure of $\Gamma^\infty(E)[[\lambda]]$ including a construction of the deformation $(\Gamma^\infty(\mathrm{End}(E))[[\lambda]], \star')$ which acts from the left. As usual in Fedosov's approach, the construction depends functorially on the choice of a connection $\nabla^E$ for $E$.

Returning to the question of strong Morita equivalence of star products, we see that the vector bundle $E$ has to be a line bundle $L$ since only in this case we have $\Gamma^\infty(\mathrm{End}(E)) \cong C^\infty(M)$. Since the deformation of the Hermitian fiber metric is always possible and since two equivalent Hermitian star products are always *-equivalent, one can show that strong Morita equivalence is already implied by ring-theoretic Morita equivalence (the converse holds in general).


Theorem 10 Star products are strongly Morita equivalent if and only if they are Morita equivalent.


An analogous statement holds for C*-algebras, 
known as Beer's theorem (1982). 

In the symplectic case, the characteristic class $c(\star')$ of the induced star product $\star'$ can be computed explicitly, leading to the following classification by Bursztyn and Waldmann (2002):


Theorem 11 Let $\star, \star'$ be star products on a symplectic manifold $M$. Then $\star$ is (strongly) Morita equivalent to $\star'$ if and only if there exists a symplectomorphism $\psi$ such that

$$\psi^* c(\star') - c(\star) \in 2\pi i\, H^2_{\mathrm{dR}}(M, \mathbb{Z})$$ [28]


A similar result in the general Poisson case was 
given by Jurčo, Schupp, and Wess (2002) based on 
Kontsevich's formality theorem. This approach is 
motivated by a careful investigation of noncommu- 
tative (scalar) field theories. 


Finally, it is worth mentioning that [28] has a very simple physical interpretation. Consider again a cotangent bundle $T^*Q$ with a topologically nontrivial configuration space $Q$, for example, $\mathbb{R}^3 \setminus \{0\}$. Then there is a canonical Weyl-type star product $\star_{\mathrm{Weyl}}$ depending on the choice of a connection $\nabla$ and an integration density $\mu > 0$, generalizing [7] to a curved situation. Now let $B$ be a magnetic field, modeled as a closed 2-form on $Q$. Minimal coupling leads to a new star product $\star^B_{\mathrm{Weyl}}$ describing an electrically charged particle moving in $Q$ in the external field $B$. Then the two star products $\star_{\mathrm{Weyl}}$ and $\star^B_{\mathrm{Weyl}}$ are (strongly) Morita equivalent if and only if the magnetic field satisfies Dirac's integrality condition for the (possibly nontrivial) magnetic charges described by $B$. Thus, Dirac's condition is responsible for the very strong statement that the quantizations with and without magnetic field are Morita equivalent. In particular, the *-representation theories of $\star_{\mathrm{Weyl}}$ and $\star^B_{\mathrm{Weyl}}$ are equivalent. Even more specifically, using $B$ to construct a line bundle $L \to Q$ one obtains the result that Dirac's *-representation of $\star^B_{\mathrm{Weyl}}$ on $\Gamma_0^\infty(L)[[\lambda]]$ is precisely the Rieffel induction of the Schrödinger representation of $\star_{\mathrm{Weyl}}$ on $C_0^\infty(Q)[[\lambda]]$.


See also: Aharonov-Bohm Effect; Algebraic Approach to 
Quantum Field Theory; Deformation Quantization; 
Deformation Theory; Deformations of the Poisson 
Bracket on a Symplectic Manifold; Fedosov Quantization. 
Introduction and Historical Remarks


In mathematical deformation theory one studies how 
an object in a certain category of spaces can be varied 
as a function of the points of a parameter space. In 
other words, deformation theory thus deals with the 
structure of families of objects like varieties, singula- 
rities, vector bundles, coherent sheaves, algebras, or 
differentiable maps. Deformation problems appear in 
various areas of mathematics, in particular in algebra, 
algebraic and analytic geometry, and mathematical 
physics. According to Deligne, there is a common 
philosophy behind all deformation problems in 
characteristic zero. It is the goal of this survey to 
explain this point of view. Moreover, we will provide 
several examples with relevance for mathematical 
physics.

Historically, modern deformation theory has its roots in the work of Grothendieck, Artin, Quillen, Schlessinger, Kodaira-Spencer, Kuranishi, Deligne, Grauert, Gerstenhaber, and Arnol'd. The application of deformation methods to quantization theory goes back to Bayen-Flato-Fronsdal-Lichnerowicz-Sternheimer, and has led to the concept of a star product on symplectic and Poisson manifolds. The existence of such star products has been proved by de Wilde-Lecomte and Fedosov for symplectic and by Kontsevich for Poisson manifolds.

Recently, Fukaya and Kontsevich have found a 
far-reaching connection between general deforma- 
tion theory, the theory of moduli, and mirror 
symmetry. Thus, deformation theory comes back to 
its origins, which lie in the desire to construct 
moduli spaces. Briefly, a moduli problem can be 
described as the attempt to collect all isomorphism 
classes of spaces of a certain type into one single 
object, the moduli space, and then to study its 
geometric and analytic properties. The observations 
by Fukaya and Kontsevich have led to new insight 
into the algebraic geometry of mirror varieties and 
their application to string theory. 


Basic Definitions and Examples 


Deformation theory is based on the notion of a 
ringed space, so we briefly recall its definition. 


Definition 1 Let $k$ be a field. By a $k$-ringed space one understands a topological space $X$ together with a sheaf $\mathcal{A}$ of unital $k$-algebras on $X$. The sheaf $\mathcal{A}$ will be called the structure sheaf of the ringed space. In case each of the stalks $\mathcal{A}_x$, $x \in X$, is a local algebra, that is, has a unique maximal ideal $\mathfrak{m}_x$, one calls $(X, \mathcal{A})$ a locally $k$-ringed space. Likewise, one defines a commutative $k$-ringed space as a ringed space such that the stalks of the structure sheaf are all commutative.


Given two $k$-ringed spaces $(X, \mathcal{A})$ and $(Y, \mathcal{B})$, a morphism from $(X, \mathcal{A})$ to $(Y, \mathcal{B})$ is a pair $(f, \varphi)$, where $f: X \to Y$ is a continuous mapping and $\varphi: f^{-1}\mathcal{B} \to \mathcal{A}$ a morphism of sheaves of algebras. This means in particular that for every point $x \in X$ there is a homomorphism of algebras $\varphi_x: \mathcal{B}_{f(x)} \to \mathcal{A}_x$ induced by $\varphi$. Under the assumption that both ringed spaces are local, $(f, \varphi)$ is called a morphism of locally ringed spaces, if each $\varphi_x$ is a homomorphism of local $k$-algebras, that is, maps the maximal ideal of $\mathcal{B}_{f(x)}$ to the one of $\mathcal{A}_x$.

Clearly, k-ringed spaces (resp. locally or commu- 
tative k-ringed spaces) together with their morphisms 
form a category. The following is a list of examples of 
ringed spaces, in particular of those which will be 
needed later. 


Example 2


(i) Denote by $C^\infty$ the sheaf of smooth functions on $\mathbb{R}^n$, by $C^\omega$ the sheaf of real analytic functions, and let $\mathcal{O}$ be the sheaf of holomorphic functions on $\mathbb{C}^n$. Then $(\mathbb{R}^n, C^\infty)$, $(\mathbb{R}^n, C^\omega)$, and $(\mathbb{C}^n, \mathcal{O})$ are ringed spaces over $\mathbb{R}$ resp. $\mathbb{C}$.

(ii) A differentiable manifold of dimension $n$ can be understood as a locally $\mathbb{R}$-ringed space $(M, C^\infty_M)$ which locally is isomorphic to $(\mathbb{R}^n, C^\infty)$. Likewise, a real analytic manifold is a ringed space $(M, C^\omega_M)$ which locally can be modeled by $(\mathbb{R}^n, C^\omega)$, and a complex manifold is an $(M, \mathcal{O}_M)$ which locally looks like $(\mathbb{C}^n, \mathcal{O})$.

(iii) Let $D$ be a domain in $\mathbb{C}^n$, and $\mathcal{J}$ an ideal sheaf in $\mathcal{O}_D$ of finite type, which means that $\mathcal{J}$ is locally finitely generated over $\mathcal{O}_D$. Let $Y$ be the support of the quotient sheaf $\mathcal{O}_D/\mathcal{J}$. The pair $(Y, \mathcal{O}_Y)$, where $\mathcal{O}_Y$ denotes the restriction of $\mathcal{O}_D/\mathcal{J}$ to $Y$, then is a ringed space, called a complex model space. A complex space now is a ringed space $(X, \mathcal{O}_X)$ which locally looks like a complex model space (cf. Grauert and Remmert 1984).

(iv) Let $k$ be an algebraically closed field, and $\mathbb{A}^n$ the affine space over $k$ of dimension $n$. Then $\mathbb{A}^n$, together with the sheaf of regular functions, is a ringed space.

(v) Given a ring $A$, its spectrum $\mathrm{Spec}\,A$ together with the sheaf of regular functions $\mathcal{O}_A$ forms a ringed space (cf. Hartshorne (1977), section II.2). One calls $(\mathrm{Spec}\,A, \mathcal{O}_A)$ an affine scheme. More generally, a scheme is a ringed space $(X, \mathcal{O}_X)$ which locally can be modeled by affine schemes.

(vi) Finally, if $A$ is a local $k$-algebra, the pair $(*, A)$ can be understood as a locally ringed space. With $A$ the algebra of formal power series $k[[t]]$ in one variable $t$, this example plays an important role in the theory of formal deformations of algebras.

Figure 1 A fibered space.


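Definition 3 A morphism $(f, \varphi): (Y, \mathcal{B}) \to (P, \mathcal{S})$ of ringed spaces is called fibered, if the following conditions are fulfilled:


(i) $(P, \mathcal{S})$ is a commutative locally ringed space;
(ii) $f: Y \to P$ is surjective; and
(iii) $\varphi_y: \mathcal{S}_{f(y)} \to \mathcal{B}_y$ maps $\mathcal{S}_{f(y)}$ into the center of $\mathcal{B}_y$ for each $y \in Y$.


The fiber of $(f, \varphi)$ over a point $p \in P$ then is the ringed space $(Y_p, \mathcal{B}_p)$ defined by

$$Y_p = f^{-1}(p), \qquad \mathcal{B}_p = \mathcal{B}|_{f^{-1}(p)} \,/\, \mathfrak{m}_p \mathcal{B}|_{f^{-1}(p)}$$

where $\mathfrak{m}_p$ is the maximal ideal of $\mathcal{S}_p$ which acts on $\mathcal{B}|_{f^{-1}(p)}$ via $\varphi$.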


A fibered morphism of ringed spaces can be 
pictured in Figure 1. 

In addition to this intuitive picture, conditions (i)-(iii) imply that the stalks $\mathcal{B}_y$ are central extensions of $\mathcal{B}_y / \mathfrak{m}_{f(y)} \mathcal{B}_y$ by $\mathcal{S}_{f(y)}$.


Definition 4 Let $(P, \mathcal{S})$ be a commutative locally ringed space over a field $k$ with $P$ connected, let $*$ be a fixed point in $P$, and $(X, \mathcal{A})$ a $k$-ringed space. A deformation of $(X, \mathcal{A})$ over the parameter space $(P, \mathcal{S})$ with distinguished point $*$ then is a fibered morphism $(f, \varphi): (Y, \mathcal{B}) \to (P, \mathcal{S})$ over $k$ together with an isomorphism $(i, \iota): (X, \mathcal{A}) \to (Y_*, \mathcal{B}_*)$ such that for all $p \in P$ and $y \in f^{-1}(p)$ the homomorphism $\varphi_y: \mathcal{S}_p \to \mathcal{B}_y$ is flat.


The condition of flatness in the definition of a deformation serves as a substitute for "local triviality" and works also in the presence of singularities. See Palamodov (1990), section 3, for a discussion of this point.

In the remainder of this section, we provide a list 
of some of the most important deformation pro- 
blems in mathematics, and show how these can be 
formulated within the above language. 


Products of k-Ringed Spaces


Let $(X, \mathcal{A})$ be any $k$-ringed space and $(P, \mathcal{S})$ a $k$-scheme. For any closed point $* \in P$, the product $(X \times P, \mathcal{B}) = (X, \mathcal{A}) \times_k (P, \mathcal{S})$ then is a flat deformation of $(X, \mathcal{A})$ with distinguished point $*$. This can be seen easily from the fact that $\mathcal{B}_{(x,p)} = \mathcal{A}_x \otimes_k \mathcal{S}_p$ for every $x \in X$ and $p \in P$.


Families of Matrices as Deformations 


Let $(P, \mathcal{O}_P)$ be a complex space with distinguished point $*$ and $A_P: P \to \mathrm{Mat}(n \times n, \mathbb{C})$ a holomorphic family of complex $n \times n$ matrices over $P$. By the following construction, $A_P$ can be understood as a deformation, more precisely as a deformation of the matrix $A := A_P(*)$. Let $Y$ be the graph of $A_P$ in the product space $P \times \mathrm{Mat}(n \times n, \mathbb{C})$ and $f: Y \to P$ be the restriction of the projection onto the first coordinate. Define the sheaf $\mathcal{B}$ as the inverse image sheaf $f^{-1}\mathcal{S}$, where $\mathcal{S} := \mathcal{O}_P$, and let $\varphi$ be the sheaf morphism which for every $y \in Y$ is induced by the identity map $\varphi_y: \mathcal{S}_{f(y)} \to \mathcal{B}_y := \mathcal{S}_{f(y)}$. It is then immediately clear that $(f, \varphi)$ is a deformation of the fiber $f^{-1}(*)$ and that this fiber coincides with the matrix $A$.

Now let $A$ be an arbitrary complex $n \times n$ matrix, and choose a $\mathrm{GL}(n, \mathbb{C})$-slice through $A$, that is, a submanifold $P$ containing $A$ which is transversal to the $\mathrm{GL}(n, \mathbb{C})$-orbit through $A$. Here it is assumed that $\mathrm{GL}(n, \mathbb{C})$ acts by the adjoint action on $\mathrm{Mat}(n \times n, \mathbb{C})$. The family $A_P$ given by the canonical embedding $P \to \mathrm{Mat}(n \times n, \mathbb{C})$ now is a deformation of $A$. The germ of this deformation at $*$ is versal in the sense defined in the next section.


Deformation of a Scheme à la Grothendieck 


Assume that $(P, \mathcal{S})$ is a connected scheme over $k$. A deformation of a scheme $(X, \mathcal{A})$ then is a deformation $(f, \varphi): (Y, \mathcal{B}) \to (P, \mathcal{S})$ in the sense defined above, together with the requirement that $f: Y \to P$ is a proper map, that is, $f^{-1}(K)$ is compact for every compact $K \subset P$. As a particular example, consider the $k$-scheme $Y = \mathrm{Spec}\, k[x, y, t]/(xy - t)$. It gives rise to a fibration $Y \to \mathrm{Spec}\, k[t]$, whose fibers $Y_a$ with $a \in k$ are hyperbolas $xy = a$ when $a \neq 0$, and consist of the two axes $x = 0$ and $y = 0$ when $a = 0$. For $k = \mathbb{R}$, this deformation can be illustrated as in Figure 2.

Figure 2 Deformation of the coordinate axes.

For further information on this and similar examples, see Hartshorne (1977), in particular example 3.3.2.


Deformation of a Complex Space


According to Grothendieck, one understands by a deformation of a complex space $(X, \mathcal{A})$ a morphism of complex spaces $(f, \varphi): (Y, \mathcal{B}) \to (P, \mathcal{S})$ which is both a proper flat morphism of complex spaces and a deformation of $(X, \mathcal{A})$ as a ringed space. In case $(X, \mathcal{A})$ and $(P, \mathcal{S})$ are complex manifolds and if $P$ is connected, each of the fibers $Y_p$ is a compact complex manifold. Moreover, the family $(Y_p)_{p \in P}$ then is a family of compact complex manifolds in the sense of Kodaira-Spencer (cf. Palamodov (1990)).


Deformation of Singularities 


Let $x$ be a point of some $\mathbb{C}^n$. Two complex spaces $(X, \mathcal{O}_X) \subset (\mathbb{C}^n, \mathcal{O})$ and $(X', \mathcal{O}_{X'}) \subset (\mathbb{C}^n, \mathcal{O})$ with $x \in X \cap X'$ are then called germ equivalent at $x$ if there exists an open neighborhood $U \subset \mathbb{C}^n$ of $x$ such that $X \cap U = X' \cap U$. Obviously, germ equivalence at $x$ is indeed an equivalence relation. We denote the equivalence class of $X$ by $[X]_x$. Clearly, if $[X]_x = [X']_x$, then one has $\mathcal{O}_{X,x} = \mathcal{O}_{X',x}$ for the stalks at $x$. By a singularity one understands a pair $([X]_x, \mathcal{O}_{X,x})$. In the literature, such a singularity is often denoted by $(X, x)$. The singularity $(X, x)$ is called nonsingular or regular if $\mathcal{O}_{X,x}$ is isomorphic to an algebra of convergent power series $\mathbb{C}\{z_1, \ldots, z_k\}$. A deformation of a complex singularity $(X, x)$ over a complex germ $(P, *)$ is a morphism of ringed spaces $([Y]_x, \mathcal{O}_{Y,x}) \to ([P]_*, \mathcal{O}_{P,*})$ which is induced by a holomorphic map and which is a deformation of $([X]_x, \mathcal{O}_{X,x})$ as a ringed space. See Artin (1976) and the overview article by Greuel (1992) for further details and a variety of examples.


First-Order Deformation of Algebras 


Consider a $k$-algebra $A$ and the truncated polynomial algebra $S = k[\varepsilon]/\varepsilon^2 k[\varepsilon]$. Furthermore, let $\alpha: A \times A \to A$ be a Hochschild 2-cocycle of $A$; in other words, assume that the relation

$$a_1\alpha(a_2, a_3) - \alpha(a_1 a_2, a_3) + \alpha(a_1, a_2 a_3) - \alpha(a_1, a_2)a_3 = 0$$ [1]

holds for all $a_1, a_2, a_3 \in A$. Then one can define a new $k$-algebra $B$, whose underlying linear structure is isomorphic to $A \otimes_k S$ and whose product is given by the following construction: any element $b \in B$ can be written uniquely in the form $b = a_0 + a_1\varepsilon$, with $a_0, a_1 \in A$. Then the product of $b = a_0 + a_1\varepsilon \in B$ and $b' = a_0' + a_1'\varepsilon \in B$ is given by

$$b \cdot b' = a_0 a_0' + [\alpha(a_0, a_0') + a_0 a_1' + a_1 a_0']\,\varepsilon$$ [2]

By condition [1], this product is associative. One thus obtains a flat deformation $S \to B$ of the algebra $A$ and calls it the first-order or infinitesimal deformation of $A$ along the Hochschild cocycle $\alpha$. For further information on this and the connection between deformation theory and Hochschild cohomology, see the overview article by Gerstenhaber and Schack (1986).
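The link between the cocycle condition [1] and associativity of the product [2] can be checked on a small example. The following sketch assumes, purely for illustration, $A = \mathbb{Z}[x]$ with polynomials stored as coefficient lists and the bilinear map $\alpha(f, g) = f'g'$, which satisfies [1] by the Leibniz rule. This choice of cocycle and all helper names are assumptions, not taken from the article.

```python
# First-order deformation of A = Z[x] along alpha(f, g) = f' * g'.
# Polynomials are coefficient lists [c0, c1, ...]; the zero polynomial is [].
# The cocycle and all helper names are illustrative assumptions.

def trim(p):
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def padd(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def pneg(p):
    return [-c for c in p]

def pmul(p, q):
    if not p or not q:
        return []
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return trim(out)

def pder(p):
    return trim([i * c for i, c in enumerate(p)][1:])

def alpha(f, g):
    return pmul(pder(f), pder(g))

def cocycle_defect(a1, a2, a3):
    """Left-hand side of [1]; the empty list means the relation holds."""
    t1 = pmul(a1, alpha(a2, a3))
    t2 = alpha(pmul(a1, a2), a3)
    t3 = alpha(a1, pmul(a2, a3))
    t4 = pmul(alpha(a1, a2), a3)
    return padd(padd(t1, pneg(t2)), padd(t3, pneg(t4)))

def first_order_product(b, c):
    """Product [2] of b = a0 + a1*eps and c = c0 + c1*eps, as pairs."""
    a0, a1 = b
    c0, c1 = c
    eps_part = padd(alpha(a0, c0), padd(pmul(a0, c1), pmul(a1, c0)))
    return (pmul(a0, c0), eps_part)

f = [1, 2, 0, 1]     # 1 + 2x + x^3
g = [0, 1, 3]        # x + 3x^2
h = [2, 0, 1]        # 2 + x^2

b1, b2, b3 = (f, g), (g, h), (h, f)
left = first_order_product(first_order_product(b1, b2), b3)
right = first_order_product(b1, first_order_product(b2, b3))
x_squared = first_order_product(([0, 1], []), ([0, 1], []))
```

Here `x_squared` is the deformed square of $x$, which picks up the correction term $\alpha(x, x) = 1$ in the $\varepsilon$-component.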


Formal Deformation of an Algebra 


Let us generalize the preceding example and explain the concept of a formal deformation of an algebra by Gerstenhaber. Assume again $A$ to be an arbitrary $k$-algebra and choose bilinear maps $\alpha_n: A \times A \to A$ for $n \in \mathbb{N}$ such that $\alpha_0$ is the product on $A$ and $\alpha_1$ is a Hochschild cocycle. Furthermore, let $S$ be the algebra $k[[t]]$ of formal power series in one variable over $k$. Then define on the linear space $B = A[[t]]$ of formal power series in one variable with coefficients in $A$ the following bilinear map:

$$\star: B \times B \to B, \qquad \Big(\sum_{n \in \mathbb{N}} a_n t^n, \sum_{n \in \mathbb{N}} b_n t^n\Big) \mapsto \sum_{n \in \mathbb{N}} \sum_{\substack{k,l,m \in \mathbb{N} \\ k+l+m=n}} \alpha_m(a_k, b_l)\, t^n$$ [3]

If $B$ together with $\star$ becomes a $k$-algebra or, in other words, if $\star$ is associative, one can easily see that it gives a flat deformation of $A$ over $S = k[[t]]$. In that case, one says that $B$ is a formal deformation of $A$ by the family $(\alpha_n)_{n \in \mathbb{N}}$. In contrast to the preceding example, there might not exist for every Hochschild cocycle $\alpha$ on $A$ a formal deformation $B$ of $A$ defined by a family $(\alpha_n)_{n \in \mathbb{N}}$ such that $\alpha_1 = \alpha$. In case it exists, we will say that the deformation $B$ of $A$ is in the direction of $\alpha$. If the third Hochschild cohomology group $H^3(A, A)$ vanishes, there exists for every Hochschild cocycle $\alpha$ on $A$ a deformation $B$ of $A$ in the direction of $\alpha$ (see again Gerstenhaber and Schack (1986) for further details).
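Formula [3] can be made concrete with a family $(\alpha_m)$ for which associativity is known to hold. The sketch below assumes, as an illustration not taken from the article, $A = \mathbb{Z}[q, p]$ and $\alpha_m(f, g) = \frac{1}{m!}\,\partial_p^m f\,\partial_q^m g$ (the standard-ordered product). Elements of $A[[t]]$ with finitely many terms are stored as dictionaries mapping exponent triples $(i, j, n)$ of $q^i p^j t^n$ to coefficients; all names are hypothetical.

```python
from math import comb, factorial

# Formal deformation of A = Z[q, p] by alpha_m(f, g) = (1/m!) d_p^m f * d_q^m g.
# Elements of A[[t]] are dicts {(i, j, n): coeff} standing for q^i p^j t^n.
# The choice of alpha_m and all names are illustrative assumptions.

def alpha(m, f_mono, g_mono):
    """alpha_m on monomials q^a p^b and q^c p^d: (exponents, coefficient)."""
    (a, b), (c, d) = f_mono, g_mono
    if m > b or m > c:
        return None
    # (1/m!) * b!/(b-m)! * c!/(c-m)! = C(b, m) * C(c, m) * m!
    return (a + c - m, b + d - m), comb(b, m) * comb(c, m) * factorial(m)

def star(f, g):
    """Bilinear map [3]: the coefficient of t^n collects all k + l + m = n."""
    out = {}
    for (a, b, k), cf in f.items():
        for (c, d, l), cg in g.items():
            for m in range(min(b, c) + 1):
                (i, j), coeff = alpha(m, (a, b), (c, d))
                key = (i, j, k + l + m)
                val = out.get(key, 0) + cf * cg * coeff
                if val:
                    out[key] = val
                else:
                    out.pop(key, None)
    return out

q = {(1, 0, 0): 1}
p = {(0, 1, 0): 1}
f = {(0, 2, 0): 1, (1, 0, 0): 3}      # p^2 + 3q
g = {(2, 1, 0): 1, (1, 0, 1): -2}     # q^2 p - 2 t q
h = {(1, 1, 0): 1, (0, 1, 0): 5}      # q p + 5 p

p_star_q = star(p, q)                 # q p + t
q_star_p = star(q, p)                 # q p
assoc_left = star(star(f, g), h)
assoc_right = star(f, star(g, h))
```

The difference of `p_star_q` and `q_star_p` is the single term $t$, so the first-order part of the commutator is a constant, as expected for a deformation in the direction of a constant Poisson structure on the plane.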


Formal Deformation Quantization of Symplectic 
and Poisson Manifolds 


Let us consider the last two examples for the case where $A$ is the algebra $C^\infty(M)$ of smooth functions on a symplectic or Poisson manifold $M$. Then the Poisson bracket $\{\cdot,\cdot\}$ gives a Hochschild cocycle on $C^\infty(M)$. There exists a first-order deformation of $C^\infty(M)$ along $(1/2i)\{\cdot,\cdot\}$ and, even though $H^3(A, A)$ might not always vanish, a deformation quantization of $M$, that is, a formal deformation of $C^\infty(M)$ in the direction of the Poisson bracket $(1/2i)\{\cdot,\cdot\}$. For the symplectic case, this fact was first proved by de Wilde-Lecomte using methods from Hochschild cohomology theory. A more geometric and intuitive proof has been given by Fedosov (1996). The Poisson case has been settled in the work of Kontsevich (2003) (see also the section "Deformation quantization of Poisson manifolds").


Quantized Universal Enveloping Algebras According to Drinfeld


A quantized universal enveloping algebra for a complex Lie algebra $\mathfrak{g}$ is a Hopf algebra $A$ over $\mathbb{C}[[t]]$ such that $A$ is a topologically free $\mathbb{C}[[t]]$-module (i.e., $A \cong (A/tA)[[t]]$ as left $\mathbb{C}[[t]]$-module) and $A/tA$ is the universal enveloping algebra $U\mathfrak{g}$ of $\mathfrak{g}$. Because $A$ is a topologically free $\mathbb{C}[[t]]$-module, $A$ is a flat $\mathbb{C}[[t]]$-module and thus a deformation of $U\mathfrak{g}$ over $\mathbb{C}[[t]]$. See Drinfel'd (1986) and the monograph by Kassel (1995) for further details and examples of quantized universal enveloping algebras.


Quantum Plane 


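Consider the tensor algebra $T = \bigoplus_{n \in \mathbb{N}} (\mathbb{R}^2)^{\otimes n}$ of the two-dimensional real vector space $\mathbb{R}^2$, and let $(x, y)$ be the canonical basis of $\mathbb{R}^2$. Then form the tensor product sheaf $\mathcal{T}_{\mathbb{C}^*} = T \otimes_{\mathbb{R}} \mathcal{O}_{\mathbb{C}^*}$ and let $\mathcal{I}_{\mathbb{C}^*}$ be the ideal sheaf in $\mathcal{T}_{\mathbb{C}^*}$ generated by the relation

$$x \otimes y - z\, y \otimes x = 0$$ [4]

where $z: \mathbb{C}^* \to \mathbb{C}$ is the identity function. The quotient sheaf $\mathcal{B} = \mathcal{B}_{\mathbb{C}^*} = \mathcal{T}_{\mathbb{C}^*}/\mathcal{I}_{\mathbb{C}^*}$ then is a sheaf of $\mathbb{C}$-algebras and an $\mathcal{O}_{\mathbb{C}^*}$-module. Using eqn [4], one can move all occurrences of $x$ in an element of $\mathcal{B}_{\mathbb{C}^*}$ to the right of all $y$'s. Since $1/z$ is an element of $\mathcal{O}(\mathbb{C}^*)$, one can thus show that $\mathcal{B}_{\mathbb{C}^*}$ is a free $\mathcal{O}_{\mathbb{C}^*}$-module. Hence, $\mathcal{B}_{\mathbb{C}^*}$ is flat over $\mathcal{O}_{\mathbb{C}^*}$. Further, it is easy to see that for every $q \in \mathbb{C}^*$ the $\mathbb{C}$-algebra $A_q = \mathcal{B}_q / \mathfrak{m}_q \mathcal{B}_q$ is generated by elements $x, y$ subject to the relation

$$x \otimes y - q\, y \otimes x = 0$$ [5]

We call $A_q$ the q-deformed quantum plane and $B = \mathcal{B}(\mathbb{C}^*)$ the universally deformed quantum plane over $\mathbb{C}^*$. Altogether, one can interpret $\mathcal{B}$ as a deformation of $A_q$ over $\mathbb{C}^*$, in particular as a deformation of $A_1 = \mathbb{C}[x, y]$, the algebra of complex polynomials in two generators.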
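The reordering argument above (moving every $x$ to the right of every $y$) gives a concrete multiplication rule on the ordered monomials $y^a x^b$. The following is a minimal sketch under that assumption: elements of $A_q$ are stored as dictionaries mapping exponent pairs $(a, b)$ of $y^a x^b$ to coefficients, and relation [5] yields $(y^a x^b)(y^c x^d) = q^{bc}\, y^{a+c} x^{b+d}$. The representation and the function name are illustrative choices, not notation from the article.

```python
from fractions import Fraction

# q-deformed quantum plane in the ordered basis y^a x^b.
# Elements are dicts {(a, b): coeff}; names are illustrative assumptions.

def qmul(f, g, q):
    """Product in A_q: moving x^b past y^c costs a factor q^(b*c), by [5]."""
    out = {}
    for (a, b), cf in f.items():
        for (c, d), cg in g.items():
            key = (a + c, b + d)
            val = out.get(key, 0) + cf * cg * q ** (b * c)
            if val:
                out[key] = val
            else:
                out.pop(key, None)
    return out

q = Fraction(2)
x = {(0, 1): 1}
y = {(1, 0): 1}

xy = qmul(x, y, q)                    # equals q * y x
yx = qmul(y, x, q)

f = {(1, 1): 1, (0, 2): 3}            # y x + 3 x^2
g = {(2, 0): 1, (0, 1): -1}           # y^2 - x
h = {(1, 0): 5, (1, 2): 1}            # 5 y + y x^2

assoc_left = qmul(qmul(f, g, q), h, q)
assoc_right = qmul(f, qmul(g, h, q), q)

# At q = 1 the product is commutative, matching the fiber C[x, y].
commutes_at_one = qmul(f, g, Fraction(1)) == qmul(g, f, Fraction(1))
```

Exact rational arithmetic is used so that the comparisons are not affected by rounding. Any nonzero value of `q` could be substituted.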

In the same way, one can deform function algebras on higher-dimensional vector spaces as well as function algebras on certain Lie groups. In this manner, one obtains the quantum group $SU_q(2)$ as a deformation of a Hopf algebra of functions on $SU(2)$. See, for example, the work of Faddeev-Reshetikhin-Takhtajan (1990), Manin (1988) and Wess-Zumino (1990) for more information on q-deformations of vector spaces, Lie groups, differential calculi, etc.


Versal Deformations 


In this section, and the ones that follow, we consider 
only germs of deformations, that is, deformations 
over parameter spaces of the form (*, S). This means 
in particular that the structure sheaf only consists 
of its stalk S at x, a commutative local k-algebra. Let 
us now suppose that the sheaf morphism 
y:(Y,B) 一 (*,S) (over the canonical map Y — *) 
is a deformation of the ringed space (X, A) and that 
T:T — S is a homomorphism of commutative local 
k-algebras. Then the sheaf morphism 7*:B @s T — 
T with (r*y),(t)=1@t for ye Y and teT is 
a deformation of (X,.A) over the parameter space 
(x, T). One says that the deformation 7*4 is induced 
by the homomorphism 7. 


Definition 5 A deformation $\varphi: (Y, \mathcal{B}) \to S$ of
$(X, \mathcal{A})$ is called versal if every (germ of a) deformation
of $(X, \mathcal{A})$ is isomorphic to a deformation germ
induced by a homomorphism of $k$-algebras $\tau: T \to S$.
A versal deformation is called universal if the
inducing homomorphism $\tau: T \to S$ is unique, and
miniversal if $S$ is of minimal dimension.


Example 6 


(i) In the section “Families of matrices as deformations,”
the construction of a versal deformation
of a complex matrix A has been sketched.

(ii) According to Kuranishi, every compact complex
manifold has a versal deformation by an
analytic germ. See Kuranishi (1971) for a
detailed exposition and the section “The
Kodaira-Spencer algebra controlling deformations
of compact complex manifolds” for a
description of the principal ideas.

(iii) Grauert has shown that for isolated singularities
there exists a versal analytic deformation.

(iv) By the work of Douady-Verdier, Grauert, and
Palamodov one knows that for every compact
complex space there exists a miniversal analytic
deformation. One of the essential methods in
the existence proof is Palamodov’s
construction of the cotangent complex (see
Palamodov (1990)).

(v) Bingener (1987) has further established
Palamodov’s approach and thus has provided a
unified and quite general method for constructing
versal deformations in analytic geometry.

(vi) Fialowski-Fuchs have constructed miniversal
deformations of Lie algebras.


Schlessinger's Theorem 


According to Grothendieck, spaces in algebraic 
geometry are represented by functors from a category 
of commutative rings to the category of sets. In this 
picture, an affine algebraic variety X over the base 
field k and with coordinate ring A is equivalently 
described by the functor $\mathrm{Hom}_{\mathrm{alg}}(A, -)$ defined on the
category of commutative k-algebras. As will be 
shown by examples in the next section, versal 
deformations are often encoded by functors repre- 
senting spaces. More precisely, a deformation pro- 
blem leads to a so-called functor of Artin rings, which 
means a covariant functor F from the category of 
(local) Artinian k-algebras to the category of sets such 
that the set F(k) has exactly one element. The 
question now arises as to under which conditions 
the functor F is representable, that is, there exists 
a commutative $k$-algebra $A$ such that $F \cong
\mathrm{Hom}_{\mathrm{alg}}(A, -)$. In the work of Schlessinger (1968),
the structure of functors of Artin rings has been studied
in detail. Moreover, criteria have been established for
when such a functor is pro-representable, which means
that it can be represented by a complete local
algebra $A$, where “completeness” is understood
with respect to the $\mathfrak{m}$-adic topology. Because of its
importance for deformation theory, we will state 
Schlessinger’s theorem in this section. Before we 
come to its details, let us recall some notation. 


Definition 7 By an Artinian $k$-algebra over a field $k$
one understands a commutative $k$-algebra $R$ which
satisfies the following descending chain condition:


(Dec) Every descending chain $I_1 \supset \cdots \supset I_k \supset I_{k+1} \supset \cdots$
of ideals in $R$ becomes stationary.


Among others, an Artinian algebra R has the 
following properties: 


1. R is Noetherian, that is, it satisfies the ascending 
chain condition. 

2. Every prime ideal in R is maximal. 

3. (Chinese remainder theorem) $R$ is isomorphic to
a finite product $\prod_{i=1}^{n} R_i$, where each $R_i$ is a local
Artinian algebra.

4. Every maximal ideal $\mathfrak{m}$ of $R$ is nilpotent, that is,
$\mathfrak{m}^k = 0$ for some $k \in \mathbb{N}$.

5. Every quotient $R/\mathfrak{m}^k$ with $\mathfrak{m}$ maximal is finite
dimensional.


Definition 8 Assume that $f: B \to A$ is a surjective
homomorphism in the category $k\text{-}\mathrm{Alg}_{l,\mathrm{Art}}$ of local
Artinian $k$-algebras. Then $f$ is called a small extension
if $\ker f$ is a nonzero principal ideal $(b)$ in $B$ such that
$\mathfrak{m} b = (0)$, where $\mathfrak{m}$ is the maximal ideal of $B$.


Theorem 9 (Schlessinger (1968, theorem 2.11)).
Let $F$ be a functor of Artin rings (over the base field
$k$). Assume that $A' \to A$ and $A'' \to A$ are morphisms
in $k\text{-}\mathrm{Alg}_{l,\mathrm{Art}}$, and consider the map


$F(A' \times_A A'') \to F(A') \times_{F(A)} F(A'')$    [6]

Then $F$ is pro-representable if and only if $F$ has the
following properties:


(H1) The map [6] is a surjection whenever $A'' \to A$
is a small extension.
(H2) The map [6] is a bijection when $A = k$ and
$A'' = k[\varepsilon]$.
(H3) One has $\dim_k(t_F) < \infty$ for the tangent space
$t_F := F(k[\varepsilon])$.
(H4) For every small extension $A' \to A$, the map
$F(A' \times_A A') \to F(A') \times_{F(A)} F(A')$
is an isomorphism.


Suppose that the functor $F$ satisfies conditions
(H1)-(H4), and let $A$ be an arbitrary complete local
$k$-algebra. By Yoneda's lemma, every element


$\xi = \varprojlim_{n \in \mathbb{N}} \xi_n \in \hat{F}(A) := \varprojlim_{n \in \mathbb{N}} F(A/\mathfrak{m}^n)$


induces a natural transformation

$\mathrm{Hom}_{\mathrm{alg}}(A, -) \to F, \quad (u: A \to R) \mapsto F(u_n)(\xi_n)$    [7]


where $n \in \mathbb{N}$ is chosen large enough such that the
homomorphism $u: A \to R$ factors through some
$u_n: A/\mathfrak{m}^n \to R$. This is indeed possible, since $R$ is
Artinian. In the course of the proof of Schlessinger's
theorem, $A$ and the element $\xi \in \hat{F}(A)$ are now constructed
in such a way that [7] is an isomorphism.


Differential Graded Lie Algebras 
and Deformation Problems 


According to a philosophy going back to Deligne 
"every deformation problem in characteristic zero is 
controlled by a differential graded Lie algebra, with 
quasi-isomorphic differential graded Lie algebras 
giving the same deformation theory" (cf. Goldman 
and Millson (1988), p. 48). In the following, we will 
explain the main idea of this concept and apply it to 
two particular examples. 

Differential Graded Lie Algebras


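Definition 10 By a graded algebra over a field $k$
one understands a graded $k$-vector space $A^\bullet =
\bigoplus_{k \in \mathbb{Z}} A^k$ together with a bilinear map

$\mu: A^\bullet \times A^\bullet \to A^\bullet, \quad (a, b) \mapsto a \cdot b = \mu(a, b)$

such that $A^k \cdot A^l \subset A^{k+l}$ for all $k, l \in \mathbb{Z}$. The graded
algebra $A^\bullet$ is called associative if $(ab)c = a(bc)$ for all
$a, b, c \in A^\bullet$.

A graded subalgebra of $A^\bullet$ is a graded subspace
$B^\bullet = \bigoplus_{k \in \mathbb{Z}} B^k \subset A^\bullet$ which is closed under $\mu$; a
graded ideal is a graded subalgebra $I^\bullet \subset A^\bullet$ such
that $I^\bullet \cdot A^\bullet \subset I^\bullet$ and $A^\bullet \cdot I^\bullet \subset I^\bullet$.

A homomorphism between graded algebras $A^\bullet$
and $B^\bullet$ is a homogeneous map $f: A^\bullet \to B^\bullet$ of degree
0 such that $f(a \cdot b) = f(a) \cdot f(b)$ for all $a, b \in A^\bullet$.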


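From now on, assume that $k$ has characteristic
$\neq 2, 3$. A graded Lie algebra then is a graded
$k$-vector space $\mathfrak{g}^\bullet = \bigoplus_{k \in \mathbb{Z}} \mathfrak{g}^k$ together with a bilinear
map


$[\cdot,\cdot]: \mathfrak{g}^\bullet \times \mathfrak{g}^\bullet \to \mathfrak{g}^\bullet, \quad (a, b) \mapsto [a, b]$


such that the following axioms hold true:


1. $[\mathfrak{g}^k, \mathfrak{g}^l] \subset \mathfrak{g}^{k+l}$ for all $k, l \in \mathbb{Z}$.

2. $[\xi, \zeta] = -(-1)^{kl} [\zeta, \xi]$ for all $\xi \in \mathfrak{g}^k$, $\zeta \in \mathfrak{g}^l$.

3. $(-1)^{k_1 k_3} [[\xi_1, \xi_2], \xi_3] + (-1)^{k_2 k_1} [[\xi_2, \xi_3], \xi_1] +
(-1)^{k_3 k_2} [[\xi_3, \xi_1], \xi_2] = 0$ for all $\xi_j \in \mathfrak{g}^{k_j}$ with
$j = 1, 2, 3$.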


By axiom (1), it is clear that a graded Lie algebra is 
in particular a graded algebra. So the above-defined 
notions of a graded ideal, homomorphism, etc., apply 
as well to graded Lie algebras. 


Example 11 Let $A^\bullet = \bigoplus_{k \in \mathbb{Z}} A^k$ be a graded associative
algebra. Then $A^\bullet$ becomes a graded Lie
algebra with the bracket


$[a, b] = ab - (-1)^{kl} ba$ for $a \in A^k$ and $b \in A^l$


The space $A^\bullet$ regarded as a graded Lie algebra is
often denoted by $\mathfrak{lie}^\bullet(A^\bullet)$.
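
As an illustration of Example 11, the graded Lie axioms can be checked by brute force on a small graded associative algebra. The sketch below is not part of the article; it assumes the algebra of 2x2 matrices with the matrix unit $E_{ij}$ placed in degree $j - i$ (so degrees add under multiplication), and all names (`unit`, `bracket`, `jacobi_holds`, ...) are hypothetical helpers.

```python
from itertools import product

ZERO = ((0, 0), (0, 0))

def unit(i, j):
    """2x2 matrix unit E_ij (indices 0 and 1)."""
    return tuple(tuple(int((r, c) == (i, j)) for c in range(2)) for r in range(2))

def matmul(a, b):
    return tuple(tuple(sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2))
                 for r in range(2))

def combine(a, b, s):
    """Return a + s*b entrywise."""
    return tuple(tuple(x + s * y for x, y in zip(ra, rb)) for ra, rb in zip(a, b))

def sign(n):
    """(-1)**n for any integer n, including negative ones."""
    return -1 if n % 2 else 1

# Homogeneous basis: E_ij gets degree j - i, so degrees add under products.
BASIS = [(j - i, unit(i, j)) for i in range(2) for j in range(2)]

def bracket(x, y):
    """Graded commutator [a, b] = ab - (-1)^{kl} ba of homogeneous elements."""
    (k, a), (l, b) = x, y
    return (k + l, combine(matmul(a, b), matmul(b, a), -sign(k * l)))

def antisymmetry_holds(x, y):
    """Axiom 2: [x, y] = -(-1)^{kl} [y, x]."""
    k, l = x[0], y[0]
    return bracket(x, y)[1] == combine(ZERO, bracket(y, x)[1], -sign(k * l))

def jacobi_holds(x1, x2, x3):
    """Axiom 3 in the cyclic form stated in the text."""
    k1, k2, k3 = x1[0], x2[0], x3[0]
    total = ZERO
    total = combine(total, bracket(bracket(x1, x2), x3)[1], sign(k1 * k3))
    total = combine(total, bracket(bracket(x2, x3), x1)[1], sign(k2 * k1))
    total = combine(total, bracket(bracket(x3, x1), x2)[1], sign(k3 * k2))
    return total == ZERO
```

Since all maps involved are multilinear, checking the axioms on the homogeneous basis is sufficient. Note that the bracket of the two odd elements $E_{12}$ and $E_{21}$ is an anticommutator and yields the identity matrix.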


Definition 12 A linear map $D: A^\bullet \to A^\bullet$ defined
on a graded algebra $A^\bullet$ is called a derivation of
degree $l$ if


$D(ab) = (Da)b + (-1)^{kl} a(Db)$
for all $a \in A^k$ and $b \in A^\bullet$


A graded (Lie) algebra $A^\bullet$ together with a
derivation $d$ of degree 1 is called a differential
graded (Lie) algebra if $d \circ d = 0$. Then $(A^\bullet, d)$
becomes a cochain complex. Since $\ker d$ is a graded
subalgebra of $A^\bullet$ and $\operatorname{im} d$ a graded ideal in $\ker d$,
the cohomology space


$H^\bullet(A^\bullet, d) = \ker d / \operatorname{im} d$

inherits the structure of a graded (Lie) algebra from $A^\bullet$.


Let $f: A^\bullet \to B^\bullet$ be a homomorphism of differential
graded (Lie) algebras $(A^\bullet, d)$ and $(B^\bullet, \partial)$. Assume
further that $f$ is a cochain map, that is, that $f \circ d =
\partial \circ f$. Then one says that $f$ is a quasi-isomorphism or
that the differential graded (Lie) algebras $A^\bullet$ and $B^\bullet$
are quasi-isomorphic if the induced homomorphism
on the cohomology level $\bar{f}: H^\bullet(A^\bullet, d) \to H^\bullet(B^\bullet, \partial)$ is
an isomorphism. Finally, a differential graded (Lie)
algebra $(A^\bullet, d)$ is called formal if it is quasi-isomorphic
to its cohomology $(H^\bullet(A^\bullet, d), 0)$.


Maurer-Cartan Equation 


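Assume that $(\mathfrak{g}^\bullet, [\cdot,\cdot], d)$ is a differential graded Lie
algebra over $\mathbb{C}$. Define the space $\mathrm{MC}(\mathfrak{g}^\bullet)$ of
solutions of the Maurer-Cartan equation by


$\mathrm{MC}(\mathfrak{g}^\bullet) := \{\omega \in \mathfrak{g}^1 \mid d\omega - \tfrac{1}{2}[\omega, \omega] = 0\}$    [8]


In case the differential graded Lie algebra $\mathfrak{g}^\bullet$ is
nilpotent, this space naturally possesses a groupoid
structure or, in other words, a set of arrows which are
all invertible. The reason for this is that, under the
assumption of nilpotency, the space $\mathfrak{g}^0$ is equipped
with the Campbell-Hausdorff multiplication


$\mathfrak{g}^0 \times \mathfrak{g}^0 \to \mathfrak{g}^0, \quad (X, Y) \mapsto \log(\exp X \cdot \exp Y)$


and the group $\mathfrak{g}^0$ acts on $\mathfrak{g}^1$ by the exponential
function. More precisely, in this situation one can
define for two objects $\alpha, \beta \in \mathrm{MC}(\mathfrak{g}^\bullet)$ the space of
arrows $\alpha \to \beta$ as the set of all $\lambda \in \mathfrak{g}^0$ such that
$\exp \lambda \cdot \alpha = \beta$.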
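
The Campbell-Hausdorff multiplication is easiest to see in a nilpotent Lie algebra of step 2, where the series terminates after the first commutator. The sketch below is an illustration under stated assumptions and not part of the article: it uses the Heisenberg algebra of strictly upper-triangular 3x3 matrices as a stand-in for a nilpotent $\mathfrak{g}^0$, and the helper names (`heisenberg`, `exp_nilpotent`, `cbh`) are hypothetical.

```python
from fractions import Fraction

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b, s=1):
    """Return a + s*b entrywise."""
    return [[x + s * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def exp_nilpotent(x):
    """exp(X) = I + X + X^2/2! + ...; the series stops because X is nilpotent."""
    n = len(x)
    result, term = identity(n), identity(n)
    for k in range(1, n):      # X**n = 0 for strictly upper-triangular n x n
        term = scale(Fraction(1, k), matmul(term, x))
        result = add(result, term)
    return result

def heisenberg(a, b, c):
    """Element a*E12 + b*E23 + c*E13 of the Heisenberg Lie algebra."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    z = Fraction(0)
    return [[z, a, c], [z, z, b], [z, z, z]]

def bracket(x, y):
    return add(matmul(x, y), matmul(y, x), s=-1)

def cbh(x, y):
    """log(exp X exp Y) = X + Y + [X, Y]/2; higher brackets vanish in step 2."""
    return add(add(x, y), scale(Fraction(1, 2), bracket(x, y)))
```

With `X = heisenberg(1, 2, 3)` and `Y = heisenberg(-4, 5, 7)`, the matrices `matmul(exp_nilpotent(X), exp_nilpotent(Y))` and `exp_nilpotent(cbh(X, Y))` coincide, which is exactly the statement that `cbh` is the group law transported to the Lie algebra by the logarithm.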

We now have the means to define for every
complex differential graded Lie algebra $\mathfrak{g}^\bullet$ its
deformation functor $\mathrm{Def}_{\mathfrak{g}}$. This functor maps the
category of local Artinian $\mathbb{C}$-algebras to the category
of groupoids and is defined on objects as follows:


$\mathrm{Def}_{\mathfrak{g}}(R) := \mathrm{MC}(\mathfrak{g}^\bullet \otimes \mathfrak{m})$    [9]


Hereby, $R$ is a complex local Artinian algebra, and
$\mathfrak{m}$ its maximal ideal. Note that since $R$ is Artinian,
$\mathfrak{g}^\bullet \otimes \mathfrak{m}$ is a nilpotent differential graded Lie algebra,
hence $\mathrm{Def}_{\mathfrak{g}}(R)$ carries a groupoid structure as
constructed above. Clearly, $\mathrm{Def}_{\mathfrak{g}}$ is also a functor
of Artin rings as defined in the previous section.
With appropriate choices of the differential
graded Lie algebra $\mathfrak{g}^\bullet$, essentially all deformation
problems from the section “Basic definitions and
examples” can be recovered via a functor of the
form $\mathrm{Def}_{\mathfrak{g}}$. Below, we will show in some detail how
this works for two examples, namely the deformation
theory of complex manifolds and the deformation
quantization of Poisson manifolds. But before
we come to this, let us state a result which shows
how the deformation functor behaves under quasi-isomorphisms
of the underlying differential graded
Lie algebra. This result is crucial in the sense that it
allows one to describe a deformation
problem with controlling algebra $\mathfrak{g}^\bullet$ equivalently by any other differential
graded Lie algebra within the quasi-isomorphism
class of $\mathfrak{g}^\bullet$. So, in particular in the case where the
differential graded Lie algebra is formal, one often
obtains a direct solution of the deformation
problem.


Theorem 13 (Deligne, Goldman-Millson). Assume
that $f: \mathfrak{g}^\bullet \to \mathfrak{h}^\bullet$ is a quasi-isomorphism of
differential graded Lie algebras. For every local
Artinian $\mathbb{C}$-algebra $R$ the induced functor $f_*:
\mathrm{Def}_{\mathfrak{g}}(R) \to \mathrm{Def}_{\mathfrak{h}}(R)$ then is an equivalence of
groupoids.


The Kodaira-Spencer Algebra Controlling 
Deformations of Compact Complex Manifolds 


Let $M$ be a compact complex $n$-dimensional manifold.
Recall that then the complexified tangent
bundle $T_{\mathbb{C}}M$ has a decomposition into a holomorphic
tangent bundle $T^{1,0}M$ and an antiholomorphic
tangent bundle $T^{0,1}M$. This leads to a decomposition
of the space of complex differential forms into the spaces
$\Omega^{p,q}M$ of forms on $M$ of type $(p,q)$. More generally,
a smooth subbundle $J^{0,1} \subset T_{\mathbb{C}}M$ which induces a
decomposition of the form $T_{\mathbb{C}}M = J^{1,0} \oplus J^{0,1}$, where
$J^{1,0} := \overline{J^{0,1}}$, is called an almost complex structure on
$M$. Clearly, the decomposition of $T_{\mathbb{C}}M$ into the
holomorphic and antiholomorphic part is an almost
complex structure, and an almost complex structure
which is induced by a complex structure is called
integrable. Assume that an almost complex structure
$J^{0,1}$ is given on $M$ and that it has finite distance to
the complex structure on $M$. The latter means that
the restriction $\varrho_J$ of the projection $\varrho: T_{\mathbb{C}}M \to T^{0,1}M$
along $T^{1,0}M$ to the subbundle $J^{0,1}$ is an isomorphism.
Denote by $\delta$ the inverse of $\varrho_J$ and let $\omega \in
\Omega^{0,1}(M, T^{1,0}M)$ be the composition $-\bar{\varrho} \circ \delta$, where
$\bar{\varrho}$ denotes the complementary projection onto $T^{1,0}M$. One
checks immediately that every almost complex
structure with finite distance to the complex
structure on $M$ is uniquely characterized by a
section $\omega \in \Omega^{0,1}(M, T^{1,0}M)$ and that every element
of $\Omega^{0,1}(M, T^{1,0}M)$ comes from an almost complex
structure on $M$.

As a consequence of the Newlander-Nirenberg
theorem, one can now show that the almost
complex structure $J^{0,1}$ resp. $\omega$ is integrable if and
only if the equation


$\bar\partial\omega - \tfrac{1}{2}[\omega, \omega] = 0$    [10]


is fulfilled. But this is nothing else than the Maurer-Cartan
equation in the Kodaira-Spencer differential
graded Lie algebra


$(\mathfrak{L}^\bullet, \bar\partial, [\cdot,\cdot]) = \Big(\bigoplus_{p \in \mathbb{N}} \Omega^{0,p}(M, T^{1,0}M), \bar\partial, [\cdot,\cdot]\Big)$

Hereby, $\Omega^{0,p}(M, T^{1,0}M)$ denotes the $T^{1,0}M$-valued
differential forms on $M$ of type $(0,p)$, $\bar\partial:
\Omega^{0,p}(M, T^{1,0}M) \to \Omega^{0,p+1}(M, T^{1,0}M)$ the Dolbeault
operator, and $[\cdot,\cdot]$ is induced by the Lie bracket
of holomorphic vector fields. As a consequence of
these considerations, deformations of the complex
manifold $M$ can equivalently be described by families
$(\omega_p)_{p \in P} \subset \mathfrak{L}^1$ which satisfy eqn [10] and $\omega_* = 0$. Thus,
it remains to determine the associated deformation
functor $\mathrm{Def}_{\mathfrak{L}}$.

According to Schlessinger's theorem, the functor
$\mathrm{Def}_{\mathfrak{L}}$ is pro-representable. Hence, there exists a
local $\mathbb{C}$-algebra $\hat{R}_{\mathfrak{L}}$ complete with respect to the
$\mathfrak{m}$-adic topology such that


$\mathrm{Def}_{\mathfrak{L}}(R) \cong \mathrm{Hom}_{\mathrm{alg}}(\hat{R}_{\mathfrak{L}}, R)$    [11]


for every local Artinian $\mathbb{C}$-algebra $R$. Moreover, by
Artin’s theorem, there exists a “convergent” solution
of the Maurer-Cartan equation, that is, $\hat{R}_{\mathfrak{L}}$ can be
replaced in eqn [11] by a ring $R_{\mathfrak{L}}$ representing an
analytic germ.


Theorem 14 (Kodaira-Spencer, Kuranishi). The
ringed space $(R_{\mathfrak{L}}, (0))$ is a miniversal deformation
of the complex structure on $M$.


Deformation Quantization of Poisson Manifolds 


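Let $A$ be an associative $k$-algebra with $\operatorname{char} k = 0$.
Put for every integer $k \geq -1$


$\mathfrak{g}^k := \mathrm{Hom}_k(A^{\otimes (k+1)}, A)$


Then $\mathfrak{g}^\bullet$ becomes a graded vector space. Let us
impose a differential and a bracket on $\mathfrak{g}^\bullet$. The
differential is the usual Hochschild coboundary
$b: \mathfrak{g}^k \to \mathfrak{g}^{k+1}$,

$bf(a_0 \otimes \cdots \otimes a_{k+1}) := a_0 f(a_1 \otimes \cdots \otimes a_{k+1})$
$\quad + \sum_{i=0}^{k} (-1)^{i+1} f(a_0 \otimes \cdots \otimes a_i a_{i+1} \otimes \cdots \otimes a_{k+1})$
$\quad + (-1)^{k} f(a_0 \otimes \cdots \otimes a_k)\, a_{k+1}$
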
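The bracket is the Gerstenhaber bracket


$[\cdot,\cdot]: \mathfrak{g}^{k_1} \times \mathfrak{g}^{k_2} \to \mathfrak{g}^{k_1 + k_2}$,

$[f_1, f_2] := f_1 \circ f_2 - (-1)^{k_1 k_2} f_2 \circ f_1$


where


$f_1 \circ f_2(a_0 \otimes \cdots \otimes a_{k_1 + k_2})$
$\quad := \sum_{i=0}^{k_1} (-1)^{i k_2} f_1(a_0 \otimes \cdots \otimes a_{i-1} \otimes f_2(a_i \otimes \cdots \otimes a_{i+k_2}) \otimes a_{i+k_2+1} \otimes \cdots \otimes a_{k_1+k_2})$


The triple $(\mathfrak{g}^\bullet, b, [\cdot,\cdot])$ then is a differential graded
Lie algebra.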

Consider the Maurer-Cartan equation $b\gamma -
(1/2)[\gamma, \gamma] = 0$ in $\mathfrak{g}^1$. Obviously, it is equivalent to
the equality


$a_0\gamma(a_1, a_2) - \gamma(a_0 a_1, a_2) + \gamma(a_0, a_1 a_2) - \gamma(a_0, a_1) a_2$
$\quad = \gamma(\gamma(a_0, a_1), a_2) - \gamma(a_0, \gamma(a_1, a_2))$
for $a_0, a_1, a_2 \in A$    [12]


If one defines now for some $\gamma \in \mathfrak{g}^1$ the bilinear map
$m: A \times A \to A$ by $m(a, b) = ab + \gamma(a, b)$, then [12]
implies that $m$ is associative if and only if $\gamma$ satisfies
the Maurer-Cartan equation.
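
The equivalence between equation [12] and associativity of the deformed product can be tested numerically on a small example. The sketch below is only an illustration: it assumes the two-dimensional algebra of dual numbers with basis $(1, e)$, $e^2 = 0$, and two hand-picked bilinear maps `gamma_good` and `gamma_bad`; every function name in it is a hypothetical helper rather than notation from the article.

```python
from fractions import Fraction
from itertools import product

DIM = 2
BASIS = [(Fraction(1), Fraction(0)), (Fraction(0), Fraction(1))]  # basis (1, e)

def bilinear(table):
    """Turn a table {(i, j): vector} of values on basis pairs into a bilinear map."""
    def apply(a, b):
        out = [Fraction(0)] * DIM
        for i, j in product(range(DIM), repeat=2):
            coeff = a[i] * b[j]
            if coeff:
                for n, c in enumerate(table.get((i, j), [0] * DIM)):
                    out[n] += coeff * c
        return tuple(out)
    return apply

def add(*vs):
    return tuple(sum(c) for c in zip(*vs))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

# Dual numbers: 1*1 = 1, 1*e = e*1 = e, e*e = 0.
mult = bilinear({(0, 0): [1, 0], (0, 1): [0, 1], (1, 0): [0, 1]})

def hochschild_b(gamma):
    """Left-hand side of [12]."""
    def bg(a0, a1, a2):
        return add(mult(a0, gamma(a1, a2)), sub(gamma(a0, mult(a1, a2)),
                   gamma(mult(a0, a1), a2)), sub((0,) * DIM, mult(gamma(a0, a1), a2)))
    return bg

def half_bracket(gamma):
    """Right-hand side of [12], i.e. (1/2)[gamma, gamma]."""
    return lambda a0, a1, a2: sub(gamma(gamma(a0, a1), a2), gamma(a0, gamma(a1, a2)))

def deformed(gamma):
    return lambda a, b: add(mult(a, b), gamma(a, b))

def satisfies_mc(gamma):
    return all(hochschild_b(gamma)(*t) == half_bracket(gamma)(*t)
               for t in product(BASIS, repeat=3))

def is_associative(m):
    return all(m(m(a0, a1), a2) == m(a0, m(a1, a2))
               for a0, a1, a2 in product(BASIS, repeat=3))

gamma_good = bilinear({(1, 1): [5, 0]})   # deforms e*e = 0 into e*e = 5
gamma_bad = bilinear({(1, 0): [0, 1]})    # adds e to the product e*1
```

Because all maps are multilinear, it suffices to test basis triples. Here `gamma_good` yields the associative algebra $k[x]/(x^2 - 5)$ and satisfies the Maurer-Cartan equation, whereas `gamma_bad` fails both tests.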

Let us apply these observations to the case where $A$
is the algebra $C^\infty(M)[[t]]$ of formal power series in one
variable with coefficients in the space of smooth
functions on a Poisson manifold $M$. By (a variant of)
the theorem of Hochschild-Kostant-Rosenberg and
Connes, one knows that in this case the cohomology of
$(\mathfrak{g}^\bullet, b)$ is given by formal power series with coefficients
in the space $\Gamma^\infty(\Lambda^\bullet TM)$ of antisymmetric vector fields.
Now, $\Gamma^\infty(\Lambda^\bullet TM)$ carries a natural Lie algebra bracket
as well, namely the Schouten bracket. Thus, one
obtains a second differential graded Lie algebra
$(\Gamma^\infty(\Lambda^\bullet TM)[[t]], 0, [\cdot,\cdot])$. Unfortunately, the projection
onto cohomology $(\mathfrak{g}^\bullet, b) \to \Gamma^\infty(\Lambda^\bullet TM)[[t]]$ does
not preserve the natural brackets, hence is not a quasi-isomorphism
in the category of differential graded Lie
algebras. It has been the fundamental observation by
Kontsevich that this defect can be cured as follows.


Theorem 15 (Kontsevich 2003). For every Poisson
manifold $M$ the differential graded Lie algebra
$(\mathfrak{g}^\bullet, b, [\cdot,\cdot])$ is formal in the sense that there exists
a quasi-isomorphism $(\mathfrak{g}^\bullet, b, [\cdot,\cdot]) \to (\Gamma^\infty(\Lambda^\bullet TM)[[t]], 0, [\cdot,\cdot])$
in the category of $L_\infty$-algebras.


Note that the theorem only claims the existence of
a quasi-isomorphism in the category of $L_\infty$-algebras
or, in other words, in the category of homotopy Lie
algebras. This is a notion somewhat weaker than a
differential graded Lie algebra, but Theorem 13 also
holds in the context of $L_\infty$-algebras.

Since the solutions of the Maurer-Cartan equation
in $(\Gamma^\infty(\Lambda^\bullet TM)[[t]], 0, [\cdot,\cdot])$ are exactly the
formal paths of Poisson bivector fields on $M$,
Kontsevich’s formality theorem entails:


Corollary 16 Every Poisson manifold has a formal 
deformation quantization. 


See also: Deformation Quantization; Deformation 
Quantization and Representation Theory; Deformations of 
the Poisson Bracket on a Symplectic Manifold; Fedosov 
Quantization; Holonomic Quantum Fields; Operads. 

Introduction to Deformation Quantization


The framework of classical mechanics, in its
Hamiltonian formulation on the motion space,
employs a symplectic manifold (or more generally a
Poisson manifold). Observables are families of
smooth functions on that manifold $M$. The dynamics
is defined in terms of a Hamiltonian $H \in C^\infty(M)$ and
the time evolution of an observable $f_t \in C^\infty(M \times \mathbb{R})$
is governed by the equation $(d/dt) f_t = -\{H, f_t\}$.

The quantum-mechanical framework, in its usual
Heisenberg formulation, employs a Hilbert space
(states are rays in that space). Observables are
families of self-adjoint operators on the Hilbert
space. The dynamics is defined in terms of a
Hamiltonian $H$, which is a self-adjoint operator,
and the time evolution of an observable $A_t$ is
governed by the equation $dA_t/dt = (i/\hbar)[H, A_t]$.

Quantization of a classical system is a way to pass
from classical to quantum results. A first idea for
quantization is to define a correspondence
$Q: f \mapsto Q(f)$ mapping a function $f$ to a self-adjoint
operator $Q(f)$ on a Hilbert space $\mathcal{H}$ in such a way
that $Q(1) = \mathrm{Id}$ and $[Q(f), Q(g)] = i\hbar\, Q(\{f, g\})$. Unfortunately,
there is no such correspondence defined on
all smooth functions on $M$ when one imposes an
irreducibility requirement (which is necessary in order not
to violate Heisenberg's principle).

Different mathematical treatments of quantization 
have appeared: 


• Geometric quantization of Kostant and Souriau:
first, prequantization of a symplectic manifold
$(M, \omega)$ where one builds a Hilbert space and a
correspondence $Q$ defined on all smooth functions
on $M$ but with no irreducibility; second, polarization
to “cut down the number of variables.”

• Berezin's quantization where one builds on a
particular class of symplectic manifolds (some
Kähler manifolds) a family of associative algebras
using a symbolic calculus, that is, a dequantization
procedure.

• Deformation quantization introduced by Flato,
Lichnerowicz, and Sternheimer in 1976 where
they “suggest that quantization be understood as
a deformation of the structure of the algebra of
classical observables rather than a radical change
in the nature of the observables.”


This deformation approach to quantization is part 
of a general deformation approach to physics 
(a seminal idea stressed by Flato): one looks at 
some level of a theory in physics as a deformation of 
another level. 

Deformation quantization is defined in terms of a 
star product which is a formal deformation of the 
algebraic structure of the space of smooth functions 
on a Poisson manifold. The associative structure 
given by the usual product of functions and the Lie 
structure given by the Poisson bracket are simulta- 
neously deformed. 

In this article we concentrate on some mathema- 
tical results concerning deformations of the Poisson 
bracket on a symplectic manifold, classification of 
star products on symplectic manifolds, group actions 
on star products, convergence properties of some 
star products, and star products on cotangent
bundles.


Deformations of the Poisson Bracket 
on a Symplectic Manifold 


Definition 1 A Poisson bracket defined on the
space of smooth functions on a manifold $M$ is an
$\mathbb{R}$-bilinear map on $C^\infty(M)$, $(u, v) \mapsto \{u, v\}$, such that
for any $u, v, w \in C^\infty(M)$:


(i) $\{u, v\} = -\{v, u\}$;
(ii) $\{\{u, v\}, w\} + \{\{v, w\}, u\} + \{\{w, u\}, v\} = 0$;
(iii) $\{u, vw\} = \{u, v\}w + \{u, w\}v$.


A Poisson bracket is given in terms of a contravariant
skew-symmetric 2-tensor $P$ on $M$ (called
the Poisson tensor) by $\{u, v\} = P(du \wedge dv)$. The
Jacobi identity for the Poisson bracket is equivalent
to the vanishing of the Schouten bracket
$[P, P] = 0$. (The Schouten bracket is the extension,
as a graded derivation for the exterior product,
of the bracket of vector fields to skew-symmetric
contravariant tensor fields.) A Poisson manifold
$(M, P)$ is a manifold $M$ with a Poisson bracket
defined by $P$.


A particular class of Poisson manifolds, essential
in classical mechanics, is the class of “symplectic
manifolds.” If $(M, \omega)$ is a symplectic manifold (i.e.,
$\omega$ is a closed nondegenerate 2-form on $M$) and if
$u, v \in C^\infty(M)$, the Poisson bracket of $u$ and $v$ is


$\{u, v\} := X_u(v) = \omega(X_v, X_u)$


where $X_u$ denotes the Hamiltonian vector field
corresponding to the function $u$, that is, such that
$i(X_u)\omega = du$. In coordinates the components of the
Poisson tensor $P$ form the inverse matrix of the
components $\omega_{ij}$ of $\omega$.

Duals of Lie algebras form the class of linear
Poisson manifolds. If $\mathfrak{g}$ is a Lie algebra, then its dual
$\mathfrak{g}^*$ is endowed with the Poisson tensor $P$ defined by
$P_\xi(X, Y) := \xi([X, Y])$ for $X, Y \in \mathfrak{g} = (\mathfrak{g}^*)^* \sim (T_\xi \mathfrak{g}^*)^*$.


Definition 2 A Poisson deformation of the Poisson
bracket on a Poisson manifold $(M, P)$ is a Lie
algebra deformation of $(C^\infty(M), \{\,,\})$ which is a
derivation in each argument, that is, of the form
$\{u, v\}_\nu = P_\nu(du, dv)$, where $P_\nu = P + \sum_{k \geq 1} \nu^k P_k$ is a
series of skew-symmetric contravariant 2-tensors
on $M$ (such that $[P_\nu, P_\nu] = 0$).


Two Poisson deformations $P_\nu$ and $P'_\nu$ of the
Poisson bracket $P$ on a Poisson manifold $(M, P)$
are equivalent if there exists a formal path in the
diffeomorphism group of $M$, starting at the identity,
that is, a series $T = \exp D = \mathrm{Id} + \sum_{j \geq 1} (1/j!) D^j$ for
$D = \sum_{r \geq 1} \nu^r D_r$, where the $D_r$ are vector fields on $M$,
such that


$T\{u, v\}_\nu = \{Tu, Tv\}'_\nu$
where $\{u, v\}_\nu = P_\nu(du, dv)$ and $\{u, v\}'_\nu = P'_\nu(du, dv)$.


Proposition 3 (Flato et al. 1975, Lecomte 1987).
On a symplectic manifold $(M, \omega)$, any Poisson
deformation of the Poisson bracket corresponds to
a series of closed 2-forms on $M$, $\Omega_\nu = \omega + \sum_{r \geq 1} \nu^r \omega_r$,
and is given by


$\{u, v\}_\nu = P_\nu(du, dv) = \Omega_\nu(X^\Omega_v, X^\Omega_u)$


with $i(X^\Omega_u)\Omega_\nu = du$. The equivalence classes of Poisson
deformations of the Poisson bracket $P$ are
parametrized by $H^2(M; \mathbb{R})[[\nu]]$.

Poisson deformations are used in classical
mechanics to express some constraints on the
system. To deal with quantum mechanics, Flato
et al. (1976) introduced star products. These give,
by skew-symmetrization, Lie deformations of the
Poisson bracket.


Definition 4 A “star product” on $(M, P)$ is an
$\mathbb{R}[[\nu]]$-bilinear associative product $*$ on $C^\infty(M)[[\nu]]$
given by


$u * v = u *_\nu v := \sum_{r \geq 0} \nu^r C_r(u, v)$


for $u, v \in C^\infty(M)$ (we consider here real-valued
functions; the results for complex-valued functions
are similar), such that $C_0(u, v) = uv$, $C_1(u, v) -
C_1(v, u) = \{u, v\}$, and $1 * u = u * 1 = u$.


When the $C_r$’s are bidifferential operators on $M$,
one speaks of a differential star product. When each
$C_r$ is a bidifferential operator of order at most $r$ in
each argument, one speaks of a natural star product.

One finds in the literature other normalizations
for the skew-symmetric part of $C_1$ such as $(i/2)\{\,,\}$;
these amount to a rescaling of the parameter $\nu$. For
physical applications, in the above convention for
the formal parameter, $\nu$ corresponds to $i\hbar$, where $\hbar$
is Planck's constant.

In the case of complex-valued functions, one can
add the further requirement that the complex conjugation
is a $*$-involution for $*$, that is, $\overline{f * g} = \bar{g} * \bar{f}$.
According to the interpretation of $\nu$ as being $i\hbar$, we
have to require $\bar{\nu} = -\nu$. Star products satisfying this
additional property are called symmetric or Hermitian.

A star product can also be defined not on the
whole of $C^\infty(M)$ but on a subspace $N$ which is stable
under pointwise multiplication and Poisson bracket.

The simplest example of a deformation quantization
is the Moyal product for the Poisson structure $P$
on a vector space $V = \mathbb{R}^m$ with constant coefficients:


$P = \sum_{i,j} P^{ij}\, \partial_i \wedge \partial_j, \qquad P^{ij} = -P^{ji} \in \mathbb{R}$

where $\partial_i = \partial/\partial y^i$ is the partial derivative in the
direction of the coordinate $y^i$, $i = 1, \ldots, m$. The
formula for the Moyal product is


$(u *_M v)(z) = \exp\Big(\frac{\nu}{2} P^{rs} \partial_{x^r} \partial_{y^s}\Big)\big(u(x) v(y)\big)\Big|_{x=y=z}$    [1]


When $P$ is nondegenerate (so $V = \mathbb{R}^{2n}$), the space of
formal power series of polynomials on $V$ with
Moyal product is called the formal Weyl algebra
$W = (S(V)[[\nu]], *_M)$.
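
For polynomials the exponential series in formula [1] terminates, so the Moyal product can be computed exactly. The sketch below is illustrative only and rests on assumptions not in the article: it takes $V = \mathbb{R}^2$ with coordinates $(q, p)$ and the constant Poisson structure normalized so that $\{q, p\} = 1$, and it stores a polynomial as a dictionary `{(i, j, k): coeff}` standing for $q^i p^j \nu^k$. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def clean(poly):
    return {m: c for m, c in poly.items() if c != 0}

def add(u, v, s=1):
    """Return u + s*v for polynomials stored as {(i, j, k): coeff}."""
    out = dict(u)
    for m, c in v.items():
        out[m] = out.get(m, 0) + s * c
    return clean(out)

def diff(poly, dq, dp):
    """Apply (d/dq)^dq (d/dp)^dp."""
    out = {}
    for (i, j, k), c in poly.items():
        if i >= dq and j >= dp:
            factor = (factorial(i) // factorial(i - dq)) * (factorial(j) // factorial(j - dp))
            out[(i - dq, j - dp, k)] = c * factor
    return out

def pointwise(u, v, scalar, nu_shift):
    """scalar * nu**nu_shift * u * v with the commutative product."""
    out = {}
    for (i1, j1, k1), c1 in u.items():
        for (i2, j2, k2), c2 in v.items():
            m = (i1 + i2, j1 + j2, k1 + k2 + nu_shift)
            out[m] = out.get(m, 0) + scalar * c1 * c2
    return out

def degree(poly):
    return max((i + j for (i, j, _k) in poly), default=0)

def moyal(u, v):
    """Sum over n of (nu/2)^n / n! times the n-th power of d_q (x) d_p - d_p (x) d_q."""
    result = {}
    for n in range(min(degree(u), degree(v)) + 1):
        for k in range(n + 1):
            du = diff(u, k, n - k)          # d_q^k d_p^(n-k) u
            dv = diff(v, n - k, k)          # d_p^k d_q^(n-k) v
            if du and dv:
                scalar = Fraction((-1) ** (n - k) * comb(n, k), 2 ** n * factorial(n))
                result = add(result, pointwise(du, dv, scalar, n))
    return clean(result)

Q = {(1, 0, 0): Fraction(1)}
P = {(0, 1, 0): Fraction(1)}
```

With these conventions `moyal(Q, P)` equals $qp + \nu/2$ and the commutator `add(moyal(Q, P), moyal(P, Q), s=-1)` equals $\nu$, in agreement with the normalization $C_1(u, v) - C_1(v, u) = \{u, v\}$ of Definition 4.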

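Let $\mathfrak{g}^*$ be the dual of a Lie algebra $\mathfrak{g}$. The algebra of
polynomials on $\mathfrak{g}^*$ is identified with the symmetric
algebra $S(\mathfrak{g})$. One defines a new associative law on this
algebra by a transfer of the product $\circ$ in the universal
enveloping algebra $U(\mathfrak{g})$, via the bijection between
$S(\mathfrak{g})$ and $U(\mathfrak{g})$ given by the total symmetrization $\sigma$:


$\sigma: S(\mathfrak{g}) \to U(\mathfrak{g}): \; X_1 \cdots X_k \mapsto \frac{1}{k!} \sum_{\rho \in S_k} X_{\rho(1)} \circ \cdots \circ X_{\rho(k)}$


Then $U(\mathfrak{g}) = \bigoplus_{n \geq 0} U_n$, where $U_n := \sigma(S^n(\mathfrak{g}))$, and we
decompose an element $u \in U(\mathfrak{g})$ accordingly,
$u = \sum u_n$. We define, for $P \in S^p(\mathfrak{g})$ and $Q \in S^q(\mathfrak{g})$,


$P * Q = \sum_{n \geq 0} \nu^n\, \sigma^{-1}\big((\sigma(P) \circ \sigma(Q))_{p+q-n}\big)$    [2]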


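This yields a differential star product on $\mathfrak{g}^*$ (Gutt
1983). This star product can be written with an
integral formula (for $\nu = 2\pi i$) (Drinfeld 1987):


$u * v(\xi) = \int_{\mathfrak{g} \times \mathfrak{g}} \hat{u}(X)\, \hat{v}(Y)\, e^{2i\pi \langle \xi, \mathrm{CBH}(X, Y) \rangle}\, dX\, dY$


where $\hat{u}(X) = \int_{\mathfrak{g}^*} u(\eta)\, e^{-2i\pi \langle \eta, X \rangle}\, d\eta$ and CBH denotes the
Campbell-Baker-Hausdorff formula for the product
of elements in the group in a logarithmic chart
($\exp X \exp Y = \exp \mathrm{CBH}(X, Y)$ for all $X, Y \in \mathfrak{g}$). We call
this the standard (or CBH) star product on the dual
of a Lie algebra.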

De Wilde and Lecomte (1983) proved that on 
any symplectic manifold there exists a differential 
star product. Fedosov (1994) gave a recursive 
construction of a star product on a symplectic 
manifold (M,w) constructing flat connections on 
the Weyl bundle. Omori et al. (1991) gave an 
alternative proof of existence of a differential star 
product on a symplectic manifold, gluing local 
Moyal star products. In 1997, Kontsevich gave a 
proof of the existence of a star product on any 
Poisson manifold and gave an explicit formula for a 
star product for any Poisson structure on $V = \mathbb{R}^m$.
This appeared as a consequence of the proof of his 
formality theorem. 


Fedosov's Construction of Star Products 


Fedosov's construction gives a star product on a 
symplectic manifold (M,w), when one has chosen a 
symplectic connection and a sequence of closed 
2-forms on M. The star product is obtained by 
identifying the space C*(M)[[v]] with an algebra of 
flat sections of the so-called Weyl bundle endowed 
with a flat connection whose construction is related 
to the choice of the sequence of closed 2-forms on M. 


Definition 5 The symplectic group $Sp(n, \mathbb{R})$ acts by
automorphisms on the formal Weyl algebra $W$. If
$(M, \omega)$ is a symplectic manifold, we can form its
bundle $F(M)$ of symplectic frames which is a
principal $Sp(n, \mathbb{R})$-bundle over $M$. The associated
bundle $\mathcal{W} = F(M) \times_{Sp(n, \mathbb{R})} W$ is a bundle of associative
algebras on $M$ called the Weyl bundle. Sections
of the Weyl bundle have the form of formal series


$a(x, y, \nu) = \sum_{2k + l \geq 0} \nu^k\, a_{(k), i_1 \ldots i_l}(x)\, y^{i_1} \cdots y^{i_l}$


where the coefficients $a_{(k)}$ are symmetric covariant
$l$-tensor fields on $M$. The product of two sections
taken pointwise makes the space of sections into an
algebra, and in terms of the above representation of
sections the multiplication has the form

$(a \circ b)(x, y, \nu) = \Big(\exp\Big(\frac{\nu}{2} P^{ij} \frac{\partial}{\partial y^i} \frac{\partial}{\partial z^j}\Big) a(x, y, \nu)\, b(x, z, \nu)\Big)\Big|_{y=z}$


Note that the center of this algebra coincides with
$C^\infty(M)[[\nu]]$.


A symplectic connection on $M$ is a linear torsion-free
connection $\nabla$ such that $\nabla\omega = 0$.


Remark 6 It is well known that such connections
always exist but, unlike the Riemannian case, are
not unique. To see the existence, take any torsion-free
connection $\nabla'$ and set $T(X, Y, Z) = (\nabla'_X \omega)(Y, Z)$.
Define $S$ by $\omega(S(X, Y), Z) = (1/3)(T(X, Y, Z) +
T(Y, X, Z))$; then $\nabla_X Y = \nabla'_X Y + S(X, Y)$ defines a
symplectic connection.


The connection $\nabla$ induces a covariant derivative
on sections of the Weyl bundle, denoted $\partial$. The idea
is to try to modify it to have zero curvature.
Consider $Da = \partial a - \delta(a) - (1/\nu)[r, a]$, where $r$ is a
1-form with values in $\mathcal{W}$, with $[a, a'\,dx] = (a \circ a' -
a' \circ a)\,dx$ and $\delta(a) = (1/\nu)\big[\sum_{ij} \omega_{ij} y^i dx^j, a\big]$.


Theorem 7 (Fedosov 1994). For a given series
$\Omega = \sum_{i \geq 1} \nu^i \omega_i$ of closed 2-forms on $M$, there is a
unique $r \in \Gamma(\mathcal{W} \otimes \Lambda^1)$ satisfying some normalization
condition, so that $Da = \partial a - \delta(a) - (1/\nu)[r, a]$ is flat.
For any $a_0 \in C^\infty(M)[[\nu]]$, there is a unique $a$ in the
subspace $\mathcal{W}_D$ of flat sections of $\mathcal{W}$, such that
$a(x, 0, \nu) = a_0(x, \nu)$. The use of this linear isomorphism
to transport the algebra structure of $\mathcal{W}_D$ to
$C^\infty(M)[[\nu]]$ defines the star product of Fedosov $*_{\nabla, \Omega}$.


Writing $*_{\nabla, \Omega} = \sum_{r \geq 0} \nu^r C_r$, $C_r$ only depends on $\omega_i$
for $i < r$ and $C_{r+1}(u, v) = c\, \omega_r(X_u, X_v) + \tilde{C}_{r+1}(u, v)$,
where $c \in \mathbb{R}$ and the last term does not depend on $\omega_r$.


Classification of Star Products 
on a Symplectic Manifold 


Star products on a manifold $M$ are examples of
deformations of associative algebras (in the sense of
Gerstenhaber). Their study uses the Hochschild
cohomology of the algebra (here $C^\infty(M)$ with values
in $C^\infty(M)$) where $p$-cochains are $p$-linear maps from
$(C^\infty(M))^p$ to $C^\infty(M)$ and where the Hochschild
coboundary operator maps the $p$-cochain $C$ to the
$(p+1)$-cochain


$(\partial C)(u_0, \ldots, u_p) = u_0\, C(u_1, \ldots, u_p)$
$\quad + \sum_{r=1}^{p} (-1)^r\, C(u_0, \ldots, u_{r-1} u_r, \ldots, u_p)$
$\quad + (-1)^{p+1}\, C(u_0, \ldots, u_{p-1})\, u_p$


For differential star products, we consider differential
cochains given by differential operators on each
argument. The associativity condition for a star
product at order $k$ in the parameter $\nu$ reads


$(\partial C_k)(u, v, w) = \sum_{r+s=k,\; r,s>0} \big(C_r(C_s(u, v), w) - C_r(u, C_s(v, w))\big)$


If one has cochains $C_j$, $j < k$, such that the star
product they define is associative to order $k - 1$,
then the right-hand side above is a cocycle
($\partial(\mathrm{RHS}) = 0$) and one can extend the star product
to order $k$ if it is a coboundary ($\mathrm{RHS} = \partial(C_k)$).
Denoting by $m$ the usual multiplication of functions,
and writing $* = m + C$, where $C$ is a formal
series of multidifferential operators, the associativity
also reads $\partial C = [C, C]$ where the bracket on the
right-hand side is the graded Lie algebra bracket on
$D_{\mathrm{poly}}(M)[[\nu]] = \{\text{multidifferential operators}\}$.


Theorem 8 (Vey 1975). Every differential $p$-cocycle
$C$ on a manifold $M$ is the sum of the coboundary of a
differential $(p-1)$-cochain and a 1-differential skew-symmetric
$p$-cocycle $A$: $C = \partial B + A$. In particular, a
cocycle is a coboundary if and only if its total skew-symmetrization,
which is automatically 1-differential
in each argument, vanishes. Given a connection $\nabla$ on
$M$, $B$ can be defined from $C$ by universal formulas
(Cahen and Gutt 1982). Also


$H^p_{\mathrm{diff}}(C^\infty(M), C^\infty(M)) = \Gamma(\Lambda^p TM)$


The similar result about continuous cochains is
due to Connes (1985). In the somewhat pathological
case of completely general cochains, the full cohomology
is not known.


Definition 9 Two star products $*$ and $*'$ on $(M, P)$
are said to be equivalent if there is a series of linear
operators on $C^\infty(M)$, $T = \mathrm{Id} + \sum_{r \geq 1} \nu^r T_r$, such that


$T(f * g) = Tf *' Tg$    [3]


Note that the $T_r$ automatically vanish on constants
since 1 is a unit for $*$ and for $*'$.


If $*$ and $*'$ are equivalent differential star products,
then the equivalence is given by differential operators
$T_r$; if they are natural, the equivalence is given by
$T = \mathrm{Exp}\, E$ with $E = \sum_{r \geq 1} \nu^r E_r$, where the $E_r$ are
differential operators of order at most $r + 1$.

Nest and Tsygan (1995), then Deligne (1995) and
Bertelson et al. (1995, 1997) proved that any
differential star product on a symplectic manifold
$(M, \omega)$ is equivalent to a Fedosov star product and
that its equivalence class is parametrized by the
corresponding element in $H^2(M; \mathbb{R})[[\nu]]$.
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Kontsevich (IHES preprint 97) proved that the 
coincidence of the set of equivalence classes of star 
and Poisson deformations is true for general Poisson 
manifolds: 


Theorem 10 (Kontsevich). The set of equivalence
classes of differential star products on a Poisson
manifold $(M, P)$ can be naturally identified with
the set of equivalence classes of Poisson deformations
of $P$: $P_\nu = P\nu + P_2\nu^2 + \cdots \in \nu\Gamma(X, \Lambda^2 T_X)[[\nu]]$,
$[P_\nu, P_\nu] = 0$.


Deligne (1995) defines cohomological classes 
associated to differential star products on a sym- 
plectic manifold; this leads to an intrinsic way to 
parametrize the equivalence class of such a differential
star product. The characteristic class $c(*)$ is
given in terms of the skew-symmetric part of the
term of order 2 in $\nu$ in the star product and in terms
of local ("$\nu$-Euler") derivations of the form
$D = \nu(\partial/\partial\nu) + X + \sum_{r\ge 1} \nu^r D'_r$. This characteristic
class has the following properties:


• The map $C$ from equivalence classes of star
products on $(M,\omega)$ to the affine space
$-[\omega]/\nu + H^2(M;\mathbb{R})[[\nu]]$ mapping $[*]$ to $c(*)$ is a bijection.

• The characteristic class is natural relative to
diffeomorphisms and is equivariant under a
change of parameter (Gutt and Rawnsley 1995).

• The characteristic class $c(*)$ coincides (cf. Deligne
(1995) and Neumaier (1999)) for Fedosov-type
star products with their characteristic class introduced
by Fedosov as the de Rham class of the
curvature of the generalized connection used to
build them (up to a sign and factors of 2).


Index theory has been introduced in the frame- 
work of deformation quantization by Fedosov 
(1996) and by Nest and Tsygan (1995, 1996). We 
refer to the papers of Bressler, Nest, and Tsygan for 
further developments in that subject. A first tool in 
that theory is the existence of a trace for the 
deformed algebra; this trace is essentially unique in 
the framework of symplectic manifolds (an elemen- 
tary proof is given in Karabegov (1998) and Gutt 
and Rawnsley (2003)); the trace is not unique for 
more general Poisson manifolds. 


Definition 11 A homomorphism from a differential
star product $*$ on $(M,P)$ to a differential star
product $*'$ on $(M',P')$ is an $\mathbb{R}$-linear map
$A : C^\infty(M)[[\nu]] \to C^\infty(M')[[\nu]]$, continuous in the
$\nu$-adic topology, such that


$$A(u * v) = Au *' Av$$


It is an isomorphism if the map is bijective.


Any isomorphism between two differential star
products on symplectic manifolds is the combination
of a change of parameter and a $\nu$-linear isomorphism.
Any $\nu$-linear isomorphism between two star
products $*$ on $(M,\omega)$ and $*'$ on $(M',\omega')$ is the
combination of the action on functions of a
symplectomorphism $\psi : M' \to M$ and an equivalence
between $*$ and the pullback via $\psi$ of $*'$. It exists if
and only if those two star products are equivalent,
that is, if and only if $(\psi^{-1})^* c(*') = c(*)$, where
$(\psi^{-1})^*$ denotes the action of $\psi^{-1}$ on the second
de Rham cohomology space. In particular, a
symplectomorphism $\psi$ of a symplectic manifold can
be extended to a $\nu$-linear automorphism of a given
differential star product on $(M,\omega)$ if and only if
$\psi^* c(*) = c(*)$ (Gutt and Rawnsley 1999).

The notion of homomorphism and its relation to 
modules has been studied by Bordemann (2004). 

The link between the notion of star product on a 
symplectic manifold and symplectic connections 
already appears in the seminal paper of Bayen 
et al. (1978), and was further developed by 
Lichnerowicz (1982), who showed that any Vey 
star product (i.e., a star product defined by
bidifferential operators whose principal symbols at 
each order coincide with those of the Moyal star 
product) determines a unique symplectic connection. 
Fedosov's construction yields a Vey star product on 
any symplectic manifold starting from a symplectic 
connection and a formal series of closed 2-forms on 
the manifold. Furthermore, any star product is 
equivalent to a Fedosov star product and the 
de Rham class of the formal 2-form determines the 
equivalence class of the star product. On the other 
hand, many star products which appear in natural 
contexts (e.g., cotangent bundles or Kähler manifolds)
are not Vey star products but are natural star
products. 


Theorem 12 (Gutt and Rawnsley 2004). Any 
natural star product on a symplectic manifold 
(M,w) determines uniquely 


(i) A symplectic connection $\nabla = \nabla(*)$.
(ii) A formal series of closed 2-forms
$\Omega = \Omega(*) \in \nu\Lambda^2(M)[[\nu]]$.

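(iii) A formal series $E = \sum_{r\ge 1} \nu^r E_r$ of differential
operators of order $\le r+1$ ($E$ of order $\le 2$), with
$E_r u = \sum_{k=2}^{r+1} (E_r^{(k)})^{i_1 \ldots i_k} \nabla^k_{i_1 \ldots i_k} u$, where the $E_r^{(k)}$ are
symmetric contravariant $k$-tensor fields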


such that 


$$u * v = \exp(-E)\bigl((\exp E\, u) *_{\nabla,\Omega} (\exp E\, v)\bigr) \qquad [4]$$


We denote $* = *_{\nabla,\Omega,E}$. If $\tau$ is a diffeomorphism of $M$
then the data for $\tau\cdot *$ is $\tau\cdot\nabla$, $\tau\cdot\Omega$, and $\tau\cdot E$. In
particular, a vector field $X$ is a derivation of a
natural star product $*_{\nabla,\Omega,E}$ if and only if $\mathcal{L}_X\omega = 0$,
$\mathcal{L}_X\Omega = 0$, $\mathcal{L}_X\nabla = 0$, and $\mathcal{L}_X E = 0$.


Group Actions on Star Products 


Symmetries in quantum theories are automorphisms
of an algebra of observables. In the framework
where quantization is defined in terms of a star
product, a symmetry $\sigma$ of a star product $*$ is an
automorphism of the $\mathbb{R}[[\nu]]$-algebra $C^\infty(M)[[\nu]]$ with
multiplication given by $*$:


$$\sigma(u * v) = \sigma(u) * \sigma(v), \qquad \sigma(1) = 1,$$


where $\sigma$, being determined by what it does on
$C^\infty(M)$, will be a formal series $\sigma(u) = \sum_{r\ge 0} \nu^r \sigma_r(u)$
of linear maps $\sigma_r : C^\infty(M) \to C^\infty(M)$. We denote by
$\mathrm{Aut}_{\mathbb{R}[[\nu]]}(M, *)$ the set of those symmetries.

Any such automorphism $\sigma$ of $*$ then can be
written as $\sigma(u) = T(u \circ \tau^{-1})$, where $\tau$ is a Poisson
diffeomorphism of $(M, P)$ and $T = \mathrm{Id} + \sum_{r\ge 1} \nu^r T_r$ is
a formal series of linear maps. If $*$ is differential,
then the $T_r$ are differential operators; if $*$ is natural,
then $T = \operatorname{Exp} E$ with $E = \sum_{r\ge 1} \nu^r E_r$ and $E_r$ is a
differential operator of order at most $r+1$.

If $\sigma_t$ is a one-parameter group of symmetries of
the star product $*$, then its generator $D$ will be a
derivation of $*$. Denote the Lie algebra of $\nu$-linear
derivations of $*$ by $\mathrm{Der}_{\mathbb{R}[[\nu]]}(M, *)$.

An action of a Lie group $G$ on a star product $*$ on
a Poisson manifold $(M,P)$ is a homomorphism
$\sigma : G \to \mathrm{Aut}_{\mathbb{R}[[\nu]]}(M, *)$; then $\sigma_g = (\tau_g^{-1})^* + O(\nu)$ and
there is an induced Poisson action $\tau$ of $G$ on $(M, P)$.

Given a Poisson action $\tau$ of $G$ on $(M,P)$, a star
product is said to be "invariant" under $G$ if all the
$(\tau_g^{-1})^*$ are automorphisms of $*$.

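An action of a Lie group $G$ on $*$ induces a
homomorphism of Lie algebras $D : \mathfrak{g} \to \mathrm{Der}_{\mathbb{R}[[\nu]]}(M, *)$.
For each $\xi \in \mathfrak{g}$, $D_\xi = \xi^* + \sum_{r\ge 1} \nu^r D^r_\xi$, where
$\xi^*$ is the fundamental vector field on $M$ defined by $\tau$;
hence,


$$\xi^*(x) = \frac{d}{dt}\Big|_{0}\, \tau(\exp(-t\xi))\,x$$


Such a homomorphism $D : \mathfrak{g} \to \mathrm{Der}_{\mathbb{R}[[\nu]]}(M, *)$ is
called an action of the Lie algebra $\mathfrak{g}$ on $*$.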


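Proposition 13 (Arnal et al. 1983). Given
$D : \mathfrak{g} \to \mathrm{Der}_{\mathbb{R}[[\nu]]}(M, *)$ a homomorphism so that for each
$\xi \in \mathfrak{g}$, $D_\xi = \xi^* + \sum_{r\ge 1} \nu^r D^r_\xi$, where $\xi^*$ are the
fundamental vector fields on $M$ defined by an action $\tau$ of
$G$ on $M$ and the $D^r_\xi$ are differential operators, then
there exists a local homomorphism
$\sigma : U \subset G \to \mathrm{Aut}_{\mathbb{R}[[\nu]]}(M, *)$ so that $\sigma_* = D$.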


If we want the analog in our framework to the 
requirement that operators should correspond to the 
infinitesimal actions of a Lie algebra, we should ask 
the derivations to be inner so that functions are 
associated to the elements of the Lie algebra. 

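A derivation $D \in \mathrm{Der}_{\mathbb{R}[[\nu]]}(M, *)$ is said to be
essentially inner or Hamiltonian if $D = (1/\nu)\,\mathrm{ad}_* u$
for some $u \in C^\infty(M)[[\nu]]$. We call an action of a Lie
group almost $*$-Hamiltonian if each $D_\xi$ is essentially
inner; this is equivalent to the knowledge of a linear
map $\lambda : \mathfrak{g} \to C^\infty(M)[[\nu]]$, $\xi \mapsto \lambda_\xi$, with
$D_\xi = (1/\nu)\,\mathrm{ad}_* \lambda_\xi$, so that
$\mathrm{ad}_*\bigl((1/\nu)[\lambda_\xi, \lambda_\eta]_*\bigr) = \mathrm{ad}_* \lambda_{[\xi,\eta]}$.

We say the action is $*$-Hamiltonian if $\lambda_\xi$ can be
chosen to make


$$\mathfrak{g} \to C^\infty(M)[[\nu]], \qquad \xi \mapsto \lambda_\xi$$

a homomorphism of Lie algebras, where $C^\infty(M)[[\nu]]$
is endowed with the bracket $(1/\nu)[\ ,\ ]_*$. Such a
homomorphism is called a quantization in Arnal
et al. (1983) and is called a generalized moment map
in Bordemann et al. (1998).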

When a map $\mu_0 : \mathfrak{g} \to C^\infty(M)$ is a generalized
moment map, that is,


$$\frac{1}{\nu}\bigl(\mu_0(\xi) * \mu_0(\eta) - \mu_0(\eta) * \mu_0(\xi)\bigr) = \mu_0([\xi,\eta])$$


the star product is said to be covariant under $\mathfrak{g}$.

When a map $\mu : \mathfrak{g} \to C^\infty(M)[[\nu]]$ is a generalized
moment map, so that $D_\xi$ has no terms in $\nu$ of
degree $> 0$, thus $D_\xi = \xi^*$, this map is called a
quantum moment map (Xu 1998). Clearly in that
situation, the star product is invariant under the
action of $\mathfrak{g}$ on $M$.

Covariant star products have been considered in order to
study the representation theory of some classes of Lie
groups in terms of star products. In particular, an 
autonomous star formulation of the theory of 
representations of nilpotent Lie groups has been 
given by Arnal and Cortet (1984, 1985). 

Consider a differential star product $*$ on a
symplectic manifold, admitting an algebra $\mathfrak{g}$ of vector
fields on $M$ consisting of derivations of $*$, and assume
there is a symplectic connection $\nabla$ which is invariant
under $\mathfrak{g}$; then $*$ is equivalent, through an equivariant
equivalence ($T$ with $\mathcal{L}_X T = 0$), to a Fedosov star
product $*_{\nabla,\Omega}$; this yields a classification of such
invariant star products (Bertelson et al. 1998).


Proposition 14 (Kravchenko, Gutt and Rawnsley,
Müller-Bahns, Neumaier, and Hamachi). Consider
a Fedosov star product $*_{\nabla,\Omega}$ on a symplectic
manifold. A vector field $X$ is a derivation of $*_{\nabla,\Omega}$ if
and only if $\mathcal{L}_X\omega = 0$, $\mathcal{L}_X\Omega = 0$, and $\mathcal{L}_X\nabla = 0$. A
vector field $X$ is an inner derivation of $* = *_{\nabla,\Omega}$ if
and only if $\mathcal{L}_X\nabla = 0$ and there exists a series of
functions $\lambda_X$ such that


$$i(X)\omega - i(X)\Omega = d\lambda_X$$


In this case $X(u) = (1/\nu)(\mathrm{ad}_* \lambda_X)(u)$.


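On a symplectic manifold $(M,\omega)$, a vector field $X$
is an inner derivation of the natural star product
$* = *_{\nabla,\Omega,E}$ if and only if $\mathcal{L}_X\nabla = 0$, $\mathcal{L}_X E = 0$, and
there exists a series of functions $\lambda_X$ such that


$$i(X)\omega - i(X)\Omega = d\lambda_X$$


Then $X = (1/\nu)\,\mathrm{ad}_* \mu_X$ with $\mu_X = \operatorname{Exp}(E^{-1})\lambda_X$.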

Let $G$ be a compact Lie group of symplectomorphisms
of $(M,\omega)$ and $\mathfrak{g}$ the corresponding Lie
algebra of symplectic vector fields on $M$. Consider
a star product $*$ on $M$ which is invariant
under $G$. The Lie algebra $\mathfrak{g}$ consists of inner
derivations for $*$ if and only if there exists a series
of functions $\lambda_X$ and a representative $(1/\nu)(\omega - \Omega)$
of the characteristic class of $*$ such that
$i(X)\omega - i(X)\Omega = d\lambda_X$.

Star products which are invariant and covariant
are used in the problem of reduction: this is a
device in symplectic geometry which allows one to
reduce the number of variables. An important
issue in quantization is to know if and how
quantization commutes with reduction. This problem
has been studied by Fedosov for the action of
a compact group on the particular star products
constructed by him with trivial characteristic class
($*_{\nabla,0}$). Here, one indeed obtains some "quantization
commutes with reduction" statements. More
generally, Bordemann, Herbig, and Waldmann
considered covariant star products. In this case,
one can construct a classical and quantum BRST
complex whose cohomology describes the algebra of
observables for the reduced system. While this is
well known classically (at least under some
regularity assumptions on the group action), for
the quantized situation the nontrivial question is
whether the quantum BRST cohomology is "as large
as" the classical one. Clearly, from the physical
point of view, this is crucial. It turns out that
whereas for strongly invariant star products one
indeed obtains a quantization of the reduced phase
space, in general the quantum BRST cohomology
might be too small. More general situations of
reduction have also been discussed by, for example,
Bordemann as well as Cattaneo and Felder, when a
coisotropic (i.e., first class) constraint manifold is
given.


Convergence of Some Star Products 
on a Subclass of Functions 


Let $(M,P)$ be a Poisson manifold and let $*$ be a
differential star product on it with 1 acting as the
identity. Observe that if there exists a value $k$ of $\nu$
such that


$$u *_k v = \sum_{r=0}^{\infty} k^r C_r(u, v)$$


converges (for the pointwise convergence of functions),
for all $u, v \in C^\infty(M)$, to $F_k(u,v)$ in such a
way that $F_k$ is associative, then $F_k(u,v) = uv$. This is
easy to see as the order of differentiation in the $C_r$
necessarily is at least in each argument and thus 
the Borel lemma immediately gives the result. So 
assuming “too much" convergence kills all defor- 
mations. On the other hand, in any physical 
situation, one needs some convergence properties 
to be able to compute the spectrum of quantum 
observables in terms of a star product (as in Bayen 
et al. 1978). 

In the example of the Moyal star product on the
symplectic vector space $(\mathbb{R}^{2n}, \omega)$, the formal formula


$$(u *_M v)(z) = \exp\Bigl(\frac{\nu}{2} P^{rs} \partial_{x^r} \partial_{y^s}\Bigr)\bigl(u(x)v(y)\bigr)\Big|_{x=y=z}$$


obviously converges when $u$ and $v$ are polynomials.
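On the other hand, there is an integral formula for
the Moyal star product given by


$$(u * v)(\xi) = (\pi\hbar)^{-2n} \int u(\xi')\, v(\xi'')
\exp\Bigl(\frac{2i}{\hbar}\bigl(\omega(\xi,\xi'') + \omega(\xi'',\xi') + \omega(\xi',\xi)\bigr)\Bigr)\, d\xi'\, d\xi''$$


and this product $*$ gives a structure of associative
algebra on the space of rapidly decreasing functions
$\mathcal{S}(\mathbb{R}^{2n})$. The formal formula converges (for $\nu = i\hbar$) in
the topology of $\mathcal{S}'$ for $u$ and $v$ with compactly
supported Fourier transform.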
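
The termination of the Moyal series on polynomials can be checked directly. The following is a minimal Python sketch on $\mathbb{R}^2$ with coordinates $(q, p)$, assuming the constant Poisson structure normalized so that $\{q, p\} = 1$. The dictionary encoding of polynomials and the helper names (`add`, `scale`, `mul`, `diff`, `moyal`) are illustrative choices made for this sketch, not notation from the text.

```python
from fractions import Fraction
from math import comb, factorial

# A polynomial in (q, p, nu) is stored as {(i, j, n): c}, meaning c * q^i * p^j * nu^n.

def add(u, v):
    """Sum of two polynomials, dropping zero coefficients."""
    out = dict(u)
    for key, c in v.items():
        out[key] = out.get(key, 0) + c
        if out[key] == 0:
            del out[key]
    return out

def scale(u, c, nu_shift=0):
    """Multiply u by the scalar c and by nu^nu_shift."""
    return {(i, j, n + nu_shift): c * a for (i, j, n), a in u.items() if c * a != 0}

def mul(u, v):
    """Pointwise (commutative) product of two polynomials."""
    out = {}
    for (i, j, n), a in u.items():
        for (k, l, m), b in v.items():
            key = (i + k, j + l, n + m)
            out[key] = out.get(key, 0) + a * b
    return {key: c for key, c in out.items() if c != 0}

def diff(u, dq, dp):
    """Apply (d/dq)^dq (d/dp)^dp to u."""
    out = {}
    for (i, j, n), a in u.items():
        if i >= dq and j >= dp:
            factor = (factorial(i) // factorial(i - dq)) * (factorial(j) // factorial(j - dp))
            out[(i - dq, j - dp, n)] = a * factor
    return out

def moyal(u, v):
    """exp((nu/2)(d_q x d_p - d_p x d_q)) applied to (u, v); finite on polynomials."""
    degree = max((i + j for (i, j, _) in u), default=0)
    out = {}
    for r in range(degree + 1):          # derivatives of order > degree kill u
        for s in range(r + 1):           # binomial expansion of the r-th power
            coeff = Fraction((-1) ** s * comb(r, s), 2 ** r * factorial(r))
            term = mul(diff(u, r - s, s), diff(v, s, r - s))
            out = add(out, scale(term, coeff, nu_shift=r))
    return out

q = {(1, 0, 0): 1}
p = {(0, 1, 0): 1}
commutator = add(moyal(q, p), scale(moyal(p, q), -1))
```

With these conventions `commutator` is the single term $\nu$, i.e. the dictionary `{(0, 0, 1): 1}`, and the product of any two polynomials is again a polynomial in $q$, $p$, and $\nu$, so no convergence question arises on this subclass.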

Several lines of work address the convergence of
star products:


• The method of quantization of Kähler manifolds
due to Berezin, as the inverse of taking symbols of
operators, has been used to construct on Hermitian symmetric
spaces star products which are convergent on a
large class of functions on the manifold
(Moreno, Cahen, Gutt, and Rawnsley, Karabegov,
Schlichenmaier).

• The constructions of operator representations of
star products (Fedosov, Bordemann, Neumaier,
and Waldmann).

• The work of Rieffel and the notion of strict
deformation quantization. Examples of strict (Fréchet)
quantization have been given by Omori,
Maeda, Miyazaki, and Yoshioka, and by Bieliavsky.


Convergence of Berezin-Type Star Products 
on Hermitian Symmetric Spaces 


The method to construct a star product involves 
making a correspondence between operators and 
functions using coherent states, transferring the 
operator composition to the symbols, introducing a 
suitable parameter into this Berezin composition of 
symbols, taking the asymptotic expansion in this 
parameter on a large algebra of functions, and then 
showing that the coefficients of this expansion 
satisfy the cocycle conditions to define a star 
product on the smooth functions (Cahen et al. 
1995). The idea of an asymptotic expansion appears 
in Berezin (1975) and in Moreno and Ortega- 
Navarro (1983, 1986). 

This asymptotic expansion exists for compact $M$,
and defines an associative multiplication on formal
power series in $k^{-1}$ with coefficients in $C^\infty(M)$ for
compact coadjoint orbits. For $M$ a Hermitian
symmetric space of compact type and more generally
for compact coadjoint orbits (i.e., flag manifolds),
this formal power series converges on the
space of symbols (Karabegov 1998).

For general Hermitian symmetric spaces of non- 
compact type, using their realization as bounded 
domains, one defines an analogous algebra of 
symbols of polynomial differential operators. 

Reshetikhin and Takhtajan have constructed an 
associative formal star product given by an asymptotic
expansion on any Kähler manifold. This they
do in two steps, first building an associative product 
for which 1 is not a unit element, then passing to a 
star product. 

We denote by $(L, \nabla, h)$ a quantization bundle for
the Kähler manifold $(M, \omega, J)$ (i.e., a holomorphic
line bundle $L$ with connection $\nabla$ admitting an
invariant Hermitian structure $h$, such that the
curvature is $\mathrm{curv}(\nabla) = -2i\pi\omega$). We denote by $\mathcal{H}$ the
Hilbert space of square-integrable holomorphic
sections of $L$ which we assume to be nontrivial.
The coherent states are vectors $e_q \in \mathcal{H}$ such that


$$s(x) = \langle s, e_q \rangle\, q, \qquad \forall q \in \mathcal{L}_x,\ x \in M,\ s \in \mathcal{H}$$

($\mathcal{L}$ is the complement of the zero section in $L$). The
function $\epsilon(x) = |q|^2 \|e_q\|^2$, $q \in \mathcal{L}_x$, is well defined and
real analytic.


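Let $A : \mathcal{H} \to \mathcal{H}$ be a bounded linear operator
and let


$$\hat A(x) = \frac{\langle A e_q, e_q \rangle}{\langle e_q, e_q \rangle}, \qquad q \in \mathcal{L}_x,\ x \in M$$

be its symbol. The function $\hat A$ has an analytic
continuation to an open neighborhood of the
diagonal in $M \times M$ given by


$$\hat A(x, y) = \frac{\langle A e_{q'}, e_q \rangle}{\langle e_{q'}, e_q \rangle}, \qquad q \in \mathcal{L}_x,\ q' \in \mathcal{L}_y$$


which is holomorphic in $x$ and antiholomorphic in $y$.
We denote by $\hat E(L)$ the space of symbols of bounded
operators on $\mathcal{H}$. We can extend this definition of
symbols to some unbounded operators provided
everything is well defined.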

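The composition of operators on $\mathcal{H}$ gives rise to an
associative product $*$ for the corresponding symbols:


$$(\hat A * \hat B)(x) = \int_M \hat A(x, y)\, \hat B(y, x)\, \psi(x, y)\, \epsilon(y)\, \frac{\omega^n(y)}{n!}$$


where


$$\psi(x, y) = \frac{|\langle e_{q'}, e_q \rangle|^2}{\|e_{q'}\|^2 \|e_q\|^2}, \qquad q \in \mathcal{L}_x,\ q' \in \mathcal{L}_y$$

is a globally defined real analytic function on
$M \times M$ provided $\epsilon$ has no zeros ($\psi(x, y) \le 1$ everywhere,
with equality where the lines spanned by $e_q$
and $e_{q'}$ coincide).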

Let $k$ be a positive integer. The bundle $(L^k = \otimes^k L,
\nabla^k, h^k)$ is a quantization bundle for $(M, k\omega, J)$ and
we denote by $\mathcal{H}^k$ the corresponding space of
holomorphic sections and by $\hat E(L^k)$ the space of
symbols of linear operators on $\mathcal{H}^k$. We let $\epsilon^{(k)}$ be the
corresponding function. We say that the quantization
is regular if $\epsilon^{(k)}$ is a nonzero constant for all
non-negative $k$ and if $\psi(x,y) = 1$ implies $x = y$.
(Remark that if the quantization is homogeneous,
all $\epsilon^{(k)}$ are constants.)


Theorem 15 (Cahen et al.). Let $(M, \omega, J)$ be a
Kähler manifold and $(L, \nabla, h)$ be a regular quantization
bundle over $M$. Let $\hat A, \hat B$ be in $\mathcal{B}$, where
$\mathcal{B} \subset C^\infty(M)$ consists of functions $f$ which have an
analytic continuation in $M \times M$ so that $f(x,y)\psi(x,y)^l$
is globally defined, smooth and bounded on
$K \times M$ and $M \times K$ for each compact subset $K$ of $M$
for some positive power $l$. Then


$$(\hat A *_k \hat B)(x) = \int_M \hat A(x, y)\, \hat B(y, x)\, \psi^k(x, y)\, \epsilon^{(k)}\, k^n\, \frac{\omega^n(y)}{n!}$$

defined for $k$ sufficiently large, admits an asymptotic
expansion in $k^{-1}$ as $k \to \infty$


$$(\hat A *_k \hat B)(x) \sim \sum_{r \ge 0} k^{-r} C_r(\hat A, \hat B)(x)$$


and the cochains $C_r$ are smooth bidifferential
operators, invariant under the automorphisms of
the quantization and determined by the geometry
alone. Furthermore, $C_0(\hat A, \hat B) = \hat A \hat B$ and
$C_1(\hat A, \hat B) - C_1(\hat B, \hat A) = (i/\pi)\{\hat A, \hat B\}$.

If $M$ is a flag manifold, this defines a star product
on $C^\infty(M)$ and the $*_k$ product of two symbols is
convergent (it is a rational function of $k$ without
pole at infinity) (cf. Karabegov in that generality).

If $\mathcal{D}$ is a bounded symmetric domain and $\mathcal{E}$ the
algebra of symbols of polynomial differential operators
on a homogeneous holomorphic line bundle $L$
over $\mathcal{D}$ which gives a realization of a holomorphic
discrete series representation of $G_0$, then for $f$ and $g$
in $\mathcal{E}$ the Berezin product $f *_k g$ has an asymptotic
expansion in powers of $k^{-1}$ which converges to a
rational function of $k$. The coefficients of the
asymptotic expansion are bidifferential operators
which define an invariant and covariant star product
on $C^\infty(\mathcal{D})$.


Star Products on Cotangent Bundles 


Since from the physical point of view cotangent
bundles $\pi : T^*Q \to Q$ over some configuration space
$Q$, endowed with their canonical symplectic structure
$\omega_0$, are one of the most important phase spaces,
any quantization scheme should be tested and
exemplified for this class of classical mechanical
systems.

We first recall that on $T^*Q$ there is a canonical
vector field $\xi$, the Euler or Liouville vector field
which is locally given by $\xi = p_k(\partial/\partial p_k)$. Here and in
the following, we use local bundle coordinates
$(q^k, p_k)$ induced by local coordinates $x^k$ on $Q$.
Using $\xi$ we can characterize those functions
$f \in C^\infty(T^*Q)$ which are polynomial in the fibers of
degree $k$ by $\mathcal{L}_\xi f = kf$. They are denoted by $\mathrm{Pol}^k(T^*Q)$,
whereas $\mathrm{Pol}^\bullet(T^*Q)$ denotes the subalgebra of all
functions which are polynomial in the fibers.
Clearly, most of the physically relevant observables
such as the kinetic energy, potentials, and generators
of point transformations are in $\mathrm{Pol}^\bullet(T^*Q)$. Moreover,
$\mathrm{Pol}^\bullet(T^*Q)$ is a Poisson subalgebra with


$$\{\mathrm{Pol}^k(T^*Q), \mathrm{Pol}^l(T^*Q)\} \subseteq \mathrm{Pol}^{k+l-1}(T^*Q) \qquad [5]$$


since $\xi$ is conformally symplectic, $\mathcal{L}_\xi \omega_0 = \omega_0$.


All this suggests that for a quantization of $T^*Q$,
the polynomials $\mathrm{Pol}^\bullet(T^*Q)$ should play a crucial
role. In deformation quantization this is accomplished
by the notion of a homogeneous star product
(De Wilde and Lecomte 1983). If the operator


$$H = \nu \frac{\partial}{\partial \nu} + \mathcal{L}_\xi \qquad [6]$$


is a derivation of a formal star product $*$, then $*$ is
called homogeneous. It immediately follows that
$\mathrm{Pol}^\bullet(T^*Q)[\nu] \subseteq C^\infty(T^*Q)[[\nu]]$ is a subalgebra over
the ring $\mathbb{C}[\nu]$ of polynomials in $\nu$. Hence for
homogeneous star products, the question of convergence
(in general quite delicate) has a simple answer.

Let us now describe a simple construction of a
homogeneous star product (following Bordemann
et al. (1998)). We choose a torsion-free connection
$\nabla$ on $Q$ and consider the operator of the symmetrized
covariant derivative, locally given by


$$D = dx^k \vee \nabla_{\partial/\partial x^k} : \Gamma^\infty(S^k T^*Q) \to \Gamma^\infty(S^{k+1} T^*Q) \qquad [7]$$


Clearly, $D$ is a global object and a derivation of the
symmetric algebra $\bigoplus_{k=0}^{\infty} \Gamma^\infty(S^k T^*Q)$. Let now
$f \in \mathrm{Pol}^\bullet(T^*Q)$ and $\psi \in C^\infty(Q)$ be given. Then one
defines the standard-ordered quantization $\varrho_{\mathrm{Std}}(f)$ of $f$
with respect to $\nabla$ to be the differential operator
$\varrho_{\mathrm{Std}}(f) : C^\infty(Q) \to C^\infty(Q)$ locally given by


(44 n o Ope, OPE, 


TE. 
(uc), 1 


where $i_s$ denotes the symmetric insertion of vector
fields in symmetric forms. Again, this is independent
of the coordinate system $x^k$. The infinite sum is
actually finite as long as $f \in \mathrm{Pol}^\bullet(T^*Q)$, whence we
can safely set $\nu = i\hbar$ in this case. Indeed, [8] is the
well-known symbol calculus for differential operators
and it establishes a linear bijection


$$\varrho_{\mathrm{Std}} : \mathrm{Pol}^\bullet(T^*Q) \to \mathrm{DiffOp}(C^\infty(Q)) \qquad [9]$$


which generalizes the usual canonical quantization
in the flat case of $T^*Q = T^*\mathbb{R}^n = \mathbb{R}^{2n}$. Using this
linear bijection, we can define a new product $*_{\mathrm{Std}}$ for
$\mathrm{Pol}^\bullet(T^*Q)$ by


$$f *_{\mathrm{Std}} g = \varrho_{\mathrm{Std}}^{-1}\bigl(\varrho_{\mathrm{Std}}(f)\, \varrho_{\mathrm{Std}}(g)\bigr) = \sum_{r=0}^{\infty} \nu^r C_r(f, g) \qquad [10]$$


It is now easy to see that $*_{\mathrm{Std}}$ fulfills all requirements
of a homogeneous star product except for the fact
that the $C_r(\cdot,\cdot)$ are bidifferential. In this approach
this is far from being obvious as we only worked
with functions polynomial in the fibers so far.
Nevertheless, it is true, whence $*_{\mathrm{Std}}$ indeed defines a
star product for $C^\infty(T^*Q)[[\nu]]$.

In fact, there is a different characterization of $*_{\mathrm{Std}}$
using a slightly modified Fedosov construction: first
one uses $\nabla$ to define a torsion-free symplectic
connection on $T^*Q$ by a fairly standard lifting.
Moreover, using $\nabla$ one can define a standard-ordered
fiberwise product $\circ_{\mathrm{Std}}$ for the formal Weyl
algebra bundle over $T^*Q$, being the starting point of
the Fedosov construction of star products. With
these two ingredients one finally obtains $*_{\mathrm{Std}}$ from
the Fedosov construction with the big advantage
that now the order of differentiation in the $C_r$ can
easily be determined to be $r$ in each argument,
whence $*_{\mathrm{Std}}$ is even a natural star product. Moreover,
$C_r$ differentiates the first argument only in
momentum directions, which reflects the standard
ordering.

Already in the flat situation the standard ordering
is not an appropriate quantization scheme from the
physical point of view as it maps real-valued
functions to differential operators which are not
symmetric in general. To pose this question in a
geometric framework, we have to specify a positive
density $\mu \in \Gamma^\infty(|\Lambda^n| T^*Q)$ on the configuration space
$Q$ first, as for functions there is no invariant
meaning of integration. Specifying $\mu$ we can consider
the pre-Hilbert space $C_0^\infty(Q)$ with inner
product


$$\langle \phi, \psi \rangle = \int_Q \bar\phi\, \psi\, \mu \qquad [11]$$


Now the adjoint with respect to [11] of $\varrho_{\mathrm{Std}}(f)$ can
be computed explicitly. We first consider the
second-order differential operator


$$\Delta = \frac{\partial^2}{\partial q^k \partial p_k} + p_k \Gamma^k_{lm} \frac{\partial^2}{\partial p_l \partial p_m} + \Gamma^k_{kl} \frac{\partial}{\partial p_l} \qquad [12]$$


where $\Gamma^k_{lm}$ are the Christoffel symbols of $\nabla$. In fact,
$\Delta$ is defined independently of the coordinates and
coincides with the Laplacian of the pseudo-Riemannian
metric on $T^*Q$ which is obtained from
the natural pairing of vertical and horizontal spaces
defined by using $\nabla$. Moreover, we need the 1-form
$\alpha$ defined by $\nabla_X \mu = \alpha(X)\mu$ and the corresponding
vertical vector field $\alpha^{\mathsf{v}} \in \Gamma^\infty(T(T^*Q))$ locally given
by $\alpha^{\mathsf{v}} = \alpha_k(\partial/\partial p_k)$. Then


0sa(f)! = osa(N?f), N= 


Note that due to the curvature contributions, this 
statement is a highly nontrivial partial integration 
compared to the flat case. Note also that for 


eV 3)( Aa") [13] 


$f \in \mathrm{Pol}^\bullet(T^*Q)[\nu]$, we have $Nf \in \mathrm{Pol}^\bullet(T^*Q)[\nu]$ as
well, and $N$ commutes with $H$. As in the flat case
this allows one to define a Weyl-ordered quantization
by


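$$\varrho_{\mathrm{Weyl}}(f) = \varrho_{\mathrm{Std}}(Nf) \qquad [14]$$


together with a so-called Weyl-ordered star product


$$f *_{\mathrm{Weyl}} g = N^{-1}(Nf *_{\mathrm{Std}} Ng) \qquad [15]$$


which is now a Hermitian and homogeneous star
product such that $\varrho_{\mathrm{Weyl}}$ becomes a $*$-representation of
$*_{\mathrm{Weyl}}$, that is, we have
$\varrho_{\mathrm{Weyl}}(f *_{\mathrm{Weyl}} g) = \varrho_{\mathrm{Weyl}}(f)\, \varrho_{\mathrm{Weyl}}(g)$
and $\varrho_{\mathrm{Weyl}}(f)^\dagger = \varrho_{\mathrm{Weyl}}(\bar f)$. Note that in the
flat case this is precisely the Moyal star product $*_M$
from [1].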
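
The flat-case statement can be tested on polynomials. The sketch below works on $T^*\mathbb{R}$ with coordinates $(q, p)$ and assumes the following normalizations, which are fixed by hand here and may differ from the article's by signs: $\{q, p\} = 1$, the standard-ordered product $f *_{\mathrm{Std}} g = \sum_r \frac{(-\nu)^r}{r!}\, \partial_p^r f\, \partial_q^r g$, and $N = \exp(-\frac{\nu}{2}\, \partial_q \partial_p)$ (the flat form of $\Delta$, with $\alpha = 0$). All function names are hypothetical helpers for this illustration.

```python
from fractions import Fraction
from math import comb, factorial

# Polynomials in (q, p, nu) are dicts {(i, j, n): c} meaning c * q^i * p^j * nu^n.

def add(u, v):
    out = dict(u)
    for key, c in v.items():
        out[key] = out.get(key, 0) + c
        if out[key] == 0:
            del out[key]
    return out

def scale(u, c, nu_shift=0):
    return {(i, j, n + nu_shift): c * a for (i, j, n), a in u.items() if c * a != 0}

def mul(u, v):
    out = {}
    for (i, j, n), a in u.items():
        for (k, l, m), b in v.items():
            key = (i + k, j + l, n + m)
            out[key] = out.get(key, 0) + a * b
    return {key: c for key, c in out.items() if c != 0}

def diff(u, dq, dp):
    """Apply (d/dq)^dq (d/dp)^dp to u."""
    out = {}
    for (i, j, n), a in u.items():
        if i >= dq and j >= dp:
            factor = (factorial(i) // factorial(i - dq)) * (factorial(j) // factorial(j - dp))
            out[(i - dq, j - dp, n)] = a * factor
    return out

def std(u, v):
    """Standard-ordered product: sum_r (-nu)^r / r! (d_p^r u)(d_q^r v)  (assumed sign)."""
    degree = max((j for (_, j, _) in u), default=0)
    out = {}
    for r in range(degree + 1):
        coeff = Fraction((-1) ** r, factorial(r))
        out = add(out, scale(mul(diff(u, 0, r), diff(v, r, 0)), coeff, nu_shift=r))
    return out

def n_op(u, sign):
    """Apply exp(sign * (nu/2) d_q d_p); sign=-1 is N, sign=+1 is its inverse."""
    degree = max((min(i, j) for (i, j, _) in u), default=0)
    out = {}
    for m in range(degree + 1):
        coeff = Fraction(sign ** m, 2 ** m * factorial(m))
        out = add(out, scale(diff(u, m, m), coeff, nu_shift=m))
    return out

def weyl(u, v):
    """Weyl-ordered product N^{-1}(N u *_Std N v)."""
    return n_op(std(n_op(u, -1), n_op(v, -1)), +1)

def moyal(u, v):
    """Moyal product exp((nu/2)(d_q x d_p - d_p x d_q)) for {q, p} = 1."""
    degree = max((i + j for (i, j, _) in u), default=0)
    out = {}
    for r in range(degree + 1):
        for s in range(r + 1):
            coeff = Fraction((-1) ** s * comb(r, s), 2 ** r * factorial(r))
            out = add(out, scale(mul(diff(u, r - s, s), diff(v, s, r - s)), coeff, nu_shift=r))
    return out
```

Under these assumptions `weyl(f, g)` and `moyal(f, g)` agree for polynomial `f`, `g`. Homogeneity is also visible: if `f` and `g` have fiber degrees $k$ and $l$, every term of `std(f, g)` has $p$-degree plus $\nu$-degree equal to $k + l$.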

The star products $*_{\mathrm{Std}}$ and $*_{\mathrm{Weyl}}$ have been
extensively studied by Bordemann, Neumaier,
Pflaum, and Waldmann and provide now a well-understood
quantization on cotangent bundles. We
summarize a few highlights of this theory:


1. In the particular case of a Levi-Civita connection
$\nabla$ for some Riemannian metric $g$ and the
corresponding volume density $\mu_g$, the 1-form
$\alpha$ vanishes. This simplifies the operator $N$ and
describes the physically most interesting situation.

2. If the configuration space is a Lie group $G$, then its
cotangent bundle $T^*G \cong G \times \mathfrak{g}^*$ is trivial by using,
for example, left-invariant 1-forms. In this case the
star products $*_{\mathrm{Weyl}}$ and $*_{\mathrm{Std}}$ restrict to the CBH star
product on $\mathfrak{g}^*$. Moreover, $*_{\mathrm{Weyl}}$ coincides with the
star product found by Gutt (1983) on $T^*G$.

3. Using the operator $N$ one can interpolate between
the two different ordering descriptions $\varrho_{\mathrm{Std}}$ and
$\varrho_{\mathrm{Weyl}}$ by inserting an additional ordering parameter
K in the exponent, that is, Ny = exp(v&(A + a")). 
Thus, one obtains $\kappa$-ordered representations $\varrho_\kappa$
together with corresponding $\kappa$-ordered star products
$*_\kappa$, where $\kappa = 0$ corresponds to standard
ordering and $\kappa = 1/2$ corresponds to Weyl ordering.
For $\kappa = 1$, one obtains antistandard ordering
and in general one has the relation
$\overline{f *_\kappa g} = \bar g *_{1-\kappa} \bar f$
as well as $\varrho_\kappa(f)^\dagger = \varrho_{1-\kappa}(\bar f)$.

4. One can describe also the quantization of an
electrically charged particle moving in a magnetic
background field $B$. This is modeled by a closed
2-form $B \in \Gamma^\infty(\Lambda^2 T^*Q)$ on $Q$. Using local vector
potentials $A \in \Gamma^\infty(T^*Q)$ with $B = dA$ locally, and
by minimal coupling, one obtains a star product
$*_B$ which depends only on $B$ and not on the local
potentials $A$. It will be equivalent to $*_{\mathrm{Weyl}}$ if and
only if $B$ is exact. In general, its characteristic
class is, up to a factor, given by the class $[B]$ of
the magnetic field $B$. While the observable
algebra always exists, a Schrödinger-like representation
of $*_B$ only exists if $B$ satisfies the usual
integrality condition. In this case, there exists a
representation on sections of a line bundle whose
first Chern class is given by $[B]$. This manifests
Dirac's quantization condition for magnetic
charges in deformation quantization. Another
equivalent interpretation of this result is obtained
by Morita theory: the star products $*_{\mathrm{Weyl}}$ and $*_B$
are Morita equivalent if and only if $B$ satisfies
Dirac's integrality condition.

5. Analogously, one can determine the unitary 
equivalence classes of representations for a fixed, 
exact magnetic field B. It turns out that the 
representations depend on the choice of the global 
vector potential A and are unitarily equivalent if 
the difference between the two vector potentials 
satisfies an integrality condition known from the 
Aharonov-Bohm effect. This way, the Aharonov- 
Bohm effect can be formulated within the repre- 
sentation theory of deformation quantization. 

6. There are several variations of the representations
$\varrho_{\mathrm{Std}}$ and $\varrho_{\mathrm{Weyl}}$. In particular, one can
construct a representation on half-forms instead
of functions, thereby avoiding the choice of the
integration density $\mu$. Moreover, all the Weyl-ordered
representations can be understood as
GNS representations coming from a particular
positive functional, the Schrödinger functional.
For $\varrho_{\mathrm{Weyl}}$, this functional is just the integration
over the configuration space $Q$.

7. All the (formal) star products and their representations
can be understood as coming from formal
asymptotic expansions of integral formulas. From
this point of view, the formal representations and
star products are a particular kind of global
symbol calculus.

8. At least for a projectible Lagrangian submanifold 
L of T*Q, one finds representations of the star 
product algebras on the functions on L. This 
leads to explicit formulas for the WKB expansion 
corresponding to this Lagrangian submanifold. 

9. The relation between configuration space symme- 
tries, the corresponding phase-space reduction, 
and the reduced star products has been analyzed 
extensively by Kowalzig, Neumaier, and Pflaum. 


See also: Classical r-Matrices, Lie Bialgebras, and 
Poisson Lie Groups; Deformation Quantization; 
Deformation Quantization and Representation Theory; 
Deformation Theory; Fedosov Quantization; Operads. 
Introduction


The $\bar\partial$-approach is one of the most generic methods
for constructing solutions of completely integrable
systems. Taking into account that most soliton
systems are represented as compatibility conditions
for a set of linear differential operators (Lax pairs,
zero-curvature representations, L–A–B Manakov
triples), it is sufficient to construct these operators.
Such compatible families can be defined by presenting
their common eigenfunctions. If it is possible to
show that some analytic constraints imply that a
function is a common eigenfunction of a family of
operators, solutions of the original nonlinear system are
also generated.

The main idea of the $\bar\partial$ method is to impose the
following analytic constraints: if $\lambda$ denotes the
spectral parameter and $x$ the physical variables,
then, for arbitrary fixed values $x$, the $\partial_{\bar\lambda}$ derivative
of the wave function is expressed as a linear
combination of the wave functions at other values
of $\lambda$ with $x$-independent coefficients. In specific
examples, this property is either derived from the
direct spectral transform or imposed a priori. Of
course, the specific realization of this scheme
depends critically on the nonlinear system.

The origin of the $\bar\partial$-method came from
the following observation. A solution of the
one-dimensional inverse-scattering problem (the
problem of reconstructing the potential by discrete
spectrum and scattering amplitude at positive
energies) for the one-dimensional time-independent
Schrödinger operator


$$L = -\partial_x^2 + u(x) \qquad [1]$$


was obtained by Gelfand, Levitan, and Marchenko
in the 1950s. It essentially used analytic continuation
of the wave function from the real momenta to
the complex ones. If the potential $u(x)$ decays
sufficiently fast as $|x| \to \infty$, then the eigenfunction
equation


$$L\Psi(k, x) = k^2 \Psi(k, x) \qquad [2]$$


has two solutions $\Psi_+(k, x)$ and $\Psi_-(k, x)$ such that


1. $\Psi_\pm(k, x) = (1 + o(1))\,e^{ikx}$ as $x \to \pm\infty$.

2. The functions $\Psi_+(k, x)$, $\Psi_-(k, x)$ are holomorphic
in $k$ in the upper half-plane and the lower half-plane,
respectively.


Existence of analytic continuation to complex
momenta is typical for one-dimensional systems. But
in the multidimensional case the situation is different.
For example, wave functions for the multidimensional
Schrödinger operator constructed by
Faddeev are well defined for all complex momenta
$k$, but they are nonholomorphic in $k$, and they
become holomorphic only after restriction to some
special one-dimensional subspaces. The last property
was one of the key points in the Faddeev approach.

Beals and Coifman (1981-82) and Ablowitz et al.
(1983) discovered that departure from holomorphicity
for multidimensional wave functions can be
interpreted as spectral data. Such spectral transforms
proved to be very natural and suit perfectly
the purposes of the soliton theory. Some other
famous methods, including the Riemann-Hilbert
problem, can be interpreted as special reductions of
the $\bar\partial$ method.


Nonlocal $\bar\partial$-Problem and Local $\bar\partial$-Problem


The most generic formulation of the $\bar\partial$-method is the
nonlocal $\bar\partial$-problem. Assume that the following data
are given:


1. A rational $n \times n$ matrix-valued normalization
function $\eta(\lambda)$.

2. An $n \times n$ matrix-valued function $g(\lambda, x)$ (it
describes the dynamics) such that


• $g(\lambda, x)$ depends on the spectral parameter $\lambda \in \mathbb{C}$
and "physical" variables $x = (x_1, \ldots, x_N)$;
the physical variables $x_k$ are either continuous
($x_k$ belongs to a domain in $\mathbb{R}$ or in $\mathbb{C}$) or
discrete ($x_k$ takes integer values);

• $g(\lambda, x)$ is analytic in $\lambda$, defined for all $\lambda \in \mathbb{C}$
except for a finite number of singular points,
and is single valued; and

• $\det g(\lambda, x)$ has only a finite number of zeros.


For problems with continuous physical variables the
typical form of $g(\lambda, x)$ is $g(\lambda, x) = \exp(\sum_j x_j K_j(\lambda))$,
where $K_j(\lambda)$ are meromorphic matrices, mutually
commuting for all $\lambda$. The discrete variables are
usually encoded in orders of poles and zeros.

3. An $n \times n$ matrix-valued function $R(\lambda, \mu)$, the
"generalized spectral data." Usually, it is a regular
function of four real variables $\Re\lambda$, $\Im\lambda$, $\Re\mu$, $\Im\mu$. (We
write this as a function of two complex variables,
but we do not assume it to be holomorphic. It
would be more precise to write it as $R(\lambda, \bar\lambda, \mu, \bar\mu)$,
but to avoid long notations we omit the $\bar\lambda$, $\bar\mu$
dependence.) To avoid analytical complications,
the function $R(\lambda, \mu)$ is usually assumed to vanish
as $\lambda$ or $\mu$ tend to singular points of $\eta(\lambda)$, $g(\lambda, x)$.


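Then the wave function $\Psi$ is defined by the data
using the following properties:


1. $\Psi = \Psi(\lambda, x)$ takes values in complex $n \times n$
matrices:

$$\Psi(\lambda, x) = \begin{pmatrix} \Psi_{11}(\lambda, x) & \cdots & \Psi_{1n}(\lambda, x) \\ \vdots & \ddots & \vdots \\ \Psi_{n1}(\lambda, x) & \cdots & \Psi_{nn}(\lambda, x) \end{pmatrix} \qquad [3]$$


2. For all $\lambda \in \mathbb{C}$ outside the singular points of
$\eta(\lambda)$, $g(\lambda, x)$, the wave function $\Psi$ satisfies the $\bar\partial$-equation
of the inverse-spectral problem,


$$\frac{\partial\Psi(\lambda, x)}{\partial\bar\lambda} = \iint_{\mu \in \mathbb{C}} d\mu \wedge d\bar\mu \, \Psi(\mu, x) R(\lambda, \mu) \qquad [4]$$

It is important that condition [4] is $x$-independent.

3. The function $\chi(\lambda, x) - \eta(\lambda)$, where $\chi(\lambda, x) = \Psi(\lambda, x) g^{-1}(\lambda, x)$,
is regular for all $\lambda \in \mathbb{C}$ and


$$\chi(\lambda, x) = \eta(\lambda) + o(1) \quad \text{as } |\lambda| \to \infty \qquad [5]$$
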
The wave function $\Psi(\lambda, x)$ is calculated by
employing the data $\eta(\lambda)$, $g(\lambda, x)$, $R(\lambda, \mu)$ using the
following procedure. Taking into account that the
functions $\eta(\lambda)$, $g(\lambda, x)$ are holomorphic in $\lambda$, eqn [4]
can be rewritten in terms of $\chi(\lambda, x)$:


$$\frac{\partial\chi(\lambda, x)}{\partial\bar\lambda} = \iint_{\mu \in \mathbb{C}} d\mu \wedge d\bar\mu \, \chi(\mu, x) g(\lambda, x) R(\lambda, \mu) g^{-1}(\mu, x) \qquad [6]$$


The right-hand side of [6] is regular; therefore, this 
relation is valid for all complex A values. 

Equation [6] with the boundary condition [5] is 
equivalent to the following integral equation: 


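$$\chi(\lambda, x) = \eta(\lambda) + \frac{1}{2\pi i} \iint_{\nu \in \mathbb{C}} \frac{d\nu \wedge d\bar\nu}{\nu - \lambda} \iint_{\mu \in \mathbb{C}} d\mu \wedge d\bar\mu \, \chi(\mu, x) g(\nu, x) R(\nu, \mu) g^{-1}(\mu, x) \qquad [7]$$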


This equation can be derived using the generalized 
Cauchy formula. Let f(z) be a smooth (not necessa- 
rily holomorphic) function in a bounded domain D 
in the complex plane. Then 


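$$f(z) = \frac{1}{2\pi i} \oint_{\partial D} \frac{f(w)\, dw}{w - z} + \frac{1}{2\pi i} \iint_{D} \frac{\partial f(w)}{\partial\bar w} \frac{dw \wedge d\bar w}{w - z} \qquad [8]$$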


If the kernel $g(\lambda, x) R(\lambda, \mu) g^{-1}(\mu, x)$ is
sufficiently good (e.g., it is sufficient to assume that
$(1 + |\lambda|)^{2+\epsilon} g(\lambda, x) R(\lambda, \mu) g^{-1}(\mu, x) (1 + |\mu|)^{2+\epsilon}$, $\epsilon > 0$, is
a continuous function at both finite and infinite
points), then we have a Fredholm equation (the
operator on the right-hand side of [7] is compact).
If it has no unit eigenvalues, eqn [7] is uniquely 
solvable. But, for some values of x, one of the 
eigenvalues may become equal to 1, and it results 
in singularities of solutions. 
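
The solvability statement can be illustrated on a toy model. The sketch below is only a stand-in for [7] under stated assumptions: the planar integral operator is replaced by a hypothetical smooth kernel on the interval [0, 1], discretized by the midpoint rule. It compares the iteration chi <- eta + K chi (which converges because the discretized operator has norm below 1) with a direct solve of (I - K) chi = eta. The kernel, the grid size, and all function names are illustrative and are not taken from the article.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x

def neumann(kmat, eta, iters=200):
    """Iterate chi <- eta + K chi; converges when the norm of K is below 1."""
    chi = eta[:]
    for _ in range(iters):
        chi = [eta[i] + sum(kmat[i][j] * chi[j] for j in range(len(eta)))
               for i in range(len(eta))]
    return chi

n = 40
h = 1.0 / n
pts = [(i + 0.5) * h for i in range(n)]
# Hypothetical smooth kernel; the factor 0.5 keeps every row sum at most 0.5.
kmat = [[0.5 * h * math.exp(-(s - t) ** 2) for t in pts] for s in pts]
eta = [1.0] * n

chi_iter = neumann(kmat, eta)
# Direct solve of (I - K) chi = eta for comparison.
a = [[(1.0 if i == j else 0.0) - kmat[i][j] for j in range(n)] for i in range(n)]
chi_direct = solve_linear(a, eta)

row_norm = max(sum(abs(v) for v in row) for row in kmat)
max_diff = max(abs(u - v) for u, v in zip(chi_iter, chi_direct))
print(row_norm, max_diff)
```

If the factor 0.5 in the kernel is increased until an eigenvalue of K reaches 1, the direct solve becomes ill-conditioned, which is the discrete counterpart of the singularities mentioned above.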

If the norm of the integral operator is smaller than
1, eqn [7] is uniquely solvable. To generate solutions
that are regular for all values of physical variables, it
is natural to restrict the class of admissible spectral
data by assuming the kernel $g(\lambda, x) R(\lambda, \mu) g^{-1}(\mu, x)$
to be bounded in $x$ for all $\lambda$, $\mu$. In the scalar case
$n = 1$, this restriction implies:


$$R(\lambda, \mu) = 0 \text{ for all } \lambda, \mu \text{ such that } g(\lambda, x) g^{-1}(\mu, x) \text{ is unbounded in } x \qquad [9]$$


For specific examples like the Kadomtsev-Petviashvili-II
(KP-II) equation, the direct scattering transform automatically
generates spectral data satisfying [9]. In KP-II, [9]
implies


$$R(\lambda, \mu) = A(\lambda)\delta(\lambda - \mu) + T(\lambda)\delta(\bar\lambda - \mu) \qquad [10]$$


The coefficient $A(\lambda)$ can be eliminated by multiplying
the wave function by an appropriate function
of $\lambda$; therefore, in standard texts, $A(\lambda) = 0$.

If for every $\lambda$ the kernel $R(\lambda, \mu)$ is equal to 0
everywhere except at a finite number of points
$\mu_1(\lambda), \ldots, \mu_k(\lambda)$, one has the so-called local
$\bar\partial$-problem. Such kernels are rather typical.


Examples of Soliton Systems Integrable
by the $\bar\partial$-Problem Method


Let us discuss some important examples. 


The KP-II Hierarchy 


The first nontrivial equation from the KP hierarchy 
has the following form: 


$$(u_t + 6uu_x - u_{xxx})_x = 3\sigma^2 u_{yy} \qquad [11]$$


From a physical point of view, the case of real $\sigma^2$ is
the most interesting one. Equation [11] is called
KP-I if $\sigma^2 = -1$ and KP-II if $\sigma^2 = +1$. The Lax pair
for KP-II reads:


$$[L, A] = 0$$
where


$$L = \partial_y - \partial_x^2 + u(x, y, t)$$
$$A = \partial_t - 4\partial_x^3 + 6u(x, y, t)\partial_x + 3u_x(x, y, t) + 3w(x, y, t) \qquad [12]$$


The Cauchy problem for initial data $u(x, y, 0)$
decaying at infinity is solved by using the nonlocal
Riemann problem for KP-I and the local $\bar\partial$-problem for
KP-II. The wave function is assumed to be scalar
valued ($n = 1$), and the $\bar\partial$-equation [4] takes the following
form:


$$\frac{\partial\Psi(\lambda, x, y, t)}{\partial\bar\lambda} = T(\lambda)\Psi(\bar\lambda, x, y, t) \qquad [13]$$


The wave function $\Psi(\lambda, x, y, t)$ is assumed to be
regular for finite $\lambda$ and to have the following
essential singularity as $\lambda \to \infty$:


$$\Psi(\lambda, x, y, t) = \exp(\lambda x + \lambda^2 y + 4\lambda^3 t)(1 + o(1)) \qquad [14]$$


Equivalently, $\eta(\lambda) = 1$ and the function $g(\lambda, x, t)$ has
one essential singularity at $\lambda = \infty$,


$$g(\lambda, x, t) = \exp(\lambda x + \lambda^2 y + 4\lambda^3 t) \qquad [15]$$
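
As a quick sanity check on [15], the following sketch verifies numerically that the exponential satisfies the Lax equations [12] with $u = w = 0$, that is, $(\partial_y - \partial_x^2)g = 0$ and $(\partial_t - 4\partial_x^3)g = 0$, for a complex value of $\lambda$. Derivatives are approximated by central finite differences; the sample point, the value of $\lambda$, and the step sizes are arbitrary choices made here, not values from the article.

```python
import cmath

def g(lam, x, y, t):
    """Free wave function exp(lam*x + lam^2*y + 4*lam^3*t), as in [15]."""
    return cmath.exp(lam * x + lam**2 * y + 4 * lam**3 * t)

def d1(f, v, h=1e-3):
    """Central finite-difference approximation of the first derivative."""
    return (f(v + h) - f(v - h)) / (2 * h)

def d2(f, v, h=1e-3):
    """Central finite-difference approximation of the second derivative."""
    return (f(v + h) - 2 * f(v) + f(v - h)) / h**2

def d3(f, v, h=1e-2):
    """Central finite-difference approximation of the third derivative."""
    return (f(v + 2 * h) - 2 * f(v + h) + 2 * f(v - h) - f(v - 2 * h)) / (2 * h**3)

lam = complex(0.7, 0.4)          # arbitrary complex spectral parameter
x0, y0, t0 = 0.3, -0.2, 0.1      # arbitrary sample point

# Residual of (d/dy - d^2/dx^2) g = 0.
res_L = d1(lambda y: g(lam, x0, y, t0), y0) - d2(lambda x: g(lam, x, y0, t0), x0)
# Residual of (d/dt - 4 d^3/dx^3) g = 0.
res_A = d1(lambda t: g(lam, x0, y0, t), t0) - 4 * d3(lambda x: g(lam, x, y0, t0), x0)

scale = abs(g(lam, x0, y0, t0))
print(abs(res_L) / scale, abs(res_A) / scale)
```

Both relative residuals are limited only by the finite-difference truncation error, and the check works for any complex $\lambda$, which is why $g$ can carry the whole $(x, y, t)$ dependence of the wave function.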


Higher times $t_k$ from the KP hierarchy are
incorporated into this scheme by assuming that


$$g(\lambda, t) = \exp\left(\sum_{k=1}^{\infty} t_k \lambda^k\right) \qquad [16]$$


Here $x = t_1$, $y = t_2$, $4t = t_3$.

Equation [13] was originally derived (Ablowitz 
et al. 1983) from the direct spectral transform. If the 
potential $u(x, y)$ is sufficiently small and
$u(x, y) = O(1/(x^2 + y^2)^{1+\epsilon})$ for $x^2 + y^2 \to \infty$, then
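the wave function $\Psi(\lambda, x, y)$ for the L-operator [12]


$$L\Psi(\lambda, x, y) = 0$$
$$\Psi(\lambda, x, y) = \exp(\lambda x + \lambda^2 y)[1 + o(1)] \quad \text{for } x^2 + y^2 \to \infty \qquad [17]$$


can be constructed by solving the following integral
equation for the pre-exponent
$\chi(\lambda, x, y) = \Psi(\lambda, x, y) \exp(-\lambda x - \lambda^2 y)$:


$$\chi(\lambda, x, y) = 1 - \iint G(\lambda, x - x', y - y') u(x', y') \chi(\lambda, x', y') \, dx' \, dy' \qquad [18]$$


where the Green function $G(\lambda, x, y)$ is given by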


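$$G(\lambda, x, y) = \frac{1}{(2\pi)^2} \iint \frac{e^{i(p_x x + p_y y)}}{p_x^2 - 2i\lambda p_x + i p_y} \, dp_x \, dp_y \qquad [19]$$


It is not holomorphic in $\lambda$, but


$$\frac{\partial G(\lambda, x, y)}{\partial\bar\lambda} = -\frac{i}{2\pi} \operatorname{sgn}(\Im\lambda) \, e^{-2i \Im\lambda \, x - 4i \Re\lambda \, \Im\lambda \, y} \qquad [20]$$
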
The nonholomorphicity of $G(\lambda, x, y)$ results in
the special nonholomorphicity of $\Psi(\lambda, x, y)$ of the
form [13].


Remark We see that one function of two real
variables, $T(\lambda)$, is sufficient to solve the Cauchy
problem in the plane. But it is also possible to
construct solutions of KP-II starting from generic
nonlocal kernels $R(\lambda, \mu)$ (to guarantee at least local
existence of solutions, it is enough to assume that
$R(\lambda, \mu)$ is small and has finite support). It looks
like a paradox, but the situation is exactly the 
same in the linear case. In the standard Fourier 
method, only exponents with real momenta are 
used, but local solutions can be constructed as 
combinations of exponents with arbitrary complex 
momenta. 


Novikov-Veselov Hierarchy and Two-Dimensional
Schrödinger Operator


Equations from this hierarchy admit representation 
in terms of Manakov L-A-B triples, 
$$\frac{\partial L}{\partial t_n} = [A_n, L] + B_n L \qquad [21]$$


where

$$L = -4\partial_z\partial_{\bar z} + u(z, t), \qquad A_n = 2^{2n+1}\left(\partial_z^{2n+1} + \partial_{\bar z}^{2n+1}\right) + \cdots \qquad [22]$$


The order of $B_n$ is smaller than $2n + 1$. In particular,
for $n = 1$,


$$A_1 = 8(\partial_z^3 + \partial_{\bar z}^3) + 2(w\partial_z + \bar w\partial_{\bar z}), \qquad B_1 = w_z + \bar w_{\bar z} \qquad [23]$$


$$u_t = 8\partial_z^3 u + 8\partial_{\bar z}^3 u + 2\partial_z(uw) + 2\partial_{\bar z}(u\bar w) \qquad [24]$$

where


$$u(z, t) = \overline{u(z, t)}, \qquad \partial_{\bar z} w(z, t) = -3\partial_z u(z, t) \qquad [25]$$


$$w(z) \to 0 \quad \text{for } |z| \to \infty \qquad [26]$$


This hierarchy is integrated using the scattering
transform at zero energy for the two-dimensional
Schrödinger operator $L$. If Cauchy data with the
asymptotics


$$u(z) \to -E_0 \quad \text{as } |z| \to \infty$$


are considered, the scattering transform for the
operator $\tilde L = L + E_0$ with the potential $\tilde u(z) = u(z) + E_0$,
decaying at infinity, is used at the fixed energy $E_0$.
In fact, the fixed-energy scattering problem is one of
the basic problems of mathematical physics, and the
Novikov-Veselov hierarchy can be treated as an
infinite-dimensional Abelian symmetry algebra for this
problem. The scattering transform essentially depends
on the sign of $E_0$. The case $E_0 = 0$, studied by Boiti,
Leon, Manna, and Pempinelli, is the most complicated
from the analytic point of view, and we do not
discuss it here.

If $E_0 < 0$, the wave function satisfies a pure local
$\bar\partial$-relation:


$$\frac{\partial\Psi(\lambda, z)}{\partial\bar\lambda} = T(\lambda)\Psi\left(\frac{1}{\bar\lambda}, z\right) \qquad [27]$$


with $\eta(\lambda) = 1$, and


$$g(\lambda, z) = e^{(\kappa/2)(\lambda\bar z + z/\lambda)}, \qquad \kappa^2 = -E_0 \qquad [28]$$


Starting from generic spectral data $T(\lambda)$, one obtains
a fixed-energy eigenfunction for a second-order
operator,


$$L\Psi(\lambda, z) = E_0\Psi(\lambda, z)$$
$$L = -4\partial_z\partial_{\bar z} + V(z)\partial_z + u(z) \qquad [29]$$


To generate pure potential operators ($V(z) = 0$), it is
necessary to impose additional symmetry constraints
on the spectral data (see the section "Reductions on
the $\bar\partial$-Data").
If $E_0 > 0$, there are two types of generalized
spectral data: $\bar\partial$-data and nonlocal Riemann
problem data. The wave function satisfies the
$\bar\partial$-relation:


$$\frac{\partial\Psi(\lambda, z)}{\partial\bar\lambda} = T(\lambda)\Psi\left(-\frac{1}{\bar\lambda}, z\right), \quad |\lambda| \neq 1 \qquad [30]$$

and has a jump at the unit circle $|\lambda| = 1$. The
boundary values $\Psi_\pm(\lambda, z) = \Psi(\lambda(1 \mp 0), z)$ are
connected by the following relation:


$$\Psi_+(\lambda, z) = \Psi_-(\lambda, z) + \oint_{|\mu| = 1} R(\lambda, \mu)\Psi_-(\mu, z)\,|d\mu| \qquad [31]$$


$$g(\lambda, z) = e^{(i\kappa/2)(\lambda\bar z + z/\lambda)}, \qquad \kappa^2 = E_0 \qquad [32]$$

Constraints on the spectral data associated with
pure potential operators were found by Novikov
and Grinevich for $R(\lambda, \mu)$, and by Manakov and
Grinevich for $T(\lambda)$. The existence of two different types
of generalized scattering data has a very transparent
physical meaning: there is a one-to-one correspondence
between the classical scattering amplitude at
energy $E_0$ and the nonlocal Riemann problem data
$R(\lambda, \mu)$. The $\bar\partial$-data $T(\lambda)$ can be treated as a
complete set of additional parameters enumerating
all potentials with a given scattering amplitude at
one energy.


Davey-Stewartson-II and Ishimori-I Equations


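The Davey-Stewartson-II (DS-II) equation


$$i q_t + 2(\partial_z^2 + \partial_{\bar z}^2) q + (g + \bar g) q = 0 \qquad [33]$$


$$\partial_{\bar z} g = -\kappa\,\partial_z |q|^2 \qquad [34]$$


$$q = q(z, t), \quad g = g(z, t), \quad z = x + iy \qquad [35]$$


can be treated as an integrable (2 + 1)-dimensional
extension of the nonlinear Schrödinger equation. The
Ishimori-I equation


$$\partial_t S + S \times (\partial_x^2 S + \partial_y^2 S) + \partial_x w\,\partial_y S + \partial_y w\,\partial_x S = 0 \qquad [36]$$
$$\partial_x^2 w - \partial_y^2 w + 2S \cdot (\partial_x S \times \partial_y S) = 0 \qquad [37]$$


$$S = S(x, y, t), \quad S = (S_1, S_2, S_3), \quad S^2 = 1 \qquad [38]$$


is an integrable (2 + 1)-dimensional extension of the
Heisenberg magnetic equation. Both systems are
integrated by using the following zero-curvature
representation: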


b Jes "9. [39] 
0 ð, 2i&q(z t) 0 


2i0 + ig iq; — iqÓ; 
-ikj,-Likqü, —2i02 — ig 


The wave function satisfies the following "scattering"
equation:


a% 0 0 Kb(k) 
Cue Te e 
0 dO b(k) 0 


Here $\Psi^T$ denotes the transposed matrix. Let us point
out the amazing symmetry between the direct and 
inverse transforms. 


Discrete Systems 


In the examples discussed above, continuous variables
are "encoded" in essential singularities of
$g(\lambda, x)$. Discrete variables correspond to orders of
zeros and poles. For example, assuming that the
function $g(\lambda, t)$ in the KP integration scheme
depends on extra continuous variables $t_{-1}, t_{-2}, \ldots,$
$t_{-k}, \ldots$ and a discrete variable $t_0 = n$,


$$g(\lambda, t) = \lambda^n \exp\left(\sum_{k \neq 0} t_k \lambda^k\right) \qquad [42]$$


one obtains solutions of the so-called two-dimensional
Toda-KP hierarchy.

Assume that we have a nonlocal $\bar\partial$-problem for a
scalar function with $\eta = 1$ and


k E "nj 
g(A.m,.. . , Np) = (+= ax) |43] 
j=1 


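The wave function defines a map $\mathbb{Z}^k \to \mathbb{C}^N$,


$$(n_1, \ldots, n_k) \to (\Psi(\lambda_1, n_1, \ldots, n_k), \ldots, \Psi(\lambda_N, n_1, \ldots, n_k)) \qquad [44]$$


where $\lambda_1, \ldots, \lambda_N$ are some points in $\mathbb{C}$. This
construction generates the so-called quadrilateral
lattices (each two-dimensional face is planar).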


Multidimensional Problems 


The $\bar\partial$-approach can also be applied to multidimensional
inverse-scattering problems, but typically the
scattering data are overdetermined and satisfy
additional nonlinear compatibility conditions. For
example, the Faddeev wave functions for the
n-dimensional stationary Schrödinger operator


$$(-\partial_x^2 + u(x))\Psi(k, x) = (k \cdot k)\Psi(k, x) \qquad [45]$$
$$\Psi(k, x) = e^{ik \cdot x}(1 + o(1)) \qquad [46]$$


in the nonphysical domain $k_I \neq 0$ ($k_R$ and $k_I$ denote
the real and imaginary parts of $k$, respectively)
satisfy the following $\bar\partial$-equation:


OW(k,x) | ss. | 
SE M ML LII 
x &(E -€ + 2k - £)dés --- dén 47 


The characterization of admissible spectral data
$b(k, l)$, $k \in \mathbb{C}^n$, $l \in \mathbb{R}^n$, is based on the following
compatibility equation:


Ob(k,l) | 10b(k,I) 


Ok; 2 ol; 
ide | Eb(k, kr +E)blk +E, 1) 
eR” 
x 6(E-E+2k-€)dé --- dé, 48] 


More details can be found in Novikov and Henkin 
(1987). 


Reductions on the $\bar\partial$-Data


The most generic $\bar\partial$-data usually result in solutions
from the wrong functional class (they may, e.g., be
complex or singular), or extra constraints on the
auxiliary linear operators are necessary to obtain
solutions of the zero-curvature representation. For
example, to obtain real KP-II solutions using the
local $\bar\partial$-problem [13], the following reduction on the
$\bar\partial$-data should be imposed:


T(À) = - T(A) [49] 


This reduction can be easily derived from the direct transform.
But this is not always the case. For example, the
selection of pure potential two-dimensional
Schrödinger operators originally was not so evident.
To formulate the answer, it is convenient to
introduce a new function $b(\lambda)$, $T(\lambda) = b(\lambda)\,\operatorname{sgn}(\lambda\bar\lambda - 1)/\bar\lambda$.

For $E_0 < 0$, the following constraints select real
potential operators:


b(- 3 = b(A), (5) =b) [50 


In some situations, the problem of finding appropriate
reductions is the most difficult part of the
integration procedure. This is true not only for the
$\bar\partial$-approach, but also for other techniques including
the finite-gap method. For example, the inverse-spectral
transform for the two-dimensional
Schrödinger operator was first developed for
finite-gap (quasiperiodic) potentials and only later
for the decaying ones. For operators with finite gap
at one energy the first-order terms were constructed
by Dubrovin, Krichever, and Novikov in 1976, but
the potentiality reduction was found only in 1984 by
Novikov and Veselov.


Nonsingular Solutions 


As mentioned above, one can construct regular
solutions by choosing sufficiently small (in an
appropriate norm) scattering data. But for some
special systems the regularity follows automatically
from reality reductions. For example, for arbitrarily
large $\bar\partial$-data, real KP-II solutions constructed by the
local $\bar\partial$-problem [13] with reduction [49] are regular.
The proof is based on the theory of generalized
analytic functions (in the Vekua sense). Another
example is the two-dimensional Schrödinger operator
at a fixed negative energy $E_0 < 0$. The potentiality
and reality constraints imply that the potential
is nonsingular for arbitrarily large $T(\lambda)$. But, unfortunately,
the $\bar\partial$-problem with regular data covers only
a part of the space of potentials. In fact, each such
operator possesses a strictly positive real eigenfunction
at the level $E_0$, exponentially growing in all
directions (this also follows from the theory of generalized
analytic functions). The existence of such a function
implies that the whole discrete spectrum is
located above the energy $E_0$, and this gives a
restriction on the potential. (For more details, see
the review by Grinevich (2000).)


Some Explicit Solutions 


The generic $\bar\partial$-problem results in potentials that
cannot be expressed in terms of elementary or
standard special functions. But for degenerate
kernels, a solution of the inverse-spectral problem
can be written explicitly. For example, if


$$R(\lambda, \mu) = \sum_{k=1}^{N} r_k(\lambda) s_k(\mu) \qquad [51]$$


the wave function and solutions can be expressed in
quadratures.
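
Why degenerate kernels lead to explicit answers can be seen on a one-dimensional toy analog. This is an assumption made here purely for illustration, not the planar equation [7]: for a rank-one kernel the integral equation collapses to a single linear equation for one unknown moment. The kernel, the interval, and all names below are hypothetical.

```python
from fractions import Fraction

# Toy equation on [0, 1]:  chi(p) = 1 + integral_0^1 r(p) * s(q) * chi(q) dq,
# with the hypothetical rank-one kernel r(p) = p, s(q) = q.
# Then chi(p) = 1 + a * p, where a = integral_0^1 q * chi(q) dq is a single number.

# Substituting chi back into the definition of a gives  a = 1/2 + a/3.
int_s = Fraction(1, 2)     # integral_0^1 q dq
int_sr = Fraction(1, 3)    # integral_0^1 q * q dq
a = int_s / (1 - int_sr)   # solve the resulting 1 x 1 linear system exactly

def chi(p):
    """Closed-form solution of the toy equation."""
    return 1 + a * p

# Independent check: evaluate the integral by the midpoint rule and
# measure the residual of the integral equation at a few sample points.
n = 2000
mid = [(j + 0.5) / n for j in range(n)]
moment = sum(q * float(chi(q)) for q in mid) / n
residual = max(abs(float(chi(p)) - 1 - p * moment) for p in (0.0, 0.25, 0.5, 1.0))
print(a, residual)
```

For a kernel of rank $N$ the same substitution yields an $N \times N$ linear system for $N$ moments, which is the sense in which the solution is "expressed in quadratures".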

In particular, if all $r_k(\lambda)$ and $s_k(\mu)$ are $\delta$-functions,
$r_k(\lambda) = R_k\delta(\lambda - \lambda_k)$ and $s_k(\mu) = S_k\delta(\mu - \mu_k)$, the
wave function is rational in $\lambda$ and can be expressed
as a rational combination of exponents of $x_k$. If for
some $k$ and $l$, $\lambda_k = \mu_l$, this procedure needs some
regularization. For example, it is possible to assume
that $\delta(\lambda - \lambda_0)/(\lambda - \lambda_0) = 0$.

If for all $k$, $\lambda_k = \mu_k$, the $\bar\partial$-problem generates
solutions rational in $x$ (lumps). It is possible to show
that the Novikov-Veselov real rational solitons for
$E_0 > 0$ are always nonsingular, decay at $\infty$ as
$1/(x^2 + y^2)$, and the potential $u(z)$ has zero scattering
in all directions for the waves with energy $E_0$.


The $\bar\partial$-Problem on Riemann Surfaces


In all examples discussed above, the spectral variable
is defined on the Riemann sphere. It is natural to
generalize this by considering wave functions depending
on a spectral parameter defined on a Riemann
surface of higher genus. Spectral transforms of this
type arise in the theory of localized perturbations of
periodic solutions. Assume that the KP-II potential
$u(x, y)$ has the form


$$u(x, y) = u_0(x, y) + u_1(x, y) \qquad [52]$$


where $u_0(x, y)$ is a real nonsingular finite-gap
potential and $u_1(x, y)$ decays sufficiently fast at
infinity. Denote by $\Psi_0(\gamma, x, y)$ the wave function of
the operator $L_0 = \partial_y - \partial_x^2 + u_0(x, y)$, where $\gamma \in \Gamma$ and
the spectral curve $\Gamma$ is a compact Riemann surface of
genus $g$ with a distinguished point $\infty$. In addition to
an essential singularity at the point $\infty$, the wave
function $\Psi_0(\gamma, x, y)$ has $g$ simple poles at points
$\gamma_1, \ldots, \gamma_g$ and is holomorphic in $\gamma$ outside these
singular points. For a real nonsingular potential, $\Gamma$ is
an M-curve, that is, there exists an antiholomorphic
involution $\tau: \Gamma \to \Gamma$, $\tau\infty = \infty$, whose set of fixed
points forms $g + 1$ ovals $a_0, \ldots, a_g$, with $\infty \in a_0$, $\gamma_k \in a_k$.
The wave function $\Psi(\gamma, x, y)$ of the perturbed
operator $L = \partial_y - \partial_x^2 + u(x, y)$ is defined on the
same spectral curve $\Gamma$, but it is not holomorphic
any more. It has the following properties:


1. At the point $\infty$, the wave function $\Psi(\gamma, x, y)$ has
an essential singularity: $\Psi(\gamma, x, y) = \Psi_0(\gamma, x, y)(1 + o(1))$.

2. In the neighborhoods of the points $\gamma_k$, $\Psi(\gamma, x, y)$
can be written as a product of a continuous
function by a simple pole at $\gamma_k$.

3. The wave function $\Psi(\gamma, x, y)$ satisfies the $\bar\partial$-equation


$$\frac{\partial\Psi(\gamma, x, y, t)}{\partial\bar\gamma} = T(\gamma)\Psi(\tau\gamma, x, y, t) \qquad [53]$$

where the (0, 1)-form $T(\gamma) = t(\gamma)\,d\bar\gamma$ is regular
outside the divisor points $\gamma_k$, and in the neighborhood
of $\gamma_k$ it is possible to define a local coordinate
such that $t(\gamma) = \operatorname{sgn}(\Im\gamma)\, t_1(\gamma)(\bar\gamma - \bar\gamma_k)/(\gamma - \gamma_k)$, where $t_1(\gamma)$
is regular.


A solution of the inverse problem can be obtained 
by using appropriate analogs of Cauchy kernels on 
Riemann surfaces. 


Quasiclassical Limit 


The systems integrable by the $\bar\partial$-method are usually
integrable systems with high-order derivatives.
It is well known that by applying some
limiting procedures to integrable systems one can
construct new completely integrable equations, but
integration methods for these equations are based on
completely different analytic tools. One of the most
important examples is the theory of dispersionless
hierarchies. The limiting procedure for the $\bar\partial$-problem
(the quasiclassical $\bar\partial$-problem) was developed
by Konopelchenko and collaborators. In the KP
case, the quasiclassical limit of the wave function
$\Psi(\lambda, t)$ is assumed to have the following form:


$$\Psi\left(\lambda, \frac{t}{\epsilon}\right) = \chi(\lambda, t, \epsilon) \exp\left(\frac{S(\lambda, t)}{\epsilon}\right) \qquad [54]$$


It is possible to show that the function $S(\lambda, t)$
satisfies a Beltrami-type equation: 


S(A,t) |— S(A, t) 
which is treated as a dispersionless limit of [13]. 


Higher-order corrections were also discussed in the 
literature (see Konopelchenko and Moro (2003)). 


See also: Boundary-Value Problems for Integrable 
Equations; Integrable Systems and Algebraic Geometry; 
Integrable Systems and the Inverse Scattering Method; 
Integrable Systems: Overview; Integrable Systems and 
Discrete Geometry; Riemann-Hilbert Methods in 
Integrable Systems. 
Introduction


In this article we shall briefly outline derived 
categories and their relevance for physics. Derived 
categories (and their enhancements) classify off-shell 
states in a two-dimensional topological field theory 
on Riemann surfaces with boundary known as the 
open-string B model. We briefly review pertinent 
aspects of that topological field theory and its 
relation to derived categories, the Bondal-Kapranov 
enhancement and its relation to the open-string B 
model, as well as B model twists of two-dimensional 
theories known as Landau-Ginzburg models, and 
how information concerning stability of D-branes is 
encoded in this language. We concentrate on more 
physical aspects of derived categories; for a very 
readable short review concentrating on the mathe- 
matics, see, for example, Thomas (2000). 


Sheaves and Derived Categories 
in the Open-String B Model 


Derived categories are mathematical constructions 
which are believed to be related to D-branes in the 
open-string B model. We shall begin by briefly 
reviewing the B model, as well as D-branes. 

The A and B models are two-dimensional topolo- 
gical field theories, closely related to nonlinear 
sigma models, which are supersymmetrizations of 
theories summing over maps from a Riemann 
surface (the world sheet of the string) into some 
"target space" X. In both the A and B models, one 
considers only certain special correlation functions, 
involving correlators closed under the action of a 
nilpotent scalar operator known as the "BRST
operator," $Q$, which is part of the original supersymmetry
transformations. In considering the pertinent
correlation functions, only certain types of
maps contribute. The A model has the properties of
being invariant under complex structure deformations
of the target space X, and its pertinent
correlation functions are computed by summing
over holomorphic maps into the target X. The A
model will not be relevant for us here. The B model 
has the properties of being invariant under Kähler 
moduli of the target X, and its pertinent correlation 
functions are computed by summing over constant 
maps into the target X. In the closed-string B model, 
the states of the theory are counted by the
cohomology groups $H^*(X, \Lambda^* TX)$, where X is constrained
to be Calabi-Yau. The BRST operator $Q$ in
the B model can be identified with $\bar\partial$ for many
purposes. The open-string B model is the same
topological field theory, but now defined on a 
Riemann surface with boundaries. As with all 
open-string theories, we specify boundary conditions 
on the fields, which force the ends of the string to 
live on some submanifold of the target, and we 
associate to the boundaries degrees of freedom 
(known as the Chan-Paton factors) which describe 
a (possibly twisted) vector bundle over the submani- 
fold. In the case of the B model, the submanifold is a 
complex submanifold, and the vector bundle is 
forced to be a holomorphic vector bundle over that 
submanifold. 

To lowest order, that combination of a submanifold
$S$ of $X$, together with a (possibly twisted)
holomorphic vector bundle over it, is a "D-brane" in
the open-string B model. We shall denote the twisted
bundle by $\mathcal{E} \otimes \sqrt{K_S}$, where $K_S$ is the canonical
bundle of $S$, and the $\sqrt{K_S}$ factor is an explicit
incorporation of something known as the Freed-Witten
anomaly. Now, if $i: S \to X$ is the inclusion
map, then to this D-brane we can associate a
sheaf $i_*\mathcal{E}$.

Technically, a sheaf is defined by associating sets, 
or modules, or rings, to each open set on the 
underlying space, together with restriction maps 
saying how data associated to larger open sets 
restricts to smaller open sets, obeying the obvious 
consistency conditions, together with some gluing 
conditions that say how local sections can be 
patched back together. A vector bundle defines a 
sheaf by associating to any open set sections of the
bundle over that open set. Sheaves of the form "$i_*\mathcal{E}$"
look like, intuitively, vector bundles over submanifolds,
with vanishing fibers off the submanifold.
A more detailed discussion of sheaves is beyond the 
scope of this article; see instead, for example, Sharpe 
(2003). 

To “associate a sheaf” means finding a sheaf such 
that physical properties of the D-brane system are 
well modeled by mathematics of the sheaf. (In 
particular, the physical definition of D-brane has, 
on the face of it, nothing at all to do with the 
mathematical definition of a sheaf, so one cannot 
directly argue that they are the same, but one can 
still use one to give a mathematical model of the 
other.) For example, the spectrum of open-string
states in the B model stretched between two
D-branes, associated to sheaves $i_*\mathcal{E}$ and $j_*\mathcal{F}$, turns
out to be calculated by a cohomology group known
as $\mathrm{Ext}^*_X(i_*\mathcal{E}, j_*\mathcal{F})$.

There are many more sheaves not of the form $i_*\mathcal{E}$,
that is, that do not look like vector bundles over 
submanifolds. It is not known in general whether 
they also correspond to (on-shell) D-branes, but in 
some special cases the answer has been worked out. 
For example, structure sheaves of nonreduced 
schemes turn out to correspond to D-branes with 
nonzero nilpotent Higgs vevs. 

For a set of ordinary D-branes, the description 
above suffices. However, more generally one would 
like to describe collections of D-branes and anti- 
D-branes, and tachyons. An anti-D-brane has all 
the same physical properties as an ordinary D- 
brane, modulo the fact that they try to annihilate 
each other. The open-string spectrum between 
coincident D-branes and anti-D-branes contains 
tachyons. One can give an (off-shell) vacuum 
expectation value to such tachyons, and then the 
unstable brane-antibrane-tachyon system will
evolve to some other, usually simpler, configuration.
For example, given a single D-brane wrapped
on a curve, with trivial line bundle, and an anti-D-brane
wrapped on the same curve, with line bundle
$\mathcal{O}(-1)$, and a nonzero tachyon $\mathcal{O}(-1) \to \mathcal{O}$, then
one expects that the system will dynamically evolve
to a smaller D-brane sitting at a point on the curve.

Now, one would like to find some mathematics 
that describes such systems, and gives information 
about the endpoints of their evolution. Techni- 
cally, one would like to classify universality classes 
of world-sheet boundary renormalization group 
flow. 

It has been conjectured that derived categories of 
sheaves provide such a classification. To properly 
explain derived categories is well beyond the scope 


of this article (see instead the “Further reading” 
section at the end), but we shall give a short outline 
below. 

Mathematically, derived categories of sheaves
concern complexes of sheaves, that is, sets of
sheaves $\mathcal{E}_i$ together with maps $d_i: \mathcal{E}_i \to \mathcal{E}_{i+1}$,

$$\cdots \xrightarrow{d_{i-1}} \mathcal{E}_i \xrightarrow{d_i} \mathcal{E}_{i+1} \xrightarrow{d_{i+1}} \mathcal{E}_{i+2} \xrightarrow{d_{i+2}} \cdots$$

such that $d_{i+1} \circ d_i = 0$. A category is defined by a
collection of "objects" together with maps between
the objects, known as morphisms. In a derived
category of coherent sheaves, the objects are complexes
of sheaves, and the maps are equivalence
classes of maps between complexes.
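
The condition $d_{i+1} \circ d_i = 0$ and the resulting cohomology can be made concrete in a toy setting. The sketch below assumes, purely for illustration, that sheaves are replaced by finite-dimensional rational vector spaces and the maps $d_i$ by matrices; the helper names and the two sample complexes are hypothetical and not taken from the article.

```python
from fractions import Fraction

def matmul(a, b):
    """Matrix product a . b for matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def rank(m):
    """Rank over the rationals by exact Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in m]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        piv = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def is_complex(diffs):
    """Check that consecutive differentials compose to zero."""
    return all(v == 0
               for i in range(len(diffs) - 1)
               for row in matmul(diffs[i + 1], diffs[i])
               for v in row)

def cohomology_dims(dims, diffs):
    """dims[i] is the dimension of term i; diffs[i] maps term i to term i+1."""
    ranks = [rank(d) for d in diffs]
    out = []
    for i, n in enumerate(dims):
        rank_out = ranks[i] if i < len(diffs) else 0
        rank_in = ranks[i - 1] if i > 0 else 0
        out.append(n - rank_out - rank_in)   # dim ker(d_i) - dim im(d_{i-1})
    return out

dims = [1, 2, 1]
exact = [[[1], [1]], [[1, -1]]]     # Q -> Q^2 -> Q, an exact sequence
not_exact = [[[1], [1]], [[0, 0]]]  # same terms, second map replaced by zero

print(is_complex(exact), cohomology_dims(dims, exact))
print(is_complex(not_exact), cohomology_dims(dims, not_exact))
```

Both examples have the same terms but different cohomology, which is a small reminder that a complex is more than the list of its terms.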

Physically, if the complex consists of locally free
sheaves (equivalently, vector bundles), then we can
associate a brane/antibrane/tachyon system, by identifying
the $\mathcal{E}_i$ for $i$ even, say, with D-branes, and the
$\mathcal{E}_i$ for $i$ odd with anti-D-branes. If the $\mathcal{E}_i$ are all
locally free sheaves, then there are tachyons between
the branes and antibranes, and we can identify the
$d_i$'s with those tachyons. In the open-string world-sheet
theory, giving a tachyon a vacuum expectation
value modifies the BRST operator $Q$, and a necessary
condition for the new theory to still be a topological
field theory is that $Q^2 = 0$, a condition which turns
out to imply that $d_{i+1} \circ d_i = 0$.

To re-create the structure of a derived category, 
we need to impose some equivalence relations. To 
see what sorts of equivalence relations one would 
like to impose, note the following. Physically, we 
would like to identify, for example, a configuration 
consisting of a brane, an antibrane, and a tachyon, 
which we can describe as a complex


$$\mathcal{O}(-D) \to \mathcal{O}$$
with a one-element complex
$$\mathcal{O}_D$$


corresponding to the D-brane which we believe is 
the endpoint of the evolution of the brane/antibrane 
configuration. 

One natural mathematical way to create identifications
of this form is to identify complexes that
differ by "quasi-isomorphisms," meaning a set of
maps $(f^n: C^n \to D^n)$ compatible with the $d$'s, and
inducing an isomorphism $\tilde f^n: H^n(C) \cong H^n(D)$ on
the cohomologies of the complexes. In particular,
in the example above, there is a natural set of maps


$$\begin{array}{ccc} \mathcal{O}(-D) & \longrightarrow & \mathcal{O} \\ \downarrow & & \downarrow \\ 0 & \longrightarrow & \mathcal{O}_D \end{array}$$

that define a quasi-isomorphism. More generally, in 
homological algebra, one typically does computations 
by replacing ordinary objects with projective or 
injective resolutions, that is, complexes with special 
properties, in which the desired computation 
becomes trivial, and defining the result for the 
original object to be the same as the result for the 
resolution. To formalize this procedure, one would 
like a mathematical setup in which objects and their 
projective and injective resolutions are isomorphic. 
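
A linear-algebra caricature of the quasi-isomorphism between $\mathcal{O}(-D) \to \mathcal{O}$ and $\mathcal{O}_D$ is sketched below. It assumes, only for illustration, that the sheaves are replaced by rational vector spaces: a two-term complex with an injective differential stands in for $\mathcal{O}(-D) \to \mathcal{O}$, and its one-dimensional cokernel stands in for $\mathcal{O}_D$. The matrices and names are hypothetical.

```python
from fractions import Fraction

def matmul(a, b):
    """Matrix product a . b for matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def rank(m):
    """Rank over the rationals by exact Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in m]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        piv = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

# Complex C:  Q --d--> Q^2  in degrees -1, 0 (toy stand-in for O(-D) -> O).
# Complex D:  Q in degree 0 only (toy stand-in for O_D).
d = [[1], [0]]    # injective differential of C
f0 = [[0, 1]]     # degree-0 component of the chain map C -> D; the degree -1 component is zero

# Compatibility with the differentials: D has no term in degree -1, so f0 . d must vanish.
compatible = all(v == 0 for row in matmul(f0, d) for v in row)

h_minus1_C = 1 - rank(d)   # dim ker d
h0_C = 2 - rank(d)         # dim coker d
h0_D = 1
# f0 kills the image of d, so the induced map H^0(C) -> H^0(D) has the same rank as f0.
induced_rank = rank(f0)

is_quasi_iso = compatible and h_minus1_C == 0 and h0_C == h0_D == induced_rank
print(compatible, h_minus1_C, h0_C, is_quasi_iso)
```

The chain map itself is not invertible (it goes from a complex with three dimensions in total to one with a single dimension), yet it is an isomorphism on cohomology, which is exactly the situation that the localization procedure is designed to handle.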

However, to define an equivalence relation, one 
usually needs an isomorphism, and the quasi- 
isomorphisms above are not, in general, isomorphisms. 
Creating an equivalence from nonisomorphisms, 
to resolve this problem, can be done through a 
process known as “localization” (generalizing the 
notion of localization in commutative algebra). 
The resulting equivalence relations on maps 
between complexes define the derived category. 

The derived category is a category whose objects
are complexes, and whose morphisms $C^\bullet \to D^\bullet$ are
equivalence classes of pairs $(s, t)$, where $s: G^\bullet \to C^\bullet$ is
a quasi-isomorphism between $C^\bullet$ and another complex
$G^\bullet$, and $t: G^\bullet \to D^\bullet$ is a map of complexes. We
take two such pairs $(s, t)$, $(s', t')$ to be equivalent if
there exists another pair $(r, h)$ between the auxiliary
complexes $G^\bullet$, $G'^\bullet$, making the obvious diagram
commute. This is, in a nutshell, what is meant by
localization, and by working with such equivalence 
classes, this allows us to formally invert maps that 
are otherwise noninvertible. (We encourage the 
reader to consult the “Further reading" section for 
more details.) 

Mathematically, this technology gives a very 
elegant way to rethink, for example, homological 
algebra. There is a notion of a derived functor, a 
special kind of functor between derived categories, 
and notions from homological algebra such as Ext 
and Tor can be re-expressed as cohomologies of the 
image complexes under the action of a derived 
functor, thus replacing cohomologies with
complexes. 

Physically, looking back at the physical realization 
of complexes, we see a basic problem: different 
representatives of (isomorphic) objects in the derived 
category are described by very different physical 
theories. For example, the sheaf $\mathcal{O}_D$ corresponds to a
single D-brane, defined by a two-dimensional
boundary conformal field theory (CFT), whereas
the brane/antibrane/tachyon collection $\mathcal{O}(-D) \to \mathcal{O}$
is defined by a massive nonconformal two-dimensional
theory. These are very different physical
theories. If we want "localization on quasi-isomorphisms"
to happen in physics, we have to
explain which properties of the physical theories we
are interested in, because clearly the entire physical
theories are not and cannot be isomorphic.

Although the entire physical theories are not 
isomorphic, we can hope that under renormalization 
group flow, the theories will become isomorphic. 
That is certainly the physical content of the statement 
that the brane/antibrane system $\mathcal{O}(-D) \to \mathcal{O}$ should
describe the D-brane corresponding to $\mathcal{O}_D$: after
world-sheet boundary renormalization group flow, 
the nonconformal two-dimensional theory describing 
the brane/antibrane system becomes a CFT describing 
a single D-brane. 

More globally, this is the general prescription for 
finding physical meanings of many categories: we 
can associate physical theories to particular types 
of representatives of isomorphism classes of 
objects, and then although distinct representatives 
of the same object may give rise to very different 
physical theories, those physical theories at least lie 
in the same universality class of world-sheet 
renormalization group flow. In other words, 
(equivalence classes of) objects are in one-to-one 
correspondence with universality classes of physical 
theories. 

Showing such a statement directly is usually not 
possible — it is usually technically impractical to 
follow renormalization group flow explicitly. There 
is no symmetry reason or other basic physics reason
why renormalization group flow must respect quasi- 
isomorphism. The strongest constraint that is clearly 
applied by physics is that renormalization group 
flow must preserve D-brane charges (Chern char- 
acters, or more properly, K-theory), but objects in a 
derived category contain much more information 
than that. 

However, indirect tests can be performed, and 
because many indirect tests are satisfied, the result is 
generally believed. 

The reader might ask why it is not more efficient 
to just work with the cohomology complexes
$H^*(C)$ themselves, rather than the original complexes.
One reason is that the original complexes
contain more information than the cohomology:
passing to cohomology loses information. For
example, there exist examples of complexes that 
have the same cohomology, yet are not quasi- 
isomorphic, and so are not identified within the 
derived category, and so physically are believed 
to lie in different universality classes of boundary 
renormalization group flow. 

Another motivation for relating physics to derived 
categories is Kontsevich's approach to mirror sym- 
metry. Mirror symmetry relates pairs of Calabi-Yau 
manifolds, of the same dimension, in a fashion such 
that easy classical computations in one Calabi-Yau
are mapped to difficult “quantum” computations
involving sums over holomorphic curves in the other 
Calabi-Yau. Because of this property, mirror sym- 
metry has proved a fertile ground for algebraic 
geometers to study. Kontsevich proposed that mirror 
symmetry should be understood as a relation 
between derived categories of coherent sheaves on 
one Calabi-Yau and derived Fukaya categories on 
the other Calabi-Yau. At the time he made this 
proposal, no one had any idea how either could be 
realized in physics, but since that time, physicists 
have come to believe that Kontsevich was secretly 
talking about D-branes in the A and B models. 


Bondal-Kapranov Enhancements 


Mathematically, derived categories are not quite as
ideal as one would like. For example, the cone 
construction used in triangulated categories does not 
behave functorially — the cone depends upon the 
representative of the equivalence class defining an 
object in a derived category, and not just the object 
itself. 

Physically, our discussion of brane/antibrane 
systems was not the most general possible. One 
can give vacuum expectation values to more general 
vertex operators, not just the tachyons. 

Curiously, these two issues solve each other. By 
incorporating a more general class of boundary vertex 
operators, one realizes a more general mathematical 
structure, due to Bondal and Kapranov, which repairs 
many of the technical deficiencies of ordinary derived 
categories. Ordinary complexes are replaced by gen- 
eralized complexes in which arrows can map between 
non-neighboring elements of the complex. Schemati- 
cally, the BRST operator is deformed by boundary 
vacuum expectation values to the form 


$$Q = \bar{\partial} + \sum_a \phi_a$$


and demanding that the BRST operator square to
zero implies that


$$\sum_a \bar{\partial}\phi_a + \sum_{a,b} \phi_b \circ \phi_a = 0$$


which is the same as the condition for a generalized 
complex. Note that for ordinary complexes, the 
condition above factors into 


$$\bar{\partial}\phi_n = 0, \qquad \phi_{n+1} \circ \phi_n = 0$$


which yields an ordinary complex of sheaves 
(Figure 1). 


Figure 1 Example of generalized complex. Each arrow is
labeled by the degree of the corresponding vertex operator.
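
As a finite-dimensional caricature of the ordinary-complex case, the following Python sketch checks that a total operator assembled from two maps squares to zero exactly when their composition vanishes. It assumes constant matrices (so the $\bar{\partial}$ condition is automatic); the spaces, the maps `phi0`, `phi1_good`, `phi1_bad`, and the helper names are hypothetical illustrations, not objects taken from the text.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def total_operator(phi0, phi1):
    """Assemble Q = phi0 + phi1 acting on C^1 (+) C^2 (+) C^1.

    phi0 is a 2x1 matrix (C^1 -> C^2), phi1 is a 1x2 matrix (C^2 -> C^1).
    """
    q = [[0] * 4 for _ in range(4)]
    for i in range(2):
        q[1 + i][0] = phi0[i][0]   # phi0 sends slot 0 into slots 1..2
        q[3][1 + i] = phi1[0][i]   # phi1 sends slots 1..2 into slot 3
    return q


def squares_to_zero(q):
    """Return True when Q composed with itself is the zero matrix."""
    return all(entry == 0 for row in matmul(q, q) for entry in row)


phi0 = [[2], [3]]        # hypothetical map C^1 -> C^2
phi1_good = [[-3, 2]]    # phi1 . phi0 = -6 + 6 = 0
phi1_bad = [[1, 1]]      # phi1 . phi0 = 5, so Q^2 is nonzero

print(squares_to_zero(total_operator(phi0, phi1_good)))
print(squares_to_zero(total_operator(phi0, phi1_bad)))
```

The only nonzero block of $Q^2$ is the composition of the two maps, which is why the check reduces to the complex condition. Arrows between non-neighboring terms, as in a generalized complex, are not modeled here.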


Landau-Ginzburg Models 


So far we have described how derived categories are 
relevant to geometric compactifications, that is, 
sigma models on Calabi-Yau manifolds. However, 
there are also “nongeometric” theories — CFTs that 
do not come from sigma models on manifolds, of 
which Landau-Ginzburg models and their orbifolds 
are prominent examples. Landau-Ginzburg models 
can also be twisted into topological field theories, 
and the B-type topological twist of (an orbifold of) a 
Landau-Ginzburg model is believed to be iso- 
morphic, as a topological field theory, to the B 
model obtained from a nonlinear sigma model, of 
the form we outlined earlier. Landau-Ginzburg 
models have a very different form than nonlinear 
sigma models, and so sometimes there can be 
practical computational advantages to working 
with one rather than the other. 

A Landau-Ginzburg model is an ungauged sigma 
model with a nonzero superpotential (a holo- 
morphic function over the target space that defines 
a bosonic potential and Yukawa couplings). (In 
"typical" cases, the target space is a vector space.) 
Because of the superpotential, a Landau-Ginzburg 
model is a massive theory — not itself a CFT, but 
many Landau-Ginzburg models are believed to flow 
to CFTs under the renormalization group. 

In formulating open strings based on Landau- 
Ginzburg models, naive attempts fail because of 
something known as the Warner problem: if the 
superpotential is nonzero, then the obvious ways to 
try to define the theory on a Riemann surface with 
boundary have the undesirable property that the 
supersymmetry transformations only close up to a 
nonzero boundary term, proportional to derivatives 
of the superpotential. In order to find a description 
of open strings in which the supersymmetry trans- 
formations close, one must take a very nonobvious 
formulation of the boundary data. Specifically, to 
solve the Warner problem, one is led to work with 
pairs of matrices whose product is proportional to 
the superpotential. 

This method of solving the Warner problem is 
known as matrix factorization, and D-branes in this 
theory are defined by the factorization chosen, that 
is, the choice of pairs of matrices. In simple cases, 
we can be more explicit as follows. Choose a set of
polynomials $F_\alpha$, $G_\alpha$ such that the Landau-Ginzburg
superpotential $W$ is given by


$$W = \sum_\alpha F_\alpha G_\alpha + \text{constant}.$$


The $F_\alpha$ and $G_\alpha$ are used to define the boundary
action: the $F$'s appear as part of the boundary
superpotential and the $G$'s appear as part of the
supersymmetry transformations of boundary Fermi
multiplets. The $F_\alpha$ and $G_\alpha$, that is, the factorization
of $W$, determine the D-brane in the Landau-Ginzburg
theory. We can also think of having a
pair of holomorphic vector bundles $\mathcal{E}_1, \mathcal{E}_2$ of the
same rank, and interpret $F$ and $G$ as holomorphic
sections of $\mathcal{E}_1^{\vee} \otimes \mathcal{E}_2$ and $\mathcal{E}_2^{\vee} \otimes \mathcal{E}_1$, respectively, obeying
$FG \propto W \cdot \mathrm{Id}$ and $GF \propto W \cdot \mathrm{Id}$, up to additive
constants.
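
To make the matrix condition concrete, here is a minimal Python sketch of a rank-2 factorization. The superpotential $W = x^2 + y^2$ and the matrices `F`, `G` are hypothetical choices made only for illustration; the check evaluates both products at integer sample points and compares them with $W \cdot \mathrm{Id}$.

```python
import random


def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def W(x, y):
    """Hypothetical superpotential W = x^2 + y^2 (not from the text)."""
    return x * x + y * y


def F(x, y):
    return [[x, y], [-y, x]]


def G(x, y):
    return [[x, -y], [y, x]]


def is_factorization(x, y):
    """Check F G = G F = W * Id at the point (x, y)."""
    w = W(x, y)
    target = [[w, 0], [0, w]]
    return (matmul2(F(x, y), G(x, y)) == target
            and matmul2(G(x, y), F(x, y)) == target)


random.seed(0)
samples = [(random.randint(-9, 9), random.randint(-9, 9)) for _ in range(20)]
all_ok = all(is_factorization(x, y) for x, y in samples)
print(all_ok)
```

Integer sample points keep the comparison exact; a symbolic check would be needed to establish the identity as polynomials.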

Although a Landau-Ginzburg model is not the 
same thing as a sigma model on a Calabi-Yau, 
orbifolds of Landau-Ginzburg models are often on 
the same Kähler moduli space. Perhaps the most
famous example of this relates sigma models on
quintic hypersurfaces in $\mathbb{P}^4$ to a $\mathbb{Z}_5$ orbifold of a
Landau-Ginzburg model over $\mathbb{C}^5$ with five chiral
superfields $x_1, x_2, x_3, x_4, x_5$, and a superpotential of
the form


$$W = x_1^5 + x_2^5 + x_3^5 + x_4^5 + x_5^5 + \psi\, x_1 x_2 x_3 x_4 x_5$$


for $\psi$ a complex number, corresponding to the
equation of the degree-5 hypersurface in $\mathbb{P}^4$. The
(complexified) Kähler moduli space in this example
is a $\mathbb{P}^1$, with the sigma model on the quintic at one
pole, the zero-volume limit of the sigma model along
the equator, and the Landau-Ginzburg orbifold at
the opposite pole.
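
The orbifold only makes sense if the superpotential is invariant under the group action. The following Python sketch checks this numerically, assuming that the generator of $\mathbb{Z}_5$ multiplies all five fields by a common fifth root of unity (a standard choice, but an assumption here, since the text does not spell out the action). The sample point and the value of `psi` are arbitrary.

```python
import cmath


def quintic_W(x, psi):
    """W = sum of x_i^5 plus psi times the product of all x_i."""
    product = 1
    for xi in x:
        product *= xi
    return sum(xi ** 5 for xi in x) + psi * product


omega = cmath.exp(2j * cmath.pi / 5)        # primitive fifth root of unity
x = [0.3 + 0.1j, -1.2j, 0.7, 1.1 - 0.4j, -0.5 + 0.9j]   # arbitrary point
psi = 0.25 - 0.5j                            # arbitrary deformation parameter

rotated = [omega * xi for xi in x]           # assumed Z_5 generator
difference = abs(quintic_W(rotated, psi) - quintic_W(x, psi))
print(difference)
```

Both the fifth powers and the five-fold product pick up a factor of $\omega^5 = 1$, so the difference is zero up to floating-point rounding.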

Since the closed-string topological B model is 
independent of Kähler moduli, and the sigma model
on the quintic and the Landau-Ginzburg orbifold
above lie on the same Kähler moduli space, one
would expect them both to have the same spectrum 
of D-branes, and indeed this is believed to be true. 


Pi-Stability 


So far we have discussed D-branes in the topological 
B model, a topological twist of a physical sigma 
model. If we untwist back to a physical sigma 
model, then the stability of those D-branes becomes 
an issue. 

To begin to understand what we mean by stability 
in this context, consider a set of N D-branes 
wrapped on, say, a K3 surface, at large radius (so 
that world-sheet instanton corrections are small).
On the world volume of the D-branes, we have a
rank-N vector bundle, and in the physical theory on
that world volume we have a consistency condition
for supersymmetric vacua, that the vector bundle be
“Mumford-Takemoto stable.” To understand what
is meant by this condition on a Kähler manifold, let
$\omega$ denote the Kähler form, and define the “slope” $\mu$
of a vector bundle $\mathcal{E}$ on a manifold $X$ of complex
dimension $n$ to be given by


$$\mu(\mathcal{E}) = \frac{\int_X \omega^{n-1} \wedge c_1(\mathcal{E})}{\mathrm{rank}\,\mathcal{E}}.$$


Then, we say that $\mathcal{E}$ is
(semi-)stable if for all subsheaves $\mathcal{F}$ satisfying
certain consistency conditions, $\mu(\mathcal{F}) \,(\leq)\, < \mu(\mathcal{E})$.

Since the slope of a bundle depends upon the
Kähler form, whether a given bundle is Mumford-
Takemoto stable depends upon the metric. In
general, on a Kähler manifold, the Kähler cone
breaks up into subcones, with a different moduli 
space of (stable) holomorphic vector bundles in each 
subcone. 
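
The dependence on the Kähler class can be seen in a toy computation. The sketch below assumes $X = \mathbb{P}^1 \times \mathbb{P}^1$ (complex dimension 2, so the slope is $\int_X \omega \wedge c_1$ divided by the rank), with classes written as pairs $(p, q)$ meaning $p H_1 + q H_2$ and intersection numbers $H_1^2 = H_2^2 = 0$, $H_1 \cdot H_2 = 1$. The rank-2 bundle with $c_1 = 0$ and its candidate sub-line-bundle with $c_1 = H_1 - H_2$ are hypothetical; none of these choices come from the text.

```python
from fractions import Fraction


def slope(c1, rank, kahler):
    """Slope on P^1 x P^1: (omega . c1) / rank.

    c1 = (p, q) stands for p*H1 + q*H2, kahler = (a, b) for a*H1 + b*H2,
    with H1.H1 = H2.H2 = 0 and H1.H2 = 1.
    """
    p, q = c1
    a, b = kahler
    return Fraction(a * q + b * p, rank)


def violates_stability(sub, bundle, kahler):
    """True if the candidate subsheaf has slope >= the slope of the bundle."""
    return slope(sub[0], sub[1], kahler) >= slope(bundle[0], bundle[1], kahler)


bundle = ((0, 0), 2)       # hypothetical rank-2 bundle with c1 = 0
sub = ((1, -1), 1)         # hypothetical sub-line-bundle with c1 = H1 - H2

for kahler in [(2, 1), (1, 1), (1, 2)]:
    print(kahler, slope(sub[0], sub[1], kahler),
          violates_stability(sub, bundle, kahler))
```

In this toy setup the slope of the subsheaf is $b - a$, so the ray $a = b$ acts as a wall: on one side the subsheaf does not obstruct stability, on the other it does.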

This is a mathematical notion of stability, but it also 
corresponds to physical stability, at least in a regime in 
which quantum corrections are small. If a given 
bundle is only stable in a proper subset of the Kähler
cone, then when it reaches the boundary of the 
subcone in which it is stable, the gauge field config- 
uration that satisfies the Donaldson-Uhlenbeck-Yau 
partial differential equation splits into a sum of two 
separate bundles. In a heterotic string compactifica- 
tion, this leads to a low-energy enhanced U(1) gauge 
symmetry and D-terms which realize the change in 
moduli space. In D-branes, this means the formerly 
bound state of D-branes (described by an irreducible 
holomorphic vector bundle) becomes only marginally 
bound; a decay becomes possible. 

Pi-stability is a proposal for generalizing the 
considerations above to D-branes no longer wrap- 
ping the entire Calabi-Yau, and including quantum 
corrections. 

In order to define pi-stability, we must first
introduce a notion of grading $\varphi$ of a D-brane.
Specifically, for a D-brane wrapped on the entire
Calabi-Yau $X$ with holomorphic vector bundle $\mathcal{E}$,
the grading is defined as the mirror to the expression
$\int_X \mathrm{ch}(\mathcal{E}) \wedge \Pi$, where $\Pi$ encodes the periods. Close to
the large-radius limit, this has the form:


$$\varphi(\mathcal{E}) = \frac{1}{\pi}\,\mathrm{Im}\,\log \int_X \exp(B + i\omega) \wedge \mathrm{ch}(\mathcal{E}) \wedge \sqrt{\mathrm{td}(TX)} + \cdots$$


where $B$ is a 2-form, the “$B$ field.” As defined, $\varphi$ is
clearly $S^1$-valued; however, we must choose a
particular sheet of the log Riemann surface, to
obtain an $\mathbb{R}$-valued function.

This notion of grading of D-branes is an ansatz, 
introduced as part of the definition of pi-stability. 
Physically, it is believed that the difference in grading 
between two D-branes corresponds to the fractional 
charge of the boundary-condition-changing vacuum 
between the two D-branes, though we know of no 
convincing first-principles derivation of that state- 
ment. In particular, unlike closed-string computa- 
tions, the degree of the Ext group element 
corresponding to a particular boundary R-sector 
state is not always the same as the $U(1)_R$ charge;
for example, it is often determined by the $U(1)_R$
charge minus the charge of the vacuum. The grading
gives us the mathematical significance of that vacuum
charge. This mismatch between Ext degrees and
$U(1)_R$ charges is necessary for the grading to make
sense: Ext group degrees are integral, after all, yet we 
want the grading to be able to vary continuously, so 
the grading had better not be the same as an Ext 
group degree. 

Given an $\mathbb{R}$-valued function from a particular
definition of log in the definition of $\varphi$ above, the
statement of pi-stability is then that for all
subsheaves $\mathcal{F}$, as in the statement of Mumford-
Takemoto stability,


$$\varphi(\mathcal{F}) \leq \varphi(\mathcal{E}).$$


Before trying to understand the physical meaning 
of $\varphi$, or the extension of these ideas to derived
categories, let us try to confirm that Mumford- 
Takemoto stability emerges as a limit of pi-stability. 

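For simplicity, suppose that $X$ is a Calabi-Yau
3-fold. Then, for large Kähler form $\omega$, we can
expand $\varphi(\mathcal{E})$ as


$$\varphi(\mathcal{E}) \approx \mathrm{const} + \frac{3}{\pi}\,\frac{\int_X \omega^2 \wedge c_1(\mathcal{E})}{(\mathrm{rk}\,\mathcal{E}) \int_X \omega^3} + \cdots$$


where the constant is independent of $\mathcal{E}$.
Thus, we see that to leading order in the Kähler
form $\omega$, $\varphi(\mathcal{F}) < \varphi(\mathcal{E})$ if and only if


$$\frac{\int_X \omega^2 \wedge c_1(\mathcal{F})}{\mathrm{rk}\,\mathcal{F}} < \frac{\int_X \omega^2 \wedge c_1(\mathcal{E})}{\mathrm{rk}\,\mathcal{E}}$$


which is precisely the statement of Mumford-
Takemoto stability on a 3-fold $X$.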

One can define a notion of (classical) stability for 
more general sheaves, but what one wants is to 
apply pi-stability to derived categories, not just 
sheaves. 


However, there is a technical problem that limits 
such an extension. Specifically, in a derived cate- 
gory, there is no meaningful notion of “subobject.” 
Thus, a notion of stability formulated in terms of 
subobjects cannot be immediately applied to derived 
categories. There are two (equivalent) workarounds 
to this issue that have been discussed in the math 
and physics literatures, which can be briefly sum- 
marized as follows: 


1. One workaround involves picking a subcategory 
of the derived category that does allow you to 
make sense of subobjects. Such a structure is 
known, loosely, as a “T-structure,” and so one 
can imagine formulating stability by first picking 
a T-structure, then specifying a slope function on
the elements of the subcategory picked out by the
T-structure.

2. Another (equivalent) workaround is to work with 
a notion of “relative stability." Instead of speak- 
ing about whether a D-brane is stable against 
decay into any other object, one only speaks 
about whether it is stable against decay into pairs 
of specified objects. 


In this fashion, one can make sense of pi-stability for 
derived categories. 


See also: Fourier-Mukai Transform in String Theory; 
Mirror Symmetry: A Geometric Survey; Spectral 
Sequences; Superstring Theories; Topological Quantum 
Field Theory: Overview.


Introduction


The theory of random point fields has its origins in 
such diverse areas of science as life tables, particle 
physics, population processes, and communication 
engineering. A standard reference to the subject is 
the monograph by Daley and Vere-Jones (1988). 

This article is concerned with a special class of 
random point fields, introduced by Macchi in the mid- 
1970s. The model that Macchi considered describes 
the statistical distribution of a fermion system in 
thermal equilibrium. Macchi proposed to call the new 
class of random point processes the fermion random 
point processes. The characteristic property of this 
family of random point processes is the condition that 
k-point correlation functions have the form of deter- 
minants built from a correlation kernel. This implies 
that the particles obey the Pauli exclusion principle. 
Until the mid-1990s, fermion random point processes 
attracted only a limited interest in mathematics and 
physics communities, with the exception of two 
important works by Spohn (1987) and Costin- 
Lebowitz (1995). This situation changed dramatically 
at the end of the last century, as the subject greatly 
benefited from the newly discovered connections to 
random matrix theory, representation theory, random 
growth models, combinatorics, and number theory. 
Things are rapidly developing at the moment. Even the
terminology is not yet set in stone. Many experts
currently use the term “determinantal random point 
fields” instead of “fermion random point fields.” We 
follow this trend in our article. 

This article is intended as a short introduction to the 
subject. The next section builds a mathematical 
framework and gives a formal mathematical definition 
of the determinantal random point fields. Then we 
discuss examples of determinantal random point fields 
from quantum mechanics, random matrix theory, 
random growth models, combinatorics, and represen- 
tation theory. This is followed by a discussion of the 
ergodic properties of translation-invariant determi- 
nantal random point fields. We discuss the Gibbsian 
property of determinantal random point fields. 
Central-limit theorem type results for the counting 
functions and similar linear statistics are also dis- 
cussed. The final section is devoted to some general- 
izations of determinantal point fields, namely 
immanantal and Pfaffian random point fields. 


Mathematical Framework 


We start by building a standard mathematical
framework for the theory of random point processes.
Let $E$ be a one-particle space and $X$ a space
of finite or countable configurations of particles in $E$.
In general, $E$ can be a separable Hausdorff space.
However, for our purposes it suffices to consider
$E = \mathbb{R}^d$ or $E = \mathbb{Z}^d$. We usually assume in this section
that $E = \mathbb{R}^d$, with the understanding that all constructions
can be easily extended to the discrete case.
We assume that each configuration $\xi = (x_i)$, $x_i \in E$,
$i \in \mathbb{Z}^1$ (or $i \in \mathbb{Z}^1_+$ for $d > 1$), is locally finite. In other
words, for every compact $K \subset E$, the number of
particles in $K$, $\#_K(\xi) = \#(x_i \in K)$, is finite.

In order to introduce a $\sigma$-algebra of measurable
subsets of $X$, we first define the cylinder sets.
Let $B \subset E$ be a bounded Borel set and let $n \geq 0$. We
call $C_n^B = \{\xi \in X : \#_B(\xi) = n\}$ a cylinder set. We define
$\mathcal{B}$ as a $\sigma$-algebra generated by all cylinder sets (i.e., $\mathcal{B}$
is the minimal $\sigma$-algebra that contains all $C_n^B$).


Definition 1 A random point field is a triplet
$(X, \mathcal{B}, \mathrm{Pr})$, where $\mathrm{Pr}$ is a probability measure on $(X, \mathcal{B})$.


It was observed in the 1960s-1970s (see, e.g., Lenard
(1973, 1975)) that in many cases the most convenient
way to define a probability measure on $(X, \mathcal{B})$ is via the
point correlation functions. Let $E = \mathbb{R}^d$, equipped with
the underlying Lebesgue measure.


Definition 2 A locally integrable function $\rho_k: E^k \to
\mathbb{R}^1_+$ is called a k-point correlation function of the
random point field $(X, \mathcal{B}, \mathrm{Pr})$ if, for any disjoint
bounded Borel subsets $A_1, \ldots, A_m$ of $E$ and for any
$k_i \in \mathbb{Z}^1_+$, $i = 1, \ldots, m$, $\sum_{i=1}^m k_i = k$, the following formula
holds:


$$\mathbb{E} \prod_{i=1}^m \frac{(\#_{A_i})!}{(\#_{A_i} - k_i)!} = \int_{A_1^{k_1} \times \cdots \times A_m^{k_m}} \rho_k(x_1, \ldots, x_k)\, dx_1 \cdots dx_k \qquad [1]$$


where by $\mathbb{E}$ we denote the mathematical expectation
with respect to $\mathrm{Pr}$. In particular, $\rho_1(x)$ is the particle
density, since


$$\mathbb{E}\, \#_A = \int_A \rho_1(x)\, dx$$


for any bounded Borel $A \subset E$. In general,
$\rho_k(x_1, \ldots, x_k)$ has the following probabilistic
interpretation. Let $[x_i, x_i + dx_i]$, $i = 1, \ldots, k$, be infinitesimally
small boxes around $x_i$; then $\rho_k(x_1, x_2, \ldots, x_k)\,
dx_1 \cdots dx_k$ is the probability to find a particle in each
of these boxes.

In the discrete case $E = \mathbb{Z}^d$, the construction of a
random point field is very similar. The probability
space $X$ and the $\sigma$-algebra $\mathcal{B}$ are constructed
essentially in the same way as before. Moreover, in
the discrete case, the set of the countable configurations
of particles can be identified with the set of all
subsets of $E$. Therefore, $X = \{0,1\}^E$, and $\mathcal{B}$ is generated
by the events $\{C_x, x \in E\}$, where $C_x = \{x \in \xi\}$. The
k-point correlation function $\rho_k(x_1, \ldots, x_k)$ is then just
the probability that a configuration $\xi$ contains the
sites $x_1, \ldots, x_k$. In other words, $\rho_k(x_1, \ldots, x_k) =
\mathrm{Pr}\left(\bigcap_{i=1}^k C_{x_i}\right)$. In particular, the one-point correlation
function $\rho_1(x)$, $x \in \mathbb{Z}^d$, is the probability that a
configuration contains the site $x$, that is,
$\rho_1(x) = \mathrm{Pr}(C_x)$.

The problem of the existence and the uniqueness
of a random point field defined by its
correlation functions was studied by Lenard
(1973, 1975). It is not surprising that Lenard's
papers revealed many parallels to the classical
moment problem. In particular, the random point
field is uniquely defined by its correlation functions
if the distribution of random variables $\{\#_A\}$
for bounded Borel sets $A$ is uniquely determined
by its moments.

In this article we study a special class of random 
point fields introduced by Macchi (1975). To 
shorten the exposition, we give the definitions only 
in the continuous case $E = \mathbb{R}^d$. In the discrete case,
the definitions are essentially the same. 

Let $K: L^2(\mathbb{R}^d) \to L^2(\mathbb{R}^d)$ be an integral locally
trace-class operator. The last condition means that
for any compact $B \subset \mathbb{R}^d$ the operator $K \chi_B$ is trace
class, where $\chi_B(x)$ is an indicator of $B$. The kernel of
$K$ is defined up to a set of measure zero in $\mathbb{R}^d \times \mathbb{R}^d$.
For our purposes, it is convenient to choose it in
such a way that for any bounded measurable $B$ and
any positive integer $n$


$$\mathrm{Tr}\left((\chi_B K \chi_B)^n\right) = \int_{B^n} K(x_1, x_2) K(x_2, x_3) \cdots K(x_n, x_1)\, dx_1 \cdots dx_n \qquad [2]$$


We refer the reader to Soshnikov (2000, p. 927) for
the discussion. We are now ready to define a
determinantal (fermion) random point field on $\mathbb{R}^d$.


Definition 3 A random point field on $E$ is said to
be determinantal (or fermion) if its n-point correlation
functions are of the form


$$\rho_n(x_1, \ldots, x_n) = \det\left(K(x_i, x_j)\right)_{1 \leq i,j \leq n} \qquad [3]$$


Remark 1 If the kernel is Hermitian-symmetric,
then the non-negativity of n-point correlation
functions implies that the kernel $K(x,y)$ is non-negative
definite and, therefore, $K$ must be a
non-negative operator. It should be noted, however,
that there exist determinantal random point
fields corresponding to non-Hermitian kernels (see,
e.g., [18] later). The kernel $K(x, y)$ is usually called
a correlation kernel of the determinantal random
point process.


In the Hermitian case, the necessary and sufficient 
conditions on the operator $K$ to define a determinantal
random point field were established by
Soshnikov (2000); see also Macchi (1975). 


Theorem 1 A Hermitian locally trace-class operator
$K$ on $L^2(E)$ determines a determinantal random
point field if and only if $0 \leq K \leq 1$ (in other words,
both $K$ and $1 - K$ are non-negative operators). If
the corresponding random point field exists, it is
unique.


The main technical part of the proof is the 
following proposition. 


Proposition 1 Let $(X, \mathcal{B}, P)$ be a determinantal
random point field with the Hermitian-symmetric
correlation kernel $K$. Let $f$ be a non-negative
continuous function with compact support. Then


$$\mathbb{E}\, e^{-\langle \xi, f \rangle} = \det\left(\mathrm{Id} - (1 - e^{-f})^{1/2} K (1 - e^{-f})^{1/2}\right) \qquad [4]$$


where $\langle \xi, f \rangle$ is the value of the linear statistics
defined by the test function $f$ on the configuration
$\xi = (x_i)$; in other words, $\langle \xi, f \rangle = \sum_i f(x_i)$.


Remark 2 The right-hand side (RHS) of [4] is well
defined as the Fredholm determinant of a trace-class
operator. Letting $f = \sum_{i=1}^k s_i \chi_{I_i}$, one obtains
$\mathbb{E}\, e^{-\langle \xi, f \rangle} = \mathbb{E} \prod_{i=1}^k z_i^{\#_{I_i}}$, with $z_i = e^{-s_i}$. In this case, the
left-hand side (LHS) of [4] becomes the generating
function of the joint distribution of the counting
random variables $\#_{I_i}$, $i = 1, \ldots, k$.
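
The identity [4] can be tested by brute force in a finite discrete analog. The sketch below assumes a four-site space with a rank-2 orthogonal projection as correlation kernel, so that every configuration has exactly two particles and $\mathrm{Pr}(\xi = S) = \det(K(x,y))_{x,y \in S}$; this finite setting, the vectors in `phi`, and the test function `f` are illustrative assumptions rather than part of the text.

```python
import math
from itertools import combinations


def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


N = 4
# Two orthonormal vectors on the sites {0, 1, 2, 3} (hypothetical choice).
phi = [[0.5, 0.5, 0.5, 0.5], [0.5, -0.5, 0.5, -0.5]]
K = [[sum(v[x] * v[y] for v in phi) for y in range(N)] for x in range(N)]

f = [0.3, 1.0, 0.0, 2.5]   # hypothetical non-negative test function

# Left-hand side: average of exp(-sum of f over the particles).
lhs = 0.0
for S in combinations(range(N), 2):
    prob = det([[K[x][y] for y in S] for x in S])
    lhs += prob * math.exp(-sum(f[x] for x in S))

# Right-hand side: det(Id - (1 - e^{-f})^{1/2} K (1 - e^{-f})^{1/2}).
w = [math.sqrt(1.0 - math.exp(-f[x])) for x in range(N)]
M = [[(1.0 if x == y else 0.0) - w[x] * K[x][y] * w[y] for y in range(N)]
     for x in range(N)]
rhs = det(M)

print(lhs, rhs)
```

In the finite case the Fredholm determinant is an ordinary determinant, and the two sides agree up to rounding.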


Unfortunately, there are very few known results 
in the non-Hermitian case. In particular, the 
necessary and sufficient condition on K for the 
existence of the determinantal random point field 
with the non-Hermitian correlation kernel is not 
known. 

We end this section with the introduction of the 
Janossy densities (a.k.a. density distributions, exclu- 
sion probability densities, etc.) of a random point 
field. 

The term Janossy densities in the theory of 
random point processes was introduced by Sriniva- 
san in 1969, who referred to the 1950 paper by 
Janossy on particle showers. Let us assume that all 
point correlation functions exist and are locally 
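integrable, and let $I$ be a bounded Borel subset of
$\mathbb{R}^d$. Intuitively, one can think of the Janossy density
$J_{k,I}(x_1, \ldots, x_k)$, $x_1, \ldots, x_k \in I$, as


$$\begin{aligned} J_{k,I}(x_1, \ldots, x_k) \prod_{i=1}^k dx_i = \mathrm{Pr}\{&\text{there are exactly } k \text{ particles in } I \text{ and} \\ &\text{there is a particle in each of} \\ &\text{the } k \text{ infinitesimal boxes } (x_i, x_i + dx_i),\ i = 1, \ldots, k\} \qquad [5] \end{aligned}$$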


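To give a formal definition, we express point
correlation functions in terms of Janossy densities
and vice versa:


$$\rho_k(x_1, \ldots, x_k) = \sum_{j=0}^{\infty} \frac{1}{j!} \int_{I^j} J_{k+j,I}(x_1, \ldots, x_k, x_{k+1}, \ldots, x_{k+j})\, dx_{k+1} \cdots dx_{k+j} \qquad [6]$$


$$J_{k,I}(x_1, \ldots, x_k) = \sum_{j=0}^{\infty} \frac{(-1)^j}{j!} \int_{I^j} \rho_{k+j}(x_1, \ldots, x_k, x_{k+1}, \ldots, x_{k+j})\, dx_{k+1} \cdots dx_{k+j} \qquad [7]$$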


A very useful property of the Janossy densities is 
that 


$$\mathrm{Pr}\{\text{there are exactly } k \text{ particles in } I\} = \frac{1}{k!} \int_{I^k} J_{k,I}(x_1, \ldots, x_k)\, dx_1 \cdots dx_k \qquad [8]$$


In the case of determinantal random point fields, 
Janossy densities also have a determinantal form, 
namely 


$$J_{k,I}(x_1, \ldots, x_k) = \det(\mathrm{Id} - K_I) \cdot \det\left(L_I(x_i, x_j)\right)_{1 \leq i,j \leq k} \qquad [9]$$


In the last equation, $K_I$ is the restriction of the operator
$K$ to $L^2(I)$. In other words, $K_I(x,y) =
\chi_I(x) K(x,y) \chi_I(y)$, where $\chi_I$ is the indicator of $I$. The
operator $L_I$ is expressed in terms of $K_I$ as $L_I = (\mathrm{Id} -
K_I)^{-1} K_I$. For further results on the Janossy densities of
determinantal random point processes we refer the
reader to Soshnikov (2004) and references therein.
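
A two-site discrete example makes formula [9] easy to verify by hand. The sketch below assumes $E = I = \{0, 1\}$ with counting measure and a hypothetical Hermitian kernel satisfying $0 < K < 1$; in this setting the Janossy "density" of a set $S$ is simply the probability that exactly the sites in $S$ are occupied. Exact rational arithmetic is used so that the checks are equalities.

```python
from fractions import Fraction as Fr
from itertools import combinations


def det(m):
    """Determinant by Laplace expansion; the empty matrix has determinant 1."""
    if not m:
        return Fr(1)
    if len(m) == 1:
        return m[0][0]
    total = Fr(0)
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def restrict(m, sites):
    """Submatrix of m with rows and columns indexed by sites."""
    return [[m[x][y] for y in sites] for x in sites]


# Hypothetical correlation kernel on two sites.
K = [[Fr(1, 2), Fr(1, 4)], [Fr(1, 4), Fr(1, 2)]]
A = [[(1 if x == y else 0) - K[x][y] for y in range(2)] for x in range(2)]
det_A = det(A)                                  # det(Id - K_I)
A_inv = [[A[1][1] / det_A, -A[0][1] / det_A],
         [-A[1][0] / det_A, A[0][0] / det_A]]
L = [[sum(A_inv[x][z] * K[z][y] for z in range(2)) for y in range(2)]
     for x in range(2)]                         # L_I = (Id - K_I)^{-1} K_I

janossy = {S: det_A * det(restrict(L, S))
           for k in range(3) for S in combinations(range(2), k)}

for S, value in janossy.items():
    print(S, value)
```

The four values sum to one, and summing over the sets containing a given site recovers the one-point function $K(x,x)$, in line with [6].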


Examples of Determinantal Random 
Point Fields 


Fermion Gas 


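Let $H = -d^2/dx^2 + V(x)$ be a Schrödinger operator
with discrete spectrum on $L^2(E)$. We denote by
$\{\varphi_\ell\}_{\ell=0}^{\infty}$ an orthonormal basis of the eigenfunctions,
$H\varphi_\ell = \lambda_\ell \cdot \varphi_\ell$, $\lambda_0 < \lambda_1 \leq \lambda_2 \leq \cdots$. To define a Fermi
gas, we consider the nth exterior power of $H$,
$\wedge^n(H): \wedge^n(L^2(E)) \to \wedge^n(L^2(E))$, where $\wedge^n(L^2(E))$ is
the space of square-integrable antisymmetric functions
of $n$ variables and $\wedge^n(H) = \sum_{i=1}^n \left(-d^2/dx_i^2 + V(x_i)\right)$.
The eigenstates of the Fermi gas are given by
the normalized Slater determinants


$$\psi_{k_1, \ldots, k_n}(x_1, \ldots, x_n) = \frac{1}{\sqrt{n!}} \sum_{\sigma \in S_n} (-1)^{\sigma} \prod_{i=1}^n \varphi_{k_i}(x_{\sigma(i)}) = \frac{1}{\sqrt{n!}} \det\left(\varphi_{k_i}(x_j)\right)_{1 \leq i,j \leq n} \qquad [10]$$


where $0 \leq k_1 < k_2 < \cdots < k_n$. The probability distribution
of $n$ particles in the Fermi gas is given by the
squared absolute value of the eigenstate:


$$p_n(x_1, \ldots, x_n) = |\psi(x_1, \ldots, x_n)|^2 = \frac{1}{n!} \det\left(\varphi_{k_i}(x_j)\right) \cdot \det\left(\overline{\varphi_{k_j}(x_i)}\right) = \frac{1}{n!} \det\left(K_n(x_i, x_j)\right)_{1 \leq i,j \leq n} \qquad [11]$$


where $K_n(x,y) = \sum_{i=1}^n \varphi_{k_i}(x) \overline{\varphi_{k_i}(y)}$ is the kernel
of the orthogonal projector onto the subspace
spanned by the $n$ eigenfunctions $\{\varphi_{k_i}\}$ of $H$. The
n-dimensional probability distribution [11]
defines a determinantal random point field with
$n$ particles. The k-point correlation functions are
given by


$$\rho_k^{(n)}(x_1, \ldots, x_k) = \frac{n!}{(n-k)!} \int p_n(x_1, \ldots, x_n)\, dx_{k+1} \cdots dx_n = \det\left(K_n(x_i, x_j)\right)_{1 \leq i,j \leq k} \qquad [12]$$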
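
A discrete caricature shows how [11] and [12] fit together. The sketch below replaces the eigenfunctions by two orthonormal vectors on four sites (a hypothetical stand-in for $\varphi_{k_1}, \varphi_{k_2}$, with integrals replaced by sums), builds the projection kernel, and checks that the two-particle distribution is normalized and that the one-point function obtained by summing out one particle equals $K_n(x,x)$.

```python
from fractions import Fraction as Fr
from itertools import product
from math import factorial

n = 2
sites = range(4)
# Two orthonormal real vectors on four sites (hypothetical "eigenfunctions").
phi = [[Fr(1, 2)] * 4,
       [Fr(1, 2), Fr(-1, 2), Fr(1, 2), Fr(-1, 2)]]


def kernel(x, y):
    """Projection kernel K_n(x, y) = sum_i phi_i(x) * phi_i(y)."""
    return sum(v[x] * v[y] for v in phi)


def p(x1, x2):
    """Discrete analog of [11]: (1/n!) det(K_n(x_i, x_j)) for n = 2."""
    det2 = kernel(x1, x1) * kernel(x2, x2) - kernel(x1, x2) * kernel(x2, x1)
    return Fr(1, factorial(n)) * det2


# Normalization: sum over ordered pairs of sites.
total = sum(p(x1, x2) for x1, x2 in product(sites, repeat=2))

# Discrete analog of [12] with k = 1: n!/(n-1)! times the sum over the other particle.
rho1 = [n * sum(p(x, y) for y in sites) for x in sites]

print(total, rho1)
```

The determinant vanishes whenever two arguments coincide, which is the discrete form of the Pauli exclusion principle mentioned in the Introduction.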


Random Matrix Models 


Some of the most important ensembles of random 
matrices fall into the class of determinantal random 
point processes. 

The archetypal ensemble of Hermitian random
matrices is the so-called Gaussian unitary ensemble
(GUE). Let us consider the space of $n \times n$ Hermitian
matrices $\{A = (A_{ij})_{1 \leq i,j \leq n},\ \mathrm{Re}(A_{ij}) = \mathrm{Re}(A_{ji}),\ \mathrm{Im}(A_{ij}) =
-\mathrm{Im}(A_{ji})\}$. A GUE random matrix is defined by its
probability distribution


$$P(dA) = \mathrm{const}_n \cdot \exp(-\mathrm{Tr}\, A^2)\, dA \qquad [13]$$


where $dA$ is a Lebesgue measure, that is,
$dA = \prod_{i<j} d\mathrm{Re}(A_{ij})\, d\mathrm{Im}(A_{ij}) \prod_{k=1}^n dA_{kk}$. The eigenvalues
of a random Hermitian matrix are real random
variables, whose joint probability distribution is a
determinantal random point process of particles
on the real line. The correlation kernel has the
Christoffel-Darboux form built from the Hermite
polynomials.

The GUE ensemble of random matrices is invariant
under the unitary transformation $A \to UAU^{-1}$,
$U \in U(n)$. An important generalization of [13] that
preserves the unitary invariance is


$$P(dA) = \mathrm{const}_n \exp(-\mathrm{Tr}\, V(A))\, dA \qquad [14]$$


where, for example, $V(x)$ is a polynomial of even
degree with positive leading coefficient. The correlation
functions of the eigenvalues in [14] are again
determinantal, and the Hermite polynomials in the
correlation kernel have to be replaced by the
orthonormal polynomials with respect to the weight
$\exp(-V(x))$. For details, we refer the reader to the
monographs by Mehta (2004) and Deift (2000). 

There are many other ensembles of random 
matrices for which the joint distribution of the 
eigenvalues has determinantal point correlation 
functions: classical compact groups with respect to 
the Haar measure, complex non-Hermitian Gaus- 
sian random matrices, positive Hermitian random 
matrices of the Wishart type, and chains of 
correlated Hermitian matrices. We refer the reader 
to Soshnikov (2000) for more information. 


Discrete Translation-Invariant Determinantal 
Random Point Fields 


Let $g: \mathbb{T}^d \to [0,1]$ be a Lebesgue-measurable function
on the d-dimensional torus $\mathbb{T}^d$. Assume that
$0 \leq g \leq 1$. A configuration $\xi$ in $\mathbb{Z}^d$ can be thought of
as a 0-1 function on $\mathbb{Z}^d$, that is, $\xi(x) = 1$ if $x \in \xi$ and
$\xi(x) = 0$ otherwise. We define a $\mathbb{Z}^d$-invariant probability
measure Pr on the Borel sets of $X = \{0,1\}^{\mathbb{Z}^d}$ in
such a way that


$$\rho_k(x_1, \ldots, x_k) := \det\left(\hat{g}(x_i - x_j)\right)_{1 \leq i,j \leq k} \qquad [15]$$


for $x_1, \ldots, x_k \in \mathbb{Z}^d$. In the above formula, $\{\hat{g}(n)\}$
are the Fourier coefficients of $g$, that is,
$g(x) = \sum_n \hat{g}(n) e^{2\pi i (n \cdot x)}$. It is clear from Definition 3 that
[15] defines a determinantal random point field on
$\mathbb{Z}^d$ with the translation-invariant kernel $K(x, y) =
\hat{g}(x - y)$. Below we discuss several examples that fall
into this category. For further discussion we refer the
reader to Lyons (2003) and Soshnikov (2000).


1. In the trivial case when g is identically a constant 
p € [0,1], we obtain the i.i.d. Bernoulli prob- 
ability measure. 


2. The edges of the uniform spanning tree in $\mathbb{Z}^2$
parallel to the horizontal axis can be viewed as
the determinantal random point field in $\mathbb{Z}^2$ with


$$g(x, y) = \frac{\sin^2 \pi x}{\sin^2 \pi x + \sin^2 \pi y}.$$


Similarly, the edges of the uniform spanning
forest in $\mathbb{Z}^d$ parallel to the x-axis correspond to
the function


$$g(x_1, \ldots, x_d) = \frac{\sin^2 \pi x_1}{\sum_{i=1}^d \sin^2 \pi x_i}$$


(the uniform spanning forest on $\mathbb{Z}^d$ is a tree only
for $d \leq 4$). The result is due to Burton and
Pemantle (1993).

3. Let $d = 1$ and $\varphi$ be a parameter between 0 and 1.
Consider


$$g(x) = \frac{(1 - \varphi)^2}{|e^{2\pi i x} - \varphi|^2}.$$


The corresponding probability measure is a
renewal process and


$$\hat{g}(n) = \frac{1 - \varphi}{1 + \varphi}\, \varphi^{|n|}$$


(see Soshnikov (2000)).

4. The process with $g(x) = \chi_I(x)$, where $I$ is an
arbitrary arc of a unit circle, appeared in
the work of Borodin and Olshanski (2000). The
corresponding correlation kernel is known as the
discrete sine kernel. The determinantal random
point process on $\mathbb{Z}^1$ with the discrete sine kernel
describes the typical form of large Young
diagrams “in the bulk” (see the next subsection).

5. The discrete sine correlation kernel with $g = \chi_{[0,1/2]}$
appeared in the zig-zag process (Johansson 2002)
derived from the uniform domino tilings in the
plane.
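
The renewal example in item 3 lends itself to a quick numerical check. The sketch below assumes the convention $\hat{g}(n) = \int_0^1 g(x) e^{-2\pi i n x}\, dx$ with $g(x) = (1-\varphi)^2 / |e^{2\pi i x} - \varphi|^2$, approximates the integral by an equispaced Riemann sum, and compares it with the geometric closed form $\frac{1-\varphi}{1+\varphi}\varphi^{|n|}$. The value of `phi`, the sample count, and the function names are arbitrary illustrative choices.

```python
import cmath
import math


def g(x, phi):
    """Symbol of the renewal example, assuming period 1 in x."""
    return (1 - phi) ** 2 / abs(cmath.exp(2j * math.pi * x) - phi) ** 2


def fourier_coefficient(n, phi, samples=2048):
    """Approximate g-hat(n) by an equispaced Riemann sum over [0, 1)."""
    total = 0j
    for k in range(samples):
        x = k / samples
        total += g(x, phi) * cmath.exp(-2j * math.pi * n * x)
    return total / samples


def closed_form(n, phi):
    """Candidate closed form (1 - phi) / (1 + phi) * phi^|n|."""
    return (1 - phi) / (1 + phi) * phi ** abs(n)


phi = 0.5
errors = [abs(fourier_coefficient(n, phi) - closed_form(n, phi))
          for n in range(-3, 4)]

# Two-point function from [15] for neighboring sites: a 2x2 determinant.
rho2_neighbors = closed_form(0, phi) ** 2 - closed_form(1, phi) * closed_form(-1, phi)

print(max(errors), rho2_neighbors)
```

For a smooth periodic integrand the equispaced sum converges very quickly, so the discrepancy is at the level of rounding error. The two-point value is smaller than the square of the density, reflecting the repulsion between particles.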


Determinantal Measures on Partitions 


By a partition of $n = 1, 2, \ldots$ we understand a
collection of non-negative integers $\lambda = (\lambda_1, \ldots, \lambda_m)$
such that $\lambda_1 + \cdots + \lambda_m = n$ and $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_m$.
We shall use the notation Par(n) for the set of all
partitions of $n$.
The Plancherel measure $M_n$ on the set Par(n) is
defined as


$$M_n(\lambda) = \frac{(\dim \lambda)^2}{n!} \qquad [16]$$


where $\dim \lambda$ is the dimension of the corresponding
irreducible representation of the symmetric group
$S_n$. Let $\mathrm{Par} = \bigsqcup_{n=0}^{\infty} \mathrm{Par}(n)$. Consider a probability
measure $M^\theta$ on Par


$$M^\theta(\lambda) = e^{-\theta}\, \frac{\theta^n}{n!}\, M_n(\lambda), \quad \text{where } \lambda \in \mathrm{Par}(n),\ n = 0, 1, 2, \ldots,\ 0 \leq \theta < \infty \qquad [17]$$


$M^\theta$ is called the Poissonization of the measures $M_n$.
The analysis of the asymptotic properties of $M_n$ and
$M^\theta$ has been important in connection to the famous
Ulam problem and related questions in representation
theory.
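
For small $n$ the Plancherel measure can be tabulated directly. The sketch below computes $\dim \lambda$ with the hook length formula (a standard fact that is not stated in this article) and checks that $M_n$ sums to one over Par(n). Partitions are represented as tuples of positive parts; the function names are illustrative.

```python
from fractions import Fraction
from math import factorial


def partitions(n, largest=None):
    """Yield all partitions of n as non-increasing tuples of positive parts."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def dim(lam):
    """Dimension of the irreducible representation of S_n, via hook lengths."""
    n = sum(lam)
    # Column lengths of the Young diagram (the conjugate partition).
    conj = [sum(1 for part in lam if part > j) for j in range(lam[0])] if lam else []
    hooks = 1
    for i, part in enumerate(lam):
        for j in range(part):
            arm = part - j - 1        # boxes to the right in the same row
            leg = conj[j] - i - 1     # boxes below in the same column
            hooks *= arm + leg + 1
    return factorial(n) // hooks


def plancherel(lam):
    """Plancherel weight (dim lambda)^2 / n!, as in [16]."""
    return Fraction(dim(lam) ** 2, factorial(sum(lam)))


totals = {n: sum(plancherel(lam) for lam in partitions(n)) for n in range(1, 7)}
print(totals)
print({lam: plancherel(lam) for lam in partitions(4)})
```

Normalization of $M_n$ is the identity $\sum_\lambda (\dim \lambda)^2 = n!$, and the Poissonized measure [17] then sums to one because the Poisson weights do.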

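It was shown by Borodin and Okounkov (2000),
and, independently, Johansson (2001) that $M^\theta$ is a
determinantal random point field. The corresponding
correlation kernel $K$ (in the so-called modified
Frobenius coordinates) is a so-called discrete Bessel
kernel on $\mathbb{Z}^1$,


$$K(x, y) = \begin{cases} \sqrt{\theta}\, \dfrac{J_{|x| - \frac{1}{2}}(2\sqrt{\theta})\, J_{|y| + \frac{1}{2}}(2\sqrt{\theta}) - J_{|x| + \frac{1}{2}}(2\sqrt{\theta})\, J_{|y| - \frac{1}{2}}(2\sqrt{\theta})}{|x| - |y|} & \text{if } xy > 0 \\[2ex] \sqrt{\theta}\, \dfrac{J_{|x| - \frac{1}{2}}(2\sqrt{\theta})\, J_{|y| - \frac{1}{2}}(2\sqrt{\theta}) + J_{|x| + \frac{1}{2}}(2\sqrt{\theta})\, J_{|y| + \frac{1}{2}}(2\sqrt{\theta})}{x - y} & \text{if } xy < 0 \end{cases} \qquad [18]$$


where $J_x(\cdot)$ is the Bessel function of order $x$. One
can observe that the kernel $K(x,y)$ is not Hermitian,
but the restriction of this kernel to the positive and
negative semiaxis is Hermitian.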

M" is a special case of an infinite parameter family 
of probability measures on Par, called the Schur 
measures, and defined as 


MA) = zs Gs) 19) 


where s; are the Schur functions, x =(x1,x2,...) 
and y —(y1, y2,...) are parameters such that 


Z= 》 sso-[[a-x»' (0 


A€Par ij 


is finite and {x;} ,={yij}72,. It was shown by 
Okounkov (2001), that the Schur measures belong 
to the class of the determinantal random point fields. 


Nonintersecting Paths of a Markov Process 


Let $p_{t,s}(x,y)$ be the transition probability of a
Markov process $\xi(t)$ on $\mathbb{R}$ with continuous trajectories
and let $(\xi_1(t), \xi_2(t), \dots, \xi_n(t))$ be n independent
copies of the process. A classical result of Karlin and
McGregor (1959) states that if n particles start at
the positions $x_1^{(0)} < x_2^{(0)} < \cdots < x_n^{(0)}$, then the
probability density of their joint distribution at
time $t_1 > 0$, given that their paths have not intersected
for all $0 \le t \le t_1$, is equal to

\[ \pi_{t_1}(x_1^{(1)}, \dots, x_n^{(1)}) = \frac{1}{Z} \det\bigl(p_{0,t_1}(x_i^{(0)}, x_j^{(1)})\bigr)_{i,j=1}^{n} \]

provided the process $(\xi_1(t), \xi_2(t), \dots, \xi_n(t))$ in $\mathbb{R}^n$ has
a strong Markovian property.

Let $0 < t_1 < t_2 < \cdots < t_{m+1}$. The conditional
probability density that the particles are in the
positions $x_1^{(1)} < x_2^{(1)} < \cdots < x_n^{(1)}$ at time $t_1$, at
the positions $x_1^{(2)} < x_2^{(2)} < \cdots < x_n^{(2)}$ at time $t_2$, ...,
at the positions $x_1^{(m)} < x_2^{(m)} < \cdots < x_n^{(m)}$ at time $t_m$,
given that at time $t_{m+1}$ they are at the positions
$x_1^{(m+1)} < x_2^{(m+1)} < \cdots < x_n^{(m+1)}$ and their paths have
not intersected, is then equal to

\[ \pi_{t_1,\dots,t_m}(x^{(1)}, \dots, x^{(m)}) = \frac{1}{Z_{n,m}} \prod_{l=0}^{m} \det\bigl(p_{t_l,t_{l+1}}(x_i^{(l)}, x_j^{(l+1)})\bigr)_{i,j=1}^{n} \qquad [21] \]

where $t_0 = 0$.

It is not difficult to show that [21] can be viewed
as a determinantal random point process (see, e.g.,
Johansson (2003)).

Formulas of a similar type also appeared in
the papers by Johansson, Prähofer, Spohn, Ferrari,
Forrester, Nagao, Katori, and Tanemura in the
analysis of polynuclear growth models, random
walks on a discrete circle, and related problems.


Ergodic Properties 


As before, let (X, B, Pr) be a random point field
with a one-particle space E. Hence, X is a space of
the locally finite configurations of particles in E, B a
Borel σ-algebra of measurable subsets of X, and Pr a
probability measure on (X, B). Throughout this
section, we assume $E = \mathbb{R}^d$ or $\mathbb{Z}^d$. We define an
action $\{T^t\}_{t \in E}$ of the additive group E on X in the
following natural way:

\[ T^t : X \to X, \quad (T^t \xi)_i = (\xi)_i + t \qquad [22] \]


We recall that a random point field (X, B, P) is
called translation invariant if, for any $A \in B$ and any
$t \in E$, $\Pr(T^{-t}A) = \Pr(A)$. The translation invariance
of the correlation kernel, $K(x,y) = K(x - y, 0) =: K(x - y)$,
implies the translation invariance of the
k-point correlation functions

\[ \rho_k(x_1 + t, \dots, x_k + t) = \rho_k(x_1, \dots, x_k) \quad \text{a.e.}, \quad k = 1, 2, \dots;\ t \in E \qquad [23] \]

which, in turn, implies the translation invariance
of the random point field. The ergodic properties
of such point fields were studied by several
mathematicians (Soshnikov 2000, Shirai and
Takahashi 2003, Lyons and Steif 2003). The
first general result in this direction was obtained
by Soshnikov (2000).


Theorem 2 Let (X, B, P) be a determinantal random
point field with a translation-invariant correlation
kernel. Then the dynamical system $(X, B, P, \{T^t\})$
is ergodic, has the mixing property of any multiplicity,
and its spectrum is absolutely continuous.


We refer the reader to the article on ergodic
theory for the definitions of ergodicity, mixing
property, absolutely continuous spectrum of the
dynamical system, etc.

In the discrete case [15], $E = \mathbb{Z}^d$, more is known.
Lyons and Steif (2003) proved that the shift
dynamical system is Bernoulli, that is, it is isomorphic
(in the ergodic theory sense) to an i.i.d.
process. Under the additional conditions $\mathrm{Spec}(K) \subset (0, 1)$
and $\sum_n |n|\,|K(n)|^2 < \infty$, Shirai and Takahashi
(2003a) proved the uniform mixing property.


Gibbsian Properties 


Costin and Lebowitz (1995) were the first to
question the Gibbsian nature of the determinantal
random point fields; they studied the continuous
determinantal random point process on $\mathbb{R}^1$ with a
so-called sine correlation kernel

\[ K(x,y) = \frac{\sin(\pi(x - y))}{\pi(x - y)} \]

The first rigorous result (in the discrete case) was
established by Shirai and Takahashi (2003b).


Theorem 3 Let E be a countable discrete space
and K a symmetric bounded operator on $l^2(E)$.
Assume that $\mathrm{Spec}(K) \subset (0,1)$. Then (X, B, P) is a
Gibbs measure with the potential U given by
$U(x|\xi) = -\log\bigl(J(x,x) - \langle J_\xi^{-1} j_\xi^x, j_\xi^x \rangle\bigr)$, where $x \in E$, $\xi \in X$,
$\{x\} \cap \xi = \emptyset$. Here J(x,y) stands for the kernel of the
operator $J = (\mathrm{Id} - K)^{-1}K$, and we set $J_\xi = (J(y,z))_{y,z \in \xi}$
and $j_\xi^x = (J(x,y))_{y \in \xi}$.


We recall that the Gibbsian property of the
probability measure P on (X, B) means that

\[ E[F \mid B_{\Lambda^c}](\xi) = \frac{1}{Z_{\Lambda,\xi}} \sum_{\eta \subset \Lambda} e^{-U(\eta|\xi_{\Lambda^c})}\, F(\eta \cup \xi_{\Lambda^c}) \]

where $\Lambda$ is a finite subset of E, $B_{\Lambda^c}$ is the σ-algebra
generated by the B-measurable functions with the
support outside of $\Lambda$, and $E[F \mid B_{\Lambda^c}]$ is the conditional
mathematical expectation of the integrable function
F on (X, B, P) with respect to the σ-algebra $B_{\Lambda^c}$. The
potential U is uniquely defined by the values of
$U(x|\xi)$, as follows from the recursive
relation:

\[ U(\{x_1, \dots, x_n\}|\xi) = U(x_n|\{x_1, \dots, x_{n-1}\} \cup \xi) + U(x_{n-1}|\{x_1, \dots, x_{n-2}\} \cup \xi) + \cdots + U(x_1|\xi) \]


For additional information about the Gibbsian
property, see Introductory Articles: Equilibrium
Statistical Mechanics. Much less is known in the
continuous case. Some generalized form of Gibbsianness,
under quite restrictive conditions, was recently
established by Georgii and Yoo (2004).


Central Limit Theorem for Counting 
Function 


In this section, we discuss the central-limit theorem
type results for the linear statistics. The first
important result in this direction was established
by Costin and Lebowitz in 1995, who proved the
central-limit theorem for the number of particles in
the growing box $[-L, L]$, $L \to \infty$, in the case of the
determinantal random point process on $\mathbb{R}^1$ with the
sine correlation kernel

\[ K(x,y) = \frac{\sin(\pi(x - y))}{\pi(x - y)} \]

Below we formulate the Costin-Lebowitz theorem
in its general form due to Soshnikov (1999, 2000).


Theorem 4 Let E be a one-particle space, $\{0 \le K_t \le 1\}$
a family of locally trace-class operators in
$L^2(E)$, $\{(X, B, P_t)\}$ a family of the corresponding
determinantal random point fields in E, and $\{I_t\}$ a
family of measurable subsets in E such that

\[ \mathrm{Var}\,\#_{I_t} = \mathrm{tr}\bigl(K_t \cdot \chi_{I_t} - (K_t \cdot \chi_{I_t})^2\bigr) \to \infty \quad \text{as } t \to \infty \qquad [24] \]

Then the distribution of the normalized number of
particles in $I_t$ (with respect to $P_t$) converges to the
normal law, that is,

\[ \frac{\#_{I_t} - E\,\#_{I_t}}{\sqrt{\mathrm{Var}\,\#_{I_t}}} \xrightarrow{\ w\ } N(0,1) \]

An analogous result holds for the joint distribution
of the counting functions $\{\#_{I_t^1}, \dots, \#_{I_t^k}\}$, where
$I_t^1, \dots, I_t^k$ are disjoint measurable subsets in E.


The proof of the Costin-Lebowitz theorem uses
the k-point cluster functions. In the determinantal
case, the cluster functions have a simple form

\[ r_k(x_1, \dots, x_k) = (-1)^{k-1}\, \frac{1}{k} \sum_{\sigma \in S_k} K(x_{\sigma(1)}, x_{\sigma(2)})\, K(x_{\sigma(2)}, x_{\sigma(3)}) \cdots K(x_{\sigma(k)}, x_{\sigma(1)}) \qquad [25] \]

The importance of the cluster functions stems from
the fact that the integrals of the k-point cluster
function over the k-cube with a side I can be expressed
as a linear combination of the first k cumulants of the
counting random variable $\#_I$. In other words,

\[ \int_{I \times \cdots \times I} r_k(x_1, \dots, x_k)\, dx_1 \cdots dx_k = \sum_{l=1}^{k} \beta_{kl}\, C_l(I) \qquad [26] \]

where $C_l(I)$ is the l-th cumulant of $\#_I$ and the $\beta_{kl}$ are
the coefficients of the linear combination.
It follows from [25] that the integral at the LHS of
[26] equals, up to a factor $(-1)^{k-1}(k-1)!$, the trace
of the kth power of the restriction of K to I. This
allows one to estimate the cumulants of the counting
random variable $\#_I$. For details, we refer the reader
to Soshnikov (2000). The central-limit theorem for a
general class of linear statistics, under some technical
assumptions on the correlation kernel, was
proved in Soshnikov (2002). Finally, we refer the
reader to Soshnikov (2000) for the functional
central-limit theorem for the empirical distribution
function of the nearest spacings.
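The relation between the summed cluster function and the trace of the kth power of the kernel can be illustrated in a discrete setting, where the integral over the k-cube becomes a finite sum. The following is a minimal sketch under that assumption; the matrix `K_EXAMPLE` and the function names are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial

# Hypothetical symmetric kernel restricted to a three-point set I (exact fractions).
K_EXAMPLE = [
    [Fraction(1, 2), Fraction(1, 4), Fraction(0)],
    [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)],
    [Fraction(0), Fraction(1, 4), Fraction(1, 2)],
]


def cluster_function(K, xs):
    """k-point cluster function of a determinantal field with kernel matrix K."""
    k = len(xs)
    total = Fraction(0)
    for sigma in permutations(range(k)):
        term = Fraction(1)
        for i in range(k):
            # cyclic product K(x_s(1), x_s(2)) ... K(x_s(k), x_s(1))
            term *= K[xs[sigma[i]]][xs[sigma[(i + 1) % k]]]
        total += term
    return (-1) ** (k - 1) * total / k


def summed_cluster(K, k):
    """Discrete analogue of the integral of the cluster function over I x ... x I."""
    n = len(K)
    return sum(cluster_function(K, xs) for xs in product(range(n), repeat=k))


def trace_power(K, k):
    """Trace of the kth power of the matrix K."""
    n = len(K)
    P = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for _ in range(k):
        P = [[sum(P[i][m] * K[m][j] for m in range(n)) for j in range(n)]
             for i in range(n)]
    return sum(P[i][i] for i in range(n))


if __name__ == "__main__":
    for k in (1, 2, 3):
        lhs = summed_cluster(K_EXAMPLE, k)
        rhs = (-1) ** (k - 1) * factorial(k - 1) * trace_power(K_EXAMPLE, k)
        print(k, lhs, rhs)
```

The brute-force sum over permutations is exponential in k and is meant only as a check for very small k.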


Generalizations: Immanantal and Pfaffian 
Point Processes 


In this section, we discuss two important general- 
izations of the determinantal point processes. 


Immanantal Processes 


Immanantal random point processes were introduced
by P Diaconis and S N Evans in 2000. Let $\lambda$ be a
partition of n. Denote by $\chi^\lambda$ the character of the
corresponding irreducible representation of the symmetric
group $S_n$. Let K(x, y) be a non-negative-definite,
Hermitian kernel. An immanantal random point
process is defined through the correlation functions

\[ \rho_n(x_1, \dots, x_n) = \sum_{\sigma \in S_n} \chi^\lambda(\sigma) \prod_{i=1}^{n} K(x_i, x_{\sigma(i)}) \qquad [27] \]

In other words, the correlation functions are given by
the immanants of the matrix with the entries
$K(x_i, x_j)$. We will denote the RHS of [27] by
$K^\lambda[x_1, \dots, x_n]$.
In the special case $\lambda = (1^n)$ (i.e., $\lambda$ consists of n
parts, all of which are equal to 1), one obtains that
$\chi^\lambda(\sigma) = (-1)^\sigma$, and $K^\lambda[x_1, \dots, x_n] = \det(K(x_i, x_j))$.
Therefore, in the case $\lambda = (1^n)$ the random point
process with the correlation functions [27] is a
determinantal random point process. When $\lambda = (n)$
(i.e., the partition has only one part, namely n) we
have $\chi^\lambda = 1$ identically, and $K^\lambda[x_1, \dots, x_n] =
\mathrm{per}(K(x_i, x_j))$, the permanent of the matrix $K(x_i, x_j)$.
The corresponding random point process is known as
the boson random point process.
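The two special cases can be made concrete with a small computation. This is an illustrative sketch only: the function names are hypothetical, the character is passed in as a plain Python function of the permutation, and only the sign character and the trivial character are implemented.

```python
from itertools import permutations


def sign(sigma):
    """Sign of a permutation (given as a tuple), computed from its cycle lengths."""
    seen = [False] * len(sigma)
    s = 1
    for i in range(len(sigma)):
        if not seen[i]:
            j, length = i, 0
            while not seen[j]:
                seen[j] = True
                j = sigma[j]
                length += 1
            if length % 2 == 0:  # a cycle of even length is an odd permutation
                s = -s
    return s


def immanant(M, character):
    """Sum over permutations of character(sigma) * prod_i M[i][sigma(i)]."""
    n = len(M)
    total = 0
    for sigma in permutations(range(n)):
        term = character(sigma)
        for i in range(n):
            term *= M[i][sigma[i]]
        total += term
    return total


def determinant(M):
    """Immanant for the sign character, i.e. the case lambda = (1^n)."""
    return immanant(M, sign)


def permanent(M):
    """Immanant for the trivial character, i.e. the case lambda = (n)."""
    return immanant(M, lambda sigma: 1)


if __name__ == "__main__":
    M = [[2, 1], [1, 3]]
    print(determinant(M), permanent(M))
```

For a general partition one would need the full character table of the symmetric group, which is outside the scope of this sketch.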


Pfaffian Processes 


Let

\[ K(x,y) = \begin{pmatrix} K_{11}(x,y) & K_{12}(x,y) \\ K_{21}(x,y) & K_{22}(x,y) \end{pmatrix} \]

be an antisymmetric 2 × 2 matrix-valued kernel, that
is, $K_{ij}(x,y) = -K_{ji}(y,x)$, i, j = 1, 2. The kernel defines
an integral operator acting on $L^2(E) \oplus L^2(E)$, which
we assume to be locally trace class. A random point
process on E is called Pfaffian if its point correlation
functions have a Pfaffian form

\[ \rho_k(x_1, \dots, x_k) = \mathrm{Pf}\bigl(K(x_i, x_j)\bigr)_{i,j=1,\dots,k}, \quad k \ge 1 \qquad [28] \]

The RHS of [28] is the Pfaffian of the 2k × 2k
antisymmetric matrix (since each entry $K(x_i, x_j)$ is a
2 × 2 block). Determinantal random point processes
are a special case of the Pfaffian processes, corresponding
to the matrix kernel of the form

\[ K(x,y) = \begin{pmatrix} 0 & \widehat{K}(x,y) \\ -\widehat{K}(y,x) & 0 \end{pmatrix} \]

where $\widehat{K}$ is a scalar kernel. The most well-known
examples of the Pfaffian random point processes
that cannot be reduced to determinantal form are the
β = 1 and β = 4 polynomial ensembles of random
matrices and their limits (in the bulk and at the edge
of the spectrum), as the size of a matrix goes to
infinity.
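The reduction of the determinantal case to the Pfaffian form can be checked numerically on a small example. The sketch below is illustrative only: the scalar kernel `khat` is an arbitrary hypothetical function (not one of the kernels discussed in the text), and the Pfaffian and determinant are computed by naive recursive expansions suitable only for tiny matrices.

```python
def pfaffian(A):
    """Pfaffian of an antisymmetric matrix, by expansion along the first row."""
    n = len(A)
    if n == 0:
        return 1
    if n % 2 == 1:
        return 0
    total = 0
    for j in range(1, n):
        if A[0][j] == 0:
            continue
        keep = [r for r in range(n) if r not in (0, j)]
        minor = [[A[r][c] for c in keep] for r in keep]
        total += (-1) ** (j - 1) * A[0][j] * pfaffian(minor)
    return total


def determinant(M):
    """Determinant by Laplace expansion along the first row."""
    n = len(M)
    if n == 0:
        return 1
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * determinant(minor)
    return total


def block_matrix(khat, xs):
    """2k x 2k matrix whose (i, j) block is [[0, khat(xi, xj)], [-khat(xj, xi), 0]]."""
    k = len(xs)
    A = [[0] * (2 * k) for _ in range(2 * k)]
    for i, xi in enumerate(xs):
        for j, xj in enumerate(xs):
            A[2 * i][2 * j + 1] = khat(xi, xj)
            A[2 * i + 1][2 * j] = -khat(xj, xi)
    return A


def khat(x, y):
    """Hypothetical non-symmetric scalar kernel with integer values."""
    return x * x + 3 * x * y + y ** 3 + 1


if __name__ == "__main__":
    xs = (0, 1, 2)
    scalar = [[khat(x, y) for y in xs] for x in xs]
    print(pfaffian(block_matrix(khat, xs)), determinant(scalar))
```

A deliberately non-symmetric kernel is used so that the check is sensitive to the placement of the arguments in the lower-left block.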
Introduction


Consider the dynamical system on $\mathbb{R}^d$ described by
the equation

\[ \dot{u} = \frac{du}{dt} = G(u) + \varepsilon F(u) \qquad [1] \]

where $F, G : S \subset \mathbb{R}^d \to \mathbb{R}^d$ are analytic functions
and $\varepsilon$ a real (small) parameter. Suppose also that for
$\varepsilon = 0$ a solution $u_0 : \mathbb{R} \to S$ (for some initial condition
$u_0(0) = \bar{u}$) is known.

We look for a solution of [1] which is a
perturbation of $u_0$, that is, for a solution u which
can be written in the form $u = u_0 + U$, with
$U = O(\varepsilon)$ and $U(0) = \bar{U} = u(0) - \bar{u}$. Then we consider
the variational equation

\[ \dot{U} = M(t)U + \Phi(t), \quad M_{ij}(t) = \partial_{u_j} G_i(u_0(t)) \qquad [2] \]

where $\Phi(t) = \Phi(u_0(t), U(t))$, with $\Phi(u_0, U) = G(u_0 + U)
- G(u_0) - \partial_u G(u_0)U + \varepsilon F(u_0 + U)$. By defining the
Wronskian matrix W as the solution of the
matrix equation $\dot{W} = M(t)W$ such that $W(0) = \mathbb{1}$
(the columns of W are given by d independent
solutions of the linear equation $\dot{u} = M(t)u$), we can
write

\[ U(t) = W(t)\bar{U} + W(t) \int_0^t d\tau\, W^{-1}(\tau)\, \Phi(\tau) \qquad [3] \]


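If we expect the solution U to be of order $\varepsilon$, we can
try to write it as a Taylor series in $\varepsilon$, that is,

\[ U(t) = \sum_{k=1}^{\infty} \varepsilon^k U^{(k)}(t) \qquad [4] \]

and, by inserting [4] into [3] and equating the
coefficients with the same Taylor order, we
obtain

\[ U^{(k)}(t) = W(t)\bar{U}^{(k)} + W(t) \int_0^t d\tau\, W^{-1}(\tau)\, \Phi^{(k)}(\tau) \qquad [5] \]

where $\Phi^{(k)}(t)$ is defined as

\[ \Phi^{(1)}(t) = F(u_0(t)) \]

\[ \Phi^{(k)}(t) = \sum_{p=2}^{\infty} \frac{1}{p!} \sum_{i_1,\dots,i_p=1}^{d} \frac{\partial^p G}{\partial u_{i_1} \cdots \partial u_{i_p}}(u_0(t)) \sum_{k_1+\cdots+k_p=k} U_{i_1}^{(k_1)}(t) \cdots U_{i_p}^{(k_p)}(t) \]
\[ \qquad + \sum_{p=1}^{\infty} \frac{1}{p!} \sum_{i_1,\dots,i_p=1}^{d} \frac{\partial^p F}{\partial u_{i_1} \cdots \partial u_{i_p}}(u_0(t)) \sum_{k_1+\cdots+k_p=k-1} U_{i_1}^{(k_1)}(t) \cdots U_{i_p}^{(k_p)}(t), \quad k \ge 2 \qquad [6] \]

Hence $\Phi^{(k)}(t)$ depends only on coefficients of orders
strictly less than k. In this way, we obtain an
algorithm useful for constructing the solution
recursively, so that the problem is solved, up to
(substantial) convergence problems.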


Historical Excursus 


The study of a system like [1] by following the 
strategy outlined above can be hopeless if we do not 
make some further assumptions on the types of 
motions we are looking for. 

We shall see later, in a concrete example, that the
coefficients $U^{(k)}(t)$ can increase in time, in a
k-dependent way, thus preventing the convergence of
the series for large t. This is a general feature of this
class of problems: if no care is taken in the choice of
the initial datum, the algorithm can provide a
reliable description of the dynamics only for a very
short time.

However, if one looks for solutions having a
special dependence on time, things can work better.
This happens, for instance, if one looks for quasiperiodic
solutions, that is, functions which depend on
time through the variable $\psi = \omega t$, with $\omega \in \mathbb{R}^N$ a
vector with rationally independent components,
that is, such that $\omega \cdot \nu \ne 0$ for all $\nu \in \mathbb{Z}^N \setminus \{0\}$
(the dot denotes the standard inner product,
$\omega \cdot \nu = \omega_1\nu_1 + \cdots + \omega_N\nu_N$). A typical problem of
interest is: what happens to a quasiperiodic solution
$u_0(t)$ when a perturbation $\varepsilon F$ is added to the
unperturbed vector field G, as in [1]? Situations of
this type arise when considering perturbations of
integrable systems: a classical example is provided by
planetary motion in celestial mechanics.

Perturbation series such as [4] have been extensively 
studied by astronomers in order to obtain a more 
accurate description of the celestial motions compared 
to that following from Kepler's theory (in which all 
interactions between planets are neglected and the 
planets themselves are considered as points). In 
particular, we recall the works of Newcomb and 
Lindstedt (series such as [4] are now known as 
Lindstedt series). At the end of the nineteenth century, 
Poincaré showed that the series describing quasiper- 
iodic motions are well defined up to any perturbation 
order k (at least if the perturbation is a trigonometric 
polynomial), provided that the components of $\omega$ are
assumed to be rationally independent: this means that,
under this condition, the coefficients $U^{(k)}(t)$ are
defined for all $k \in \mathbb{N}$. However, Poincaré also showed
that, in general, the series are divergent; this is due to
the fact that, as seen later, in the perturbation series
small divisors $\omega \cdot \nu$ appear, which, even if they do not
vanish, can be arbitrarily close to zero.
The convergence of the series can indeed be proved
(more generally for analytic perturbations, or
even those that are differentiably smooth enough) by
assuming on $\omega$ a stronger nonresonance condition,
such as the Diophantine condition

\[ |\omega \cdot \nu| > \frac{C_0}{|\nu|^\tau} \quad \forall \nu \in \mathbb{Z}^N \setminus \{0\} \qquad [7] \]

where $|\nu| = |\nu_1| + \cdots + |\nu_N|$, and $C_0$ and $\tau$ are
positive constants. We note that the set of vectors
satisfying [7] for some positive constant $C_0$ has full
measure in $\mathbb{R}^N$ provided one takes $\tau > N - 1$.
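To get a feeling for condition [7], one can scan the small divisors numerically for a specific frequency vector. The sketch below is a rough illustration, not a proof: it assumes N = 2, the hypothetical choice $\omega = (1, (1+\sqrt{5})/2)$ (golden mean) and $\tau = 1$, and it only inspects finitely many vectors $\nu$ in floating-point arithmetic.

```python
from itertools import product
from math import sqrt


def diophantine_constant(omega, tau, max_norm):
    """Smallest value of |omega . nu| * |nu|^tau over 0 < |nu| <= max_norm.

    Here |nu| is the sum of the absolute values of the components, as in [7].
    A finite scan only gives numerical evidence for a candidate constant C_0.
    """
    best = float("inf")
    n = len(omega)
    for nu in product(range(-max_norm, max_norm + 1), repeat=n):
        norm = sum(abs(c) for c in nu)
        if norm == 0 or norm > max_norm:
            continue
        divisor = abs(sum(w * c for w, c in zip(omega, nu)))
        best = min(best, divisor * norm ** tau)
    return best


if __name__ == "__main__":
    golden = (1.0, (1.0 + sqrt(5.0)) / 2.0)
    for cutoff in (10, 20, 40):
        print(cutoff, diophantine_constant(golden, 1.0, cutoff))
```

If the printed value stays bounded away from zero as the cutoff grows, this is consistent with [7] holding for that $\tau$; for a resonant vector the value drops to zero as soon as a resonance enters the scanned range.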

Such a result is part of the Kolmogorov-Arnold-Moser
(KAM) theorem, and it was first proved by
Kolmogorov in 1954, following an approach quite
different from the one described here. New proofs
were given in 1962 by Arnol'd and by Moser, but
only very recently, in 1988, Eliasson gave a proof in
which a bound $C^k$ is explicitly derived for the
coefficients $U^{(k)}(t)$, again implying convergence for $\varepsilon$
small enough.

Eliasson's work was not immediately known widely, 
and only after publication of papers by Gallavotti and 
by Chierchia and Falcolini, in which Eliasson's ideas 
were revisited, did his work become fully appreciated. 
The study of perturbation series [4] employs techniques
very similar to those typical of a very different
field of mathematical physics, quantum field
theory, even if such an analogy was stressed and
used to full extent only in subsequent papers.

The techniques have so far been applied to a wide 
class of problems of dynamical systems: a list of 
original results is given at the end. 


A Paradigmatic Example 


Consider the case $S = \mathcal{A} \times \mathbb{T}^N$, with $\mathcal{A}$ an open subset
of $\mathbb{R}^N$, and let $H_0 : \mathcal{A} \to \mathbb{R}$ and $f : \mathcal{A} \times \mathbb{T}^N \to \mathbb{R}$
be two analytic functions. Then consider the Hamiltonian
system with Hamiltonian $H(A, \alpha) = H_0(A) +
\varepsilon f(A, \alpha)$. The corresponding equations describe a
dynamical system of the form [1], with $u = (A, \alpha)$,
which can be written explicitly:

\[ \dot{A} = -\varepsilon\, \partial_\alpha f(A, \alpha), \qquad \dot{\alpha} = \partial_A H_0(A) + \varepsilon\, \partial_A f(A, \alpha) \qquad [8] \]

Suppose, for simplicity, $H_0(A) = A^2/2$ and
$f(A, \alpha) = f(\alpha)$, where $A^2 = A \cdot A$. Then, we obtain
for $\alpha$ the following closed equation:

\[ \ddot{\alpha} = -\varepsilon\, \partial_\alpha f(\alpha) \qquad [9] \]

while A can be obtained by direct integration once
[9] has been solved. For $\varepsilon = 0$, [9] gives trivially
$\alpha = \alpha_0(t) = \alpha_0 + \omega t$, where $\omega = \partial_A H_0(A_0) = A_0$ is
called the rotation (or frequency) vector. Hence, for
$\varepsilon = 0$ all solutions are quasiperiodic. We are interested
in the preservation of quasiperiodic solutions
when $\varepsilon \ne 0$.


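For $\varepsilon \ne 0$, we can write, as in [3],

\[ \alpha = \alpha_0(t) + a(t), \qquad a(t) = \sum_{k=1}^{\infty} \varepsilon^k a^{(k)}(t) \qquad [10] \]

where $a^{(k)}$ is determined as the solution of the
equation

\[ a^{(k)}(t) = \bar{a}^{(k)} + t\bar{A}^{(k)} - \int_0^t d\tau \int_0^\tau d\tau'\, [\partial_\alpha f(\alpha(\tau'))]^{(k-1)} \qquad [11] \]

with $[\partial_\alpha f(\alpha(\tau'))]^{(k-1)}$ expressed as in [6].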

The quasiperiodic solutions with rotation vector
$\omega$ could be written as a Fourier series, by
expanding

\[ a^{(k)}(t) = \sum_{\nu \in \mathbb{Z}^N} e^{i\nu \cdot \omega t}\, a_\nu^{(k)} \qquad [12] \]

with $\omega$ as before. If the series [10], with the Taylor
coefficients as in [12], exists, it will describe a
quasiperiodic solution analytic in $\varepsilon$, and in such a
case we say that it is obtained by continuation of the
unperturbed one with rotation vector $\omega$, that is
$\alpha_0(t)$.

Suppose that the integrand $[\partial_\alpha f(\alpha(\tau'))]^{(k-1)}$ in
[11] has vanishing average. Then the integral over
$\tau'$ in [11] produces a quasiperiodic function, which
in general has a nonvanishing average, so that
the integral over $\tau$ produces a quasiperiodic
function plus a term linear in t. If we choose $\bar{A}^{(k)}$
in [11] so as to cancel out exactly the term linear
in time, we end up with a quasiperiodic function.
In Fourier space, an explicit calculation gives, for
all $\nu \ne 0$,

\[ a_\nu^{(1)} = \frac{1}{(\omega \cdot \nu)^2}\, i\nu f_\nu \]
\[ a_\nu^{(k)} = \frac{1}{(\omega \cdot \nu)^2} \sum_{p=1}^{\infty} \sum_{\substack{k_1+\cdots+k_p=k-1 \\ \nu_0+\nu_1+\cdots+\nu_p=\nu}} \frac{1}{p!}\, (i\nu_0)^{p+1} f_{\nu_0}\, a_{\nu_1}^{(k_1)} \cdots a_{\nu_p}^{(k_p)}, \quad k \ge 2 \qquad [13] \]

which again is suitable for an iterative construction
of the solution. The coefficients $a_0^{(k)}$ are left
undetermined, and we can fix them (arbitrarily) as
identically vanishing.
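As a consistency check, the first-order coefficient can be worked out directly from [9]. The short derivation below is a sketch under two assumptions that are not spelled out in the text: the Fourier expansion $f(\alpha) = \sum_\nu f_\nu e^{i\nu\cdot\alpha}$ and the choice of vanishing initial phase, $\alpha_0 = 0$.

```latex
% Order \varepsilon of \ddot{\alpha} = -\varepsilon\,\partial_\alpha f(\alpha),
% evaluated on the unperturbed motion \alpha_0(t) = \omega t:
\ddot{a}^{(1)}(t) = -\partial_\alpha f(\omega t)
                  = -\sum_{\nu \in \mathbb{Z}^N} i\nu\, f_\nu\, e^{i\nu\cdot\omega t}.
% Inserting a^{(1)}(t) = \sum_\nu e^{i\nu\cdot\omega t} a^{(1)}_\nu and comparing
% Fourier coefficients:
(i\,\omega\cdot\nu)^2\, a^{(1)}_\nu = -\,i\nu\, f_\nu
\quad\Longrightarrow\quad
a^{(1)}_\nu = \frac{i\nu\, f_\nu}{(\omega\cdot\nu)^2}, \qquad \nu \neq 0.
```

The term with $\nu = 0$ drops out of the right-hand side because of the factor $i\nu$, which is the first-order instance of the zero-average property, and each Fourier coefficient picks up one small divisor $(\omega\cdot\nu)^{-2}$.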

Of course, the property that the integrand in
[11] has zero average is fundamental; otherwise,
terms increasing as powers of t would appear (the
so-called secular terms). Indeed, it is easy to
realize that, if this happened, to order k terms
proportional to $t^{2k}$ could be present, thus requiring,
at best, $|\varepsilon| < |t|^{-2}$ for convergence up to time t.
This would exclude a fortiori the possibility of
quasiperiodic solutions.

The aforementioned property of zero average can
be verified only if the rotation vector is nonresonant,
that is, if its components are rationally independent
or, more particularly, if the Diophantine condition
[7] is satisfied. Such a result was first proved by
Poincaré, and it holds irrespective of how the
parameters $\bar{a}^{(k)}$ appearing in [11] are fixed. This
reflects the fact that quasiperiodic motions take
place on invariant surfaces (KAM tori), which can
be parameterized in terms of the angle variables
$\alpha(t)$, so that the values $\bar{a}^{(k)}$ contribute to the initial
phases, and the latter can be arbitrarily fixed.

The recursive equations [13] can be suitably 
studied by introducing a diagrammatic representa- 
tion, as explained below. 


Graphs and Trees 


A (connected) graph G is a collection of points, 
called vertices, and lines connecting all of them. We 
denote with V(G) and L(G) the set of vertices and 
the set of lines, respectively. A path between two 
vertices is a minimal subset of L(G) connecting the 
two vertices. A graph is planar if it can be drawn in 
a plane without graph lines crossing. 

A tree is a planar graph G containing no closed
loops (cycles); in other words, it is a connected
acyclic graph. One can consider a tree G with a
single special vertex $v_0$: this introduces a natural
partial ordering on the set of lines and vertices, and
one can imagine that each line carries an arrow
pointing toward the vertex $v_0$. We can add an extra
oriented line $\ell_0$ connecting the special vertex $v_0$ to
another point which will be called the root of the
tree; the added line will be called the root line. In
this way, we obtain a rooted tree $\theta$ defined by
$V(\theta) = V(G)$ and $L(\theta) = L(G) \cup \ell_0$. A labeled tree is
a rooted tree $\theta$ together with a label function defined
on the sets $V(\theta)$ and $L(\theta)$.

Two rooted trees which can be transformed into 
each other by continuously deforming the lines in 
the plane in such a way that the latter do not cross 
each other (i.e., without destroying the graph 
structure) will be said to be equivalent. This notion 
of equivalence can also be extended to labeled trees, 
simply by considering equivalent two labeled trees if 
they can be transformed into each other in such a 
way that the labels also match. 

Given two vertices $v, w \in V(\theta)$, we say that $w \prec v$
if v is on the path connecting w to the root line. One
can identify a line with the vertices it connects; given
a line $\ell = (v, w)$, one says that $\ell$ enters v and exits w.
For each vertex v, we define the branching number as
the number $p_v$ of lines entering v.

The number of unlabeled trees with k vertices can
be bounded by the number of random walks with 2k
steps, that is, by $4^k$.
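The $4^k$ bound can be checked for small k. The sketch below assumes that the trees being counted are rooted plane trees, i.e. rooted trees considered up to the planar deformations described above; under that assumption their number is a Catalan number, which the code compares with $4^k$. The function name `plane_trees` is hypothetical.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def plane_trees(k):
    """Number of rooted plane (ordered) trees with k vertices.

    Recursion: split off the first subtree of the root (j vertices); what
    remains is again a rooted plane tree with k - j vertices.
    """
    if k == 1:
        return 1
    return sum(plane_trees(j) * plane_trees(k - j) for j in range(1, k))


if __name__ == "__main__":
    for k in range(1, 11):
        catalan = comb(2 * (k - 1), k - 1) // k  # Catalan number C_{k-1}
        print(k, plane_trees(k), catalan, 4 ** k)
```

The correspondence with walks comes from traversing the tree along its boundary: each line is crossed once upward and once downward, giving a walk with fewer than 2k steps.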

The labels are as follows: with each vertex v we
associate a mode label $\nu_v \in \mathbb{Z}^N$, and with each line $\ell$
we associate a momentum $\nu_\ell \in \mathbb{Z}^N$, such that the
momentum of the line leaving the vertex v is given
by the sum of the mode labels of all vertices
preceding v (with v being included): if $\ell = (v', v)$
then $\nu_\ell = \sum_{w \preceq v} \nu_w$. Note that for a fixed unlabeled
tree the branching labels are uniquely determined,
and, for a given assignment of the mode labels, the
momenta of the lines are also uniquely determined.
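The rule that fixes the momenta from the mode labels can be sketched in a few lines. The data layout is an assumption made for illustration only: the tree is stored as a hypothetical `parent` list, where `parent[v]` is the vertex entered by the line exiting v (and `None` for the vertex attached to the root line), and the momentum of the line exiting v is stored at index v.

```python
def line_momenta(parent, modes):
    """Momentum of the line exiting each vertex of a rooted tree.

    parent[v] is the vertex that the line exiting v enters (None for the
    vertex carrying the root line); modes[v] is the mode label of v, a tuple
    of integers. The momentum of the line exiting v is the sum of the mode
    labels of v and of all vertices preceding v.
    """
    n_dim = len(modes[0])
    momenta = [[0] * n_dim for _ in parent]
    for w, mode in enumerate(modes):
        v = w
        # the mode label of w contributes to every line between w and the root
        while v is not None:
            for i in range(n_dim):
                momenta[v][i] += mode[i]
            v = parent[v]
    return [tuple(m) for m in momenta]


if __name__ == "__main__":
    # Vertex 0 carries the root line; vertices 1 and 2 enter 0; vertex 3 enters 1.
    parent = [None, 0, 0, 1]
    modes = [(1, 0), (0, 1), (-1, 2), (2, -1)]
    print(line_momenta(parent, modes))
```

In this example the momentum of the root line equals the sum of all mode labels, in agreement with the constraint on the mode labels for a tree with fixed root momentum.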

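Define

\[ V_v = \frac{(i\nu_v)^{p_v+1}}{p_v!}\, f_{\nu_v}, \qquad g_\ell = \frac{1}{(\omega \cdot \nu_\ell)^2} \qquad [14] \]

where the tensor $V_v$ is referred to as the node factor
of v and the scalar $g_\ell$ as the propagator of the line $\ell$.
One has $|f_\nu| \le F e^{-\kappa|\nu|}$, for suitable positive constants
F and $\kappa$, by the analyticity assumption. Then one
can check that the coefficients $a_\nu^{(k)}$, defined in [12],
for $\nu \ne 0$, can be expressed in terms of trees as

\[ a_\nu^{(k)} = \sum_{\theta \in \Theta_\nu^{(k)}} \mathrm{Val}(\theta), \qquad \mathrm{Val}(\theta) = \Bigl( \prod_{v \in V(\theta)} V_v \Bigr) \Bigl( \prod_{\ell \in L(\theta)} g_\ell \Bigr) \qquad [15] \]

where $\Theta_\nu^{(k)}$ denotes the set of all inequivalent trees
with k vertices and with momentum $\nu$ associated
with the root line, while the coefficients $a_0^{(k)}$ can be
fixed $a_0^{(k)} = 0$ for all $k \ge 1$, by the arbitrariness of the
initial phases previously remarked. The property
that $[\partial_\alpha f(\alpha(\tau'))]^{(k)}$ in [11] has zero average for all
$k \ge 1$ implies that for all lines $\ell \in L(\theta)$ one has
$g_\ell = (\omega \cdot \nu_\ell)^{-2}$ only for $\nu_\ell \ne 0$, whereas $g_\ell = 1$ for
$\nu_\ell = 0$, so that the values $\mathrm{Val}(\theta)$ are well
defined for all trees $\theta$. If $a_0^{(k)} = 0$ for all $k \ge 1$, then
$\nu_\ell \ne 0$ for all $\ell \in L(\theta)$.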

The proof of [15] can be performed by induction 
on k. Alternatively, we can start from the recursive 
definition [13], whereby the trees naturally arise in 
the following way. 

Represent graphically the coefficient $a_\nu^{(k)}$ as in
Figure 1; to keep track of the labels k and $\nu$, we
assign k to the black bullet and $\nu$ to the line. For
k = 1, the black bullet is meant as a grey vertex (like
the ones appearing in Figure 3).
Figure 1 Graphical representation of $a_\nu^{(k)}$.


Figure 2 Graphical representation of the recursive equation [13].


Figure 3 An example of tree to be summed over in [15] for
k = 39. The labels are not explicitly shown. The momentum of
the root line is $\nu$, so that the mode labels satisfy the constraint
$\sum_{v \in V(\theta)} \nu_v = \nu$.


Then the recursive equation [13] can be graphically
represented as the diagram in Figure 2, provided
that we associate with the (grey) vertex $v_0$ the
node factor $V_{v_0}$, with $\nu_{v_0} = \nu_0$ and $p_{v_0} = p$ denoting
the number of lines entering $v_0$, and with the lines
$\ell_i$, $i = 1, \dots, p$, entering $v_0$ the momenta $\nu_i$, respectively.
Of course, the sums over p and over the
possible assignments of the labels $\{k_i\}_{i=1}^{p}$ and $\{\nu_i\}_{i=0}^{p}$
are understood. Each black bullet on the right-hand
side of Figure 2, together with its exiting line,
looks like the diagram on the left-hand side, so
that it represents $a_{\nu_i}^{(k_i)}$, $i = 1, \dots, p$. Note that
Figure 2 has to be interpreted in the following
way: if one associates with the diagram as drawn
in the right-hand side a numerical value (as
described above) and one sums all the values over
the assignments of the labels, then the resulting
quantity is precisely $a_\nu^{(k)}$.

The (fundamental) difference between the black
bullets on the right- and left-hand sides is that the labels
$k_i$ of the bullets on the right-hand side are strictly less
than k, hence we can iterate the diagrammatic
decomposition simply by expressing again each
$a_{\nu_i}^{(k_i)}$ as $a_\nu^{(k)}$ in [13], and so on,
until one obtains a tree with k grey vertices and no black
bullets; see Figure 3, where the labels are not explicitly
written. This corresponds to the tree expansion [15].

Any tree appearing in [15] is an example of what
physicists call a Feynman graph, while the diagrammatic
rules one has to follow in order to associate to
the tree $\theta$ its right numerical value $\mathrm{Val}(\theta)$ are usually
called the Feynman rules for the model under
consideration. Such a terminology is borrowed
from quantum field theory.


Multiscale Analysis and Clusters 


Suppose we replace [9] with $\alpha = \varepsilon\, \partial_\alpha f(\alpha)$, so that
no small divisors appear (that is, $g_\ell = 1$ in [14]).
Then convergence is easily proved for $\varepsilon$ small
enough, since (by using the identity $\sum_{v \in V(\theta)} p_v = k - 1$
and the inequality $e^{-x}x^k/k! \le 1$ for all $x \in \mathbb{R}_+$ and all
$k \in \mathbb{N}$), one finds

\[ |\mathrm{Val}(\theta)| \le F^k \Bigl( \frac{4}{\kappa} \Bigr)^{2k-1} e^{-\kappa|\nu|/4} \prod_{v \in V(\theta)} e^{-\kappa|\nu_v|/4} \qquad [16] \]

and the sum over the mode labels can be performed
by using the exponential decay factors $e^{-\kappa|\nu_v|/4}$, while
the sum over all possible unlabeled trees gives $4^k$. In
particular, analyticity in $\varepsilon$ follows.

Of course, the interesting case is when the
propagators are present. In such a case, even if
no division by zero occurs, as $\omega \cdot \nu_\ell \ne 0$ (by
the assumed Diophantine condition [7] and the
absence of secular terms discussed previously), the
quantities $\omega \cdot \nu_\ell$ in [14] can be very small.

Then we can introduce a scale h characterizing the
size of each propagator: we say that a line $\ell$ has scale
$h_\ell = h \ge 0$ if $\omega \cdot \nu_\ell$ is of order $2^{-h}C_0$ and scale $h_\ell = -1$
if $\omega \cdot \nu_\ell$ is greater than $C_0$ (of course, a more formal
definition can be easily envisaged, for which the reader
is referred to the original papers). Then, we can bound
$|\omega \cdot \nu_\ell| \ge 2^{-h_\ell}C_0$ for any $\ell \in L(\theta)$, and write

\[ \prod_{\ell \in L(\theta)} |g_\ell| \le C_0^{-2k} \prod_{h=0}^{\infty} 2^{2hN_h(\theta)} \le C_0^{-2k}\, 2^{2h_0 k} \exp\Bigl( 2\log 2 \sum_{h=h_0}^{\infty} h N_h(\theta) \Bigr) \qquad [17] \]

where $N_h(\theta)$ is the number of lines in $L(\theta)$ with scale
h and $h_0$ is a (so far arbitrary) positive integer. The
problem is then reduced to that of finding an
estimate for $N_h(\theta)$.

To identify which kinds of tree are the source of
problems, we introduce the notion of a cluster and
a self-energy graph. A cluster T with scale $h_T$ is a
connected set of nodes linked by a continuous
path of lines with the same scale label $h_T$ or a
lower one and which is maximal, namely all the
lines not belonging to T but connected to it have
scales higher than $h_T$ and at least one line in T has
scale $h_T$. An inclusion relation is established
between clusters, in such a way that the innermost
clusters are the clusters with lowest scale, and so
on. Each cluster T can have an arbitrary number
of lines coming into it (entering lines), but only
one or zero lines coming out from it (exiting line):
lines of T which either enter or exit T are called
external lines. A cluster T with only one entering
line $\ell_T^2$ and with one exiting line $\ell_T^1$ such that one
has $\nu_{\ell_T^1} = \nu_{\ell_T^2}$ will be called a self-energy graph
(SEG) or resonance. In such a case, the exiting line $\ell_T^1$ is
called a resonant line. Examples of clusters and
SEGs are suggested by the bubbles in Figure 4; the
mode labels are not represented, whereas the
scales of the lines are explicitly written.

If $S_h(\theta)$ is the number of SEGs whose resonant
lines have scale h, then $N_h^*(\theta) = N_h(\theta) - S_h(\theta)$
will denote the number of nonresonant lines with
scale h.

A fundamental result, known as the Siegel-Bryuno
lemma, shows that, for some positive constant c,
one has

\[ N_h^*(\theta) \le 2^{-h/\tau}\, c \sum_{v \in V(\theta)} |\nu_v| \qquad [18] \]

which, if inserted into [17] instead of $N_h(\theta)$, would
give a convergent series; then $h_0$ should be chosen in
such a way that the sum of the series in [17] is less
than, say, $\kappa \sum_{v \in V(\theta)} |\nu_v|/8$.


Figure 4 Examples of clusters and SEGs. Note that the tree
itself is a cluster (with scale 6), and each of the two clusters with
one entering and one exiting lines is a SEG only if the momenta
of its external lines are equal to each other.


Figure 5 Example of tree whose value grows like a factorial.

The bound [18] is a very deep one, and was
originally proved by Siegel for a related problem
(Siegel's problem), in which, in the formalism
followed here, SEGs do not occur; such a bound
essentially shows that accumulation of small divisors
is possible only in the presence of SEGs. A possible
tree with k vertices whose value can be proportional
to some power of k! is represented in Figure 5,
where a chain of (k − 1)/2 SEGs, k odd, is drawn
with external lines carrying a momentum $\nu$ such that
$\omega \cdot \nu \approx C_0|\nu|^{-\tau}$.

In order to take into account the resonant lines,
we have to add a factor $(\omega \cdot \nu_\ell)^{-2}$ for each resonant
line $\ell$. It is a remarkable fact that, even if there are
trees whose value cannot be bounded as a constant
to the power k, there are compensations (that is,
partial cancellations) between the values of all trees
with the same number of vertices, such that the sum
of all such trees admits a bound of this kind.

The cancellations can be described graphically as
follows. Consider a tree $\theta$ with a SEG T. Then take
all trees which can be obtained by shifting the
external lines of T, that is, by attaching such lines to
all possible vertices internal to T, and sum together
the values of all such trees. An example is given in
Figure 6. The corresponding sum turns out to be
proportional to $(\omega \cdot \nu)^2$, if $\nu$ is the momentum of the
resonant line of T, and such a factor compensates
exactly the propagator of this line. The argument
above can be repeated for all SEGs: this requires a
little care because there are SEGs which are inside
some other SEGs. Again, for details and a more
formal discussion, the reader is referred to the original
papers.


The conclusion is that we can take into account
the resonant lines: this simply adds an extra constant
raised to the power k, so that an overall estimate $C^k$,
for some C > 0, holds for $U^{(k)}(t)$, and the convergence
of the series follows.


Other Examples and Applications 


The discussion carried out so far proves a version of 
the KAM theorem, for the system described by [9], 
and it is inspired by the original papers by Eliasson 
(1996) and, mostly, by Gallavotti (1994). 

Here we list some problems in which original 
results have been proved by means of the diagram- 
matic techniques described above, or by some 
variants of them. These are discussed in the 
following. 

The first generalization one can think of is the 
problem of conservation in quasi-integrable systems of 
resonant tori (that is, invariant tori whose frequency 
vectors have rationally dependent components). Even 
if most of such tori disappear as an effect of the 
perturbation, some of them are conserved as lower-dimensional 
tori, which, generically, become of either 
elliptic or hyperbolic or mixed type according to the 
sign of ε and the perturbation. With techniques 
extending those described here (introducing also, in 
particular, a suitable resummation procedure for 
divergent series), this has been done by Gallavotti 
and Gentile; see Gallavotti et al. (2004) and Gallavotti 
and Gentile (2005) for an account. 

An expansion like the one considered so far can 
be envisaged also for the motions occurring on the 
stable and unstable manifolds of hyperbolic lower-dimensional 
tori for perturbations of Hamiltonians 
describing a system of rotators (as in the previous 
case) plus n pendulum-like systems. In such a case, 
the function G(u) has a less simple form. For n = 1, 
one can look for solutions which depend on time 
through two variables, $\psi = \omega t$ and $x = e^{-gt}$, with 
$(\omega, g) \in \mathbb{R}^{N+1}$, and ω Diophantine as before and g 
related to the timescale of the pendulum. This has 
been worked out by Gallavotti (1994), and then 
used by Gallavotti et al. (1999) to study a class of 
three-timescale systems, in order to obtain a lower 
bound on the homoclinic angles (i.e., the angles 
between the stable and unstable manifolds of 
hyperbolic tori which are preserved by the perturbation). 
The formalism becomes a little more involved, 
essentially because of the entries of the Wronskian 
matrix appearing in [5]. In such a case, the 
unperturbed solution $u_0(t)$ corresponds to the 
rotators moving linearly with rotation vector ω and 
the pendulum moving along its separatrix; a 
nontrivial fact is that if $g_0$ denotes the Lyapunov 
exponent of the pendulum in the absence of the 
perturbation, then one has to look for an expansion 
in $x = e^{-gt}$ with $g = g_0 + O(\varepsilon)$, because the perturbation 
changes the value of such an exponent. 


Figure 6 Example of SEGs whose values have to be summed together in order to produce the cancellation discussed in the text. The mode labels are all fixed. 

The same techniques have also been applied to 
study the relation of the radius of convergence of the 
standard map, an area-preserving diffeomorphism 
from the cylinder to itself, which has been widely 
studied in the literature since the original papers by 
Greene and by Chirikov, both of which appeared in 1979, 
with the arithmetical properties of the rotation 
vector (which is, in this case, just a number). In 
particular, it has been proved that the radius of 
convergence is naturally interpolated through a 
function of the rotation number known as the Bryuno 
function (which has been introduced by Yoccoz as 
the solution of a suitable functional equation 
completely independent of the dynamics); see 
Berretti and Gentile (2001) for a review of results 
of this and related problems. 
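
As a rough illustration of the kind of functional equation referred to here, the following Python sketch evaluates a truncated Bryuno function. It assumes the normalization B(x) = −log x + x B(1/x) for 0 < x < 1 together with B(x + 1) = B(x), which is the form commonly attributed to Yoccoz; the function name `bryuno` and the truncation depth are choices made for this sketch and are not taken from the papers cited.

```python
import math

def bryuno(x, depth=30):
    """Truncated Bryuno function, assuming B(x) = -log x + x B(1/x) on (0, 1)
    and B(x + 1) = B(x). Unrolling the recursion gives a sum over the orbit
    of the Gauss map x -> frac(1/x)."""
    x = x - math.floor(x)          # use 1-periodicity to reduce to [0, 1)
    total, weight = 0.0, 1.0       # weight = product of the previous orbit points
    for _ in range(depth):
        if x == 0.0:
            return math.inf        # the orbit of a rational number reaches 0
        total += weight * (-math.log(x))
        weight *= x
        x = 1.0 / x - math.floor(1.0 / x)
    return total

golden = (math.sqrt(5) - 1) / 2
print(bryuno(golden), -math.log(golden) / (1 - golden))
```

For the golden mean γ = (√5 − 1)/2 the assumed functional equation closes on itself, since 1/γ = 1 + γ, and gives B(γ) = −log γ/(1 − γ); the two printed numbers should therefore agree up to the truncation error.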

Also the generalized Riccati equation 
$\dot u - i u^2 - 2 i f(\omega t) + i \varepsilon^2 = 0$, where the vector ω is Diophantine and f 
is an analytic periodic function of $\psi = \omega t$, has been 
studied with the diagrammatic technique by Gentile 
(2003). Such an equation is related to two-level 
quantum systems (as first used by Barata), and 
existence of quasiperiodic solutions of the generalized 
Riccati equation for a large measure set of 
values of ε can be exploited to prove that the 
spectrum of the corresponding two-level system is 
pure point for those values of ε; analogously, one 
can prove that, for fixed ε, one can impose some 
further nonresonance conditions on ω, still leaving a 
full measure set, in such a way that the spectrum is 
pure point. (We note, in addition, that, technically, 
such a problem is very similar to that of studying 
conservation of elliptic lower-dimensional tori with 
one normal frequency.) 

Finally we mention a problem of partial differential 
equations, where, of course, the scheme 
described above has to be suitably adapted: this is 
the study of periodic solutions for the nonlinear 
wave equation $u_{tt} - u_{xx} + m u = \varphi(u)$, with Dirichlet 
boundary conditions, where m is a real parameter 
(mass) and $\varphi(u)$ is a strictly nonlinear analytic odd 
function. Gentile and Mastropietro (2004) reproduced 
the result of Craig and Wayne for the 
existence of periodic solutions for a large measure 
set of periods, and, in a subsequent paper by the 
same authors with Procesi (2005), an analogous 
result was proved in the case m = 0, which had 
previously remained an open problem in the 
literature. 


See also: Averaging Methods; Integrable Systems and 
Discrete Geometry; KAM Theory and Celestial 
Mechanics; Stability Theory and KAM. 


Definitions 


The dimer model arose in the mid-twentieth century 
as an example of an exactly solvable statistical 
mechanical model in two dimensions with a phase 
transition. It is used to model a number of physical 
processes: free fermions in one dimension, the two-dimensional 
Ising model, and various other 
two-dimensional statistical-mechanical models at 
restricted parameter values, such as the 6- and 
8-vertex models and O(n) models. A number of 
observable quantities such as the "height function" 
and densities of motifs have been shown to have 
conformal invariance properties in the scaling limit 
(when the lattice spacing tends to zero). 

More recently, the model has also been used as an elementary 
model of crystalline surfaces in $\mathbb{R}^3$. 

A dimer covering, or perfect matching, of a graph 
is a set of edges (“dimers”) which covers every 
vertex exactly once. In other words, it is a pairing of 
adjacent vertices (see Figure 1a, which is a dimer 
covering of an 8 × 8 grid). Dimer coverings of a grid 
are sometimes represented as domino tilings, that is, 
tilings with 2 × 1 rectangles (Figure 1b). The dimer 
model is the study of the set of dimer coverings of a 
graph. Typically, the underlying graph is taken to be 
a regular lattice in two dimensions, for example, the 
square grid or the honeycomb lattice, or a finite part 
of such a lattice. 

Dimer coverings of the honeycomb graph are in 
bijection with tilings of plane regions with 60° 
rhombi, also known as lozenges (see Figure 2). 
These tilings in turn are projections of piecewise-linear 
surfaces in $\mathbb{R}^3$ composed of unit squares in 
the 2-skeleton of $\mathbb{Z}^3$. So one can think of honeycomb 
dimer coverings as modeling discrete surfaces 
in $\mathbb{R}^3$. These surfaces are monotone in the sense 
that the orthogonal projection to the plane 
$P_{111} = \{(x, y, z) \mid x + y + z = 0\}$ is injective. 

Figure 1 A dimer covering of a grid and the corresponding 
domino tiling. 

Figure 2 Honeycomb dimers (solid) and the corresponding 
"lozenge" tilings (gray). 


Other models related to the dimer model are: 


• The spanning tree model on planar graphs. The 
set of spanning trees on a planar graph is in 
bijection with the set of dimer coverings on an 
associated bipartite planar graph. Conversely, 
dimer coverings of a bipartite planar graph are 
in bijection with directed spanning trees on an 
associated graph. 

• The Ising model on a planar graph with zero 
external field can be modeled with dimers on an 
associated planar graph. 

• Plane partitions (three-dimensional versions of 
integer partitions). Viewing a plane partition 
along the (1, 1, 1)-direction, one sees a lozenge 
tiling of the plane. 

• Annihilating random walks in one dimension can 
be modeled with dimers on an associated planar 
graph. 

• The monomer-dimer model, where one allows a 
certain density of holes (monomers) in a dimer 
covering. This model is unsolved at present, 
although some partial results have been obtained. 


Gibbs Measures 


The most general setting in which the dimer model 
can be solved is that of an arbitrary planar graph 
with energies on the edges. We define here the 
corresponding measure. 

Let G = (V, E) be a graph and M(G) the set of 
dimer coverings of G. Let $\mathcal{E}$ be a real-valued 
function on the edges of G, with $\mathcal{E}(e)$ representing 
the energy associated to a dimer on the bond e. One 
defines the energy of a dimer covering as the sum of 
the energies of those bonds covered with dimers. 
The partition function of the model on $(G, \mathcal{E})$ is then 
the sum 

$$Z = \sum_{C \in M(G)} e^{-\mathcal{E}(C)/kT},$$

where the sum is over dimer coverings. In what 
follows we will take kT = 1 for simplicity. Note that 
Z depends on both G and $\mathcal{E}$. 

The partition function is well defined for a finite 
graph and defines the Gibbs measure, which is 
by definition the probability measure $\mu = \mu_{\mathcal{E}}$ on 
the set M(G) of dimer coverings satisfying 
$\mu(C) = (1/Z)\, e^{-\mathcal{E}(C)}$ for a covering C. 
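
To make the definition concrete, here is a minimal brute-force sketch in Python that enumerates the dimer coverings of a small graph and computes the partition function and Gibbs probabilities with kT = 1. The helper names `perfect_matchings` and `gibbs_measure`, and the 4-cycle with its edge energies, are illustrative assumptions; enumeration of this kind is only feasible for very small graphs.

```python
import math

def perfect_matchings(vertices, edges):
    """Yield every dimer covering (perfect matching) as a tuple of edges."""
    if not vertices:
        yield ()
        return
    v = vertices[0]                      # always match the first uncovered vertex
    for e in edges:
        if v in e and all(u in vertices for u in e):
            rest = [u for u in vertices if u not in e]
            for m in perfect_matchings(rest, edges):
                yield (e,) + m

def gibbs_measure(vertices, energy):
    """energy maps each edge (a, b) to its energy; returns (Z, probabilities)."""
    weights = {}
    for m in perfect_matchings(list(vertices), list(energy)):
        weights[m] = math.exp(-sum(energy[e] for e in m))
    Z = sum(weights.values())
    return Z, {m: w / Z for m, w in weights.items()}

# Hypothetical example: a 4-cycle whose two "vertical" edges cost energy log 2.
energy = {(0, 1): 0.0, (1, 2): math.log(2), (2, 3): 0.0, (3, 0): math.log(2)}
Z, mu = gibbs_measure([0, 1, 2, 3], energy)
print(Z, mu)
```

In this example there are two coverings, with weights 1 and 1/4, so Z = 5/4 and the covering by the two zero-energy edges has probability 4/5.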

For an infinite graph G with fixed energy function 
$\mathcal{E}$, a Gibbs measure on M(G) is by definition any 
measure which is a limit of the Gibbs measures on a 
sequence of finite subgraphs which fill out G. There 
may be many Gibbs measures on an infinite graph, 
since this limit typically depends on the sequence of 
finite graphs. When G is an infinite periodic graph 
(and $\mathcal{E}$ is periodic as well), it is natural to consider 
translation-invariant Gibbs measures; one can show 
that in the case of a bipartite, periodic planar graph 
the translation-invariant and ergodic Gibbs measures 
form a two-parameter family; see Theorem 3 
below. 

For a translation-invariant Gibbs measure ν which 
is a limit of Gibbs measures on an increasing 
sequence of finite graphs $G_n$, one can define the 
partition function per vertex of ν to be the limit 

$$Z = \lim_{n \to \infty} Z(G_n)^{1/|G_n|},$$

where $|G_n|$ is the number of vertices of $G_n$. The free 
energy, or surface tension, of ν is −log Z. 


Combinatorics 
Partition Function 


One can compute the partition function for dimer 
coverings on a finite planar graph G as the Pfaffian 
(square root of the determinant) of a certain 
antisymmetric matrix, the Kasteleyn matrix. The 
Kasteleyn matrix is an oriented adjacency matrix of 
G, indexed by the vertices V: orient the edges of a 
graph embedded in the plane so that each face has 
an odd number of clockwise oriented edges. Then 
define $K = (K_{vv'})$ with 

$$K_{vv'} = \pm e^{-\mathcal{E}(vv')}$$

if G has an edge vv', with a sign according to the 
orientation of that edge, and $K_{vv'} = 0$ if v, v' are not 
adjacent. We then have the following result of 
Kasteleyn: 


Theorem 1 $Z = |\mathrm{Pf}(K)| = \sqrt{|\det K|}$. 


Here Pf(K) denotes the Pfaffian of K. 

Such an orientation of edges (which always exists 
for planar graphs) is called a Kasteleyn orientation; 
any two such orientations can be obtained from one 
another by a sequence of operations consisting of 
reversing the orientations of all edges at a vertex. 

If G is a bipartite graph, that is, the vertices can 
be colored black and white with no neighbors 
having the same color, then the Pfaffian of K is the 
determinant of the submatrix whose rows index the 
white vertices and columns index the black vertices. 
For bipartite graphs, instead of orienting the edges 
one can alternatively multiply the edge weights by a 
complex number of modulus 1, with the condition 
that the alternating product around each face (the 
first, divided by the second, times the third, and so on) 
is real and negative. 
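
The bipartite version of Kasteleyn's result can be checked on small grids. The sketch below, written in Python with exact rational arithmetic, counts domino tilings of a width × height grid (all energies zero) as the absolute value of the determinant of the white-by-black Kasteleyn matrix. The particular sign choice, weight +1 on horizontal edges and (−1)^x on vertical edges in column x, is one assumed choice that makes the alternating product around every unit square equal to −1; the function names are illustrative.

```python
from fractions import Fraction

def kasteleyn_matrix(width, height):
    """Rows index white vertices (x + y even), columns index black vertices."""
    cells = [(x, y) for x in range(width) for y in range(height)]
    white = [c for c in cells if sum(c) % 2 == 0]
    black = [c for c in cells if sum(c) % 2 == 1]
    column = {b: j for j, b in enumerate(black)}
    K = [[Fraction(0)] * len(black) for _ in white]
    for i, (x, y) in enumerate(white):
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            b = (x + dx, y + dy)
            if b in column:
                # horizontal edges get +1, vertical edges in column x get (-1)^x
                K[i][column[b]] = Fraction(1) if dy == 0 else Fraction((-1) ** x)
    return K

def determinant(rows):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [row[:] for row in rows]
    n = len(a)
    det = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det
        det *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            if factor != 0:
                for k in range(c, n):
                    a[r][k] -= factor * a[c][k]
    return det

def count_domino_tilings(width, height):
    K = kasteleyn_matrix(width, height)
    if not K or len(K) != len(K[0]):
        return 0                      # unequal colour classes: no dimer covering
    return abs(int(determinant(K)))

print(count_domino_tilings(8, 8))
```

For the 8 × 8 grid of Figure 1 this should print 12988816, the classical number of domino tilings of the chessboard; the 2 × 2 and 4 × 4 grids give 2 and 36.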

For nonplanar graphs, one can compute the 
partition function as a sum of Pfaffians; for a 
graph embedded on a surface of Euler characteristic 
χ, this requires in general $2^{2-\chi}$ Pfaffians. 


Local Statistics 


The inverse of the Kasteleyn matrix can be used to 
compute the local statistics, that is, the probability that 
a given set of edges occurs in a random dimer covering 
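(random with respect to the Gibbs measure μ). 


Theorem 2 Let $S = \{(v_1, v_2), \ldots, (v_{2k-1}, v_{2k})\}$ be a 
set of edges of G. The probability that all these 
edges occur in a μ-random covering is 

$$\Pr(S) = \left( \prod_{i=1}^{k} K_{v_{2i-1}, v_{2i}} \right) \mathrm{Pf}_{2k \times 2k} \left( (K^{-1})_{v_i, v_j} \right)_{1 \le i, j \le 2k}.$$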


Again, for bipartite graphs the Pfaffian can be 
made into a determinant. 
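
For a single edge of a bipartite graph the statement reduces to Pr(e) = |K(w, b) · K⁻¹(b, w)|, and K⁻¹(b, w) is, up to sign, a cofactor of K divided by det K. The following Python sketch checks this on a 2 × 3 grid with zero energies, where the three coverings can be listed by hand. The sign convention for K (+1 on horizontal edges, (−1)^x on vertical edges in column x) and all function names are assumptions of this sketch.

```python
from fractions import Fraction

def kasteleyn(width, height):
    """Return (K, white, black); rows index white vertices, columns black ones."""
    cells = [(x, y) for x in range(width) for y in range(height)]
    white = [c for c in cells if sum(c) % 2 == 0]
    black = [c for c in cells if sum(c) % 2 == 1]
    K = [[Fraction(0)] * len(black) for _ in white]
    for i, (x, y) in enumerate(white):
        for j, (u, v) in enumerate(black):
            if abs(x - u) + abs(y - v) == 1:
                K[i][j] = Fraction(1) if y == v else Fraction((-1) ** x)
    return K, white, black

def determinant(rows):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [row[:] for row in rows]
    n, det = len(a), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det
        det *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return det

def edge_probability(width, height, w, b):
    """Probability that the edge joining white vertex w and black vertex b
    is occupied, computed as |K(w, b) * cofactor / det K|."""
    K, white, black = kasteleyn(width, height)
    i, j = white.index(w), black.index(b)
    minor = [[K[r][c] for c in range(len(black)) if c != j]
             for r in range(len(white)) if r != i]
    return abs(K[i][j] * determinant(minor) / determinant(K))

print(edge_probability(2, 3, (0, 0), (1, 0)))   # bottom horizontal edge
print(edge_probability(2, 3, (1, 1), (0, 1)))   # middle horizontal edge
```

The 2 × 3 grid has three coverings; the bottom horizontal edge appears in two of them and the middle horizontal edge in one, so the printed values should be 2/3 and 1/3.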


Heights 


Bipartite graphs Suppose G is a bipartite planar 
graph. A 1-form on G is simply a function on the set 
of oriented edges which is antisymmetric with respect 
to reversing the edge orientation: f(−e) = −f(e) for 
an edge e. A 1-form can be identified with a flow: 
just flow by f(e) along oriented edge e. The 
divergence of the flow f is then $d^*f$. Let Ω be the 
space of flows on edges of G, with divergence 1 at 
each white vertex and divergence −1 at each black 
vertex, and such that the flow along each edge from 
white to black is in [0, 1]. From a dimer covering M 
one can construct such a flow ω(M) ∈ Ω: just flow 
one unit along each dimer, and zero on the remaining 
edges. The set Ω is a convex polyhedron in $\mathbb{R}^E$ and its 
vertices can be seen to be exactly the dimer coverings. 

Given any two flows $\omega_1, \omega_2 \in \Omega$, their difference is 
a divergence-free flow. Its dual $(\omega_1 - \omega_2)^*$ (or 
conjugate flow) defined on the planar dual of G is 
therefore the gradient of a function h on the faces of 
G, that is, $(\omega_1 - \omega_2)^* = dh$, where h is well defined 
up to an additive constant. 

When $\omega_1$ and $\omega_2$ come from dimer coverings, h is 
integer valued, and is called the height difference of 
the coverings. The level sets of the function h are 
just the cycles formed by the union of the two 
matchings. If we fix a "base point" covering $\omega_0$ and 
a face $f_0$ of G, we can then define the height 
function of any dimer covering (with flow ω) to be 
the function h with value zero at $f_0$ and which 
satisfies $dh = (\omega - \omega_0)^*$. 


Nonbipartite graphs On a nonbipartite planar 
graph the height function can be similarly defined 
modulo 2. Fix a base covering $\omega_0$; for any other 
covering ω, the superposition of $\omega_0$ and ω is a set of 
cycles and doubled edges of G; the function h is 
constant on the complementary components of these 
cycles and changes by 1 mod 2 across each cycle. 
We can think of the height modulo 2 as taking two 
values, or spins, on the faces of G, and the dimer 
chains are the spin-domain boundaries. In particular, 
dimers on a nonbipartite graph can in this 
way model the Ising model on an associated dual 
planar graph. 


Thermodynamic Limit 


By periodic planar graph we mean a graph G, with 
energy function on edges, for which translations by 
elements of $\mathbb{Z}^2$ or some other rank-2 lattice $\Gamma \subset \mathbb{R}^2$ 
are isomorphisms of G preserving the edge energies, 
and such that the quotient $G/\mathbb{Z}^2$ is a finite graph. 
Without loss of generality we can take $\Gamma = \mathbb{Z}^2$. The 
standard example is $G = \mathbb{Z}^2$ with $\mathcal{E} = 0$, which we 
refer to as "dimers on the grid." However, other 
examples display different global behaviors and so it 
is worthwhile to remain in this generality. 

For a periodic planar graph G, an ergodic 
probability measure on M(G) is one which is 
translation invariant (the measure of a set is the 
same as any $\mathbb{Z}^2$-translate of that set) and whose 
invariant subsets have measure 0 or 1. 

We will be interested in probability measures 
which are both ergodic and Gibbs (we refer to them 
as ergodic Gibbs measures, dropping the term 
"probability"). When G is bipartite, there are 
multiple ergodic Gibbs measures (see Theorem 3 
below). When G is nonbipartite, it is conjectured 
that there is a single ergodic Gibbs measure. 

In the remainder of this section we assume that G 
is bipartite, and assume also that the $\mathbb{Z}^2$-action 
preserves the coloring of the vertices as black and 
white (simply pass to an index-2 sublattice if not). 

For integer n > 0 let $G_n = G/n\mathbb{Z}^2$, a finite graph 
on a torus (in other words, with periodic boundary 
conditions). For a dimer covering M of $G_n$, we 
define $(h_x, h_y) \in \mathbb{Z}^2$ to be the horizontal and vertical 
height change of M around the torus, that is, the net 
flux of $\omega(M) - \omega_0$ across a horizontal, respectively 
vertical, cut around the torus (in other words, $h_x, h_y$ 
are the horizontal and vertical periods around the 
torus of the 1-form $\omega(M) - \omega_0$). The characteristic 
polynomial P(z, w) of G is by definition 

$$P(z, w) = \sum_{M \in M(G_1)} e^{-\mathcal{E}(M)}\, z^{h_x} w^{h_y} (-1)^{h_x h_y};$$

here the sum is over dimer coverings M of 
$G_1 = G/\mathbb{Z}^2$, and $h_x, h_y$ depend on M. The polynomial 
P depends on the base point $\omega_0$ only by a 
multiplicative factor involving a power of z and w. 
From this polynomial most of the large-scale 
behavior of the ergodic Gibbs measures can be 
extracted. 

The Gibbs measure on $G_n$ converges as n → ∞ to 
the (unique) ergodic Gibbs measure μ with smallest 
free energy F = −log Z. The uniqueness of this measure 
follows from the strict concavity of the free energy 
of ergodic Gibbs measures as a function of the slope, 
see below. The free energy F of the minimal free 
energy measure is 

$$F = -\frac{1}{(2\pi i)^2} \int_{S^1 \times S^1} \log P(z, w)\, \frac{dz}{z} \frac{dw}{w},$$

that is, minus the Mahler measure of P. 

For any translation-invariant measure ν on M(G), 
the average slope (s, t) of the height function for ν-almost 
every tiling is by definition the expected 
horizontal and vertical height change over one 
fundamental domain, that is, $s = E[h(f + (1, 0)) - h(f)]$ 
and $t = E[h(f + (0, 1)) - h(f)]$, where f is any 
face. This quantity (s, t) lies in the Newton polygon N(P) 
of P(z, w) (the convex hull in $\mathbb{R}^2$ of the set of 
exponents of monomials of P). In fact, the points in 
the Newton polygon are in bijection with the 
ergodic Gibbs measures on M(G): 


Theorem 3 When G is a periodic bipartite planar 
graph, any ergodic Gibbs measure has average slope 
(s, t) lying in N(P). Moreover, for every point $(s, t) \in N(P)$ 
there is a unique ergodic Gibbs measure μ(s, t) 
with that average slope. 


In particular, this gives a complete description of 
the set of all ergodic Gibbs measures. The ergodic 
Gibbs measure μ(s, t) of slope (s, t) can be obtained 
as the limit of the Gibbs measures on $G_n$, when one 
conditions the configurations to have a particular 
slope approximating (s, t). 


Ronkin Function and Surface Tension 


The Ronkin function of P is a map $R: \mathbb{R}^2 \to \mathbb{R}$ 
defined for $(B_x, B_y) \in \mathbb{R}^2$ by 

$$R(B_x, B_y) = \frac{1}{(2\pi i)^2} \int_{S^1 \times S^1} \log P(z e^{B_x}, w e^{B_y})\, \frac{dz}{z} \frac{dw}{w}.$$

The Ronkin function is convex and its graph is 
piecewise linear on the complement of the amoeba 
A(P) of P, which is the image of the zero set 
$\{(z, w) \in \mathbb{C}^2 \mid P(z, w) = 0\}$ under the map 
$(z, w) \mapsto (\log|z|, \log|w|)$ (see Figures 3 and 4 for an example). 
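
The piecewise linearity off the amoeba can be observed numerically. The Python sketch below approximates the Ronkin function of the square-octagon polynomial P(z, w) = 5 + z + 1/z + w + 1/w of Figure 3 by a midpoint rule on the torus. It integrates log|P| (the real part of log P), and the function names and grid size are choices made for this sketch; the rule is only reliable at points outside the amoeba, where the integrand has no logarithmic singularities.

```python
import cmath
import math

def P(z, w):
    """Characteristic polynomial quoted for the square-octagon lattice."""
    return 5 + z + 1 / z + w + 1 / w

def ronkin(bx, by, n=128):
    """Midpoint-rule approximation of the torus average of
    log|P(e^{bx + i theta}, e^{by + i phi})|."""
    angles = [2 * math.pi * (j + 0.5) / n for j in range(n)]
    total = 0.0
    for theta in angles:
        z = cmath.exp(complex(bx, theta))
        for phi in angles:
            w = cmath.exp(complex(by, phi))
            total += math.log(abs(P(z, w)))
    return total / (n * n)

print(ronkin(0.0, 0.0), ronkin(0.3, 0.0))   # same value: a flat facet
print(ronkin(5.0, 0.0))                     # close to 5: slope (1, 0)
```

The origin lies in the bounded complementary component of the amoeba (on the unit torus P ≥ 1 > 0), and there the function is constant, which corresponds to the integer slope (0, 0). Far out along the positive B_x axis the term z dominates and R(B_x, 0) = B_x, corresponding to the boundary slope (1, 0).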

The free energy F(μ(s, t)) of μ(s, t), as a function of 
$(s, t) \in N(P)$, is the Legendre dual of the Ronkin 
function of P(z, w): we have 

$$F(\mu(s, t)) = R(B_x, B_y) - s B_x - t B_y,$$

where 

$$s = \frac{\partial R(B_x, B_y)}{\partial B_x}, \qquad t = \frac{\partial R(B_x, B_y)}{\partial B_y}.$$

The continuous map $\nabla R: \mathbb{R}^2 \to N(P)$ which takes 
$(B_x, B_y)$ to (s, t) is injective on the interior of A(P), 
collapses each bounded complementary component of 
A(P) to an integer point in the interior of N(P), and 
collapses each unbounded complementary component 
of A(P) to an integer point on the boundary of N(P). 

Under the Legendre duality, the facets in the 
graph of the Ronkin function (i.e., maximal regions 
on which R is linear) give points of nondifferentiability 
of the free energy F, as defined on N(P). We 
refer to these points of nondifferentiability as 
"cusps." Cusps occur only at integer slopes (s, t) 
(see Figure 5 for the free energy associated to the 
Ronkin function in Figure 4). 


Figure 3 The amoeba of P(z, w) = 5 + z + 1/z + w + 1/w, 
which is the characteristic polynomial for dimers on the periodic 
"square-octagon" lattice. 


Figure 4 Minus the Ronkin function of P(z, w) = 5 + z + 1/z + w + 1/w. 


Figure 5 (Negative of) the free energy for dimers on the 
square-octagon lattice. 

By Theorem 3, the coordinates $(B_x, B_y)$ can also 
be used to parametrize the set of Gibbs measures 
μ(s, t) (but only those with slope (s, t) in the interior 
of N(P) or on the corners of N(P) and boundary 
integer points). This parametrization is not one-to-one 
since when $(B_x, B_y)$ varies in a complementary 
component of the amoeba, the measure μ(s, t) does 
not change. On the interior of the amoeba the 
parametrization is one-to-one. 

The remaining Gibbs measures, whose slopes are 
on the boundary of N(P), can be obtained by taking 
limits of $(B_x, B_y)$ along the "tentacles" of the amoeba. 


Phases 


The Gibbs measures μ(s, t) can be partitioned into 
three classes, or phases, according to the behavior of 
the fluctuations of the height function. If we 
measure the height at two distant points $x_1$ and $x_2$ 
in G, the average height difference, $E[h(x_1) - h(x_2)]$, 
is a linear function of $x_1 - x_2$ determined by the 
average slope of the measure. The height fluctuation 
is defined to be the random variable 
$h(x_1) - h(x_2) - E[h(x_1) - h(x_2)]$. This random variable depends on 
the two points and we are interested in its behavior 
when $x_1$ and $x_2$ are far apart. 
We say μ(s, t) is 


1. "Frozen" if the height fluctuations are bounded 
almost surely. 

2. "Rough" (or "liquid") if the covariance in the 
height function $E[h(x_1) h(x_2)] - E[h(x_1)] E[h(x_2)]$ 
is unbounded as $|x_1 - x_2| \to \infty$. 

3. "Smooth" (or "gaseous") if the covariance of the 
height function is bounded but the height 
fluctuations are unbounded. 


The height fluctuations can be related to the decay 
of the entries of $K^{-1}$, which are in turn related to the 
decay of the Fourier coefficients of 1/P. In particular, 
we have 


Theorem 4 The measure μ(s, t) is respectively 
frozen, rough, or smooth according to whether 
$(B_x, B_y) = (\nabla R)^{-1}(s, t)$ is in the closure of an 
unbounded complementary component of A(P), in 
the interior of A(P), or in the closure of a bounded 
complementary component of A(P). 


The characteristic polynomials P which occur in 
the dimer model are not arbitrary: their algebraic 
curves {P = 0} are all of a special type known as 
Harnack curves, which are characterized by the fact 
that the map from the zero-set of P in $\mathbb{C}^2$ to its 
amoeba in $\mathbb{R}^2$ is at most two-to-one. In fact: 


Theorem 5 By varying the edge energies all 
Harnack curves can be obtained as the characteristic 
polynomial of a planar dimer model. 


Local Statistics 


In the thermodynamic limit (on a periodic planar 
graph), local statistics of dimer coverings for the Gibbs 
measure of minimal free energy can be obtained from 
the limit of the inverse of the Kasteleyn matrix on the 
finite toroidal graphs G,. This in turn can be 
computed from the Fourier coefficients of 1/P. 

As an example, let G be the square grid $\mathbb{Z}^2$ and take 
$\mathcal{E} = 0$ (which corresponds to the uniform measure on 
configurations for finite graphs). An appropriate 
choice of signs for the Kasteleyn matrix is to put 
weights 1, −1 on alternate horizontal edges and i, −i 
on alternate vertical edges in such a way that around 
each white vertex the weights are cyclically 1, i, −1, 
−i. For this choice of signs we have 

$$K^{-1}_{(0,0),(x,y)} = \frac{1}{(2\pi)^2} \int_0^{2\pi} \int_0^{2\pi} \frac{e^{-i(x\theta + y\phi)}}{2\sin\theta + 2i\sin\phi}\, d\theta\, d\phi.$$

This integral can be evaluated explicitly (see Figure 6 
for values of $K^{-1}_{(0,0),(x,y)}$ near the origin; by 
translation invariance $K^{-1}_{(x',y'),(x,y)} = K^{-1}_{(0,0),(x-x',y-y')}$, 
and values in other quadrants can be obtained by 
$K^{-1}_{(0,0),(x,y)} = i K^{-1}_{(0,0),(-y,x)}$). 


Figure 6 Values of $K^{-1}$ on $\mathbb{Z}^2$ with zero energies. 


As a sample computation, using Theorem 2, the 
probability that the dimer covering the origin points 
to the right and, simultaneously, the one covering 
(0, 1) points upwards is 

$$K_{(0,0),(1,0)}\, K_{(0,2),(0,1)}\, \det \begin{pmatrix} K^{-1}_{(0,0),(1,0)} & K^{-1}_{(0,0),(0,1)} \\ K^{-1}_{(0,2),(1,0)} & K^{-1}_{(0,2),(0,1)} \end{pmatrix}.$$
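
The sample computation can be checked numerically. The Python sketch below approximates the double integral for the inverse Kasteleyn matrix of the square grid by a midpoint rule and evaluates the modulus of the 2 × 2 determinant. Two assumptions are made here: the sign in the exponent is taken as exp(−i(xθ + yφ)), and only moduli are compared, because the overall phase depends on sign conventions while the two Kasteleyn weights in front have modulus 1. The function names and grid size are illustrative.

```python
import cmath
import math

def inverse_kasteleyn(x, y, n=200):
    """Approximate (1/(2 pi)^2) times the integral over [0, 2 pi)^2 of
    exp(-i (x theta + y phi)) / (2 sin theta + 2i sin phi).

    A midpoint grid with even n never hits the four zeros of the
    denominator, which sit at theta, phi in {0, pi}."""
    if n % 2:
        raise ValueError("n must be even so that no grid point has angle pi")
    angles = [2 * math.pi * (j + 0.5) / n for j in range(n)]
    total = 0j
    for theta in angles:
        for phi in angles:
            numerator = cmath.exp(-1j * (x * theta + y * phi))
            total += numerator / (2 * math.sin(theta) + 2j * math.sin(phi))
    return total / (n * n)

def l_motif_probability(n=200):
    """Modulus of the determinant for the edges (0,0)-(1,0) and (0,2)-(0,1)."""
    # By translation invariance each entry depends only on the difference
    # (x - x', y - y') of the two vertices.
    a = inverse_kasteleyn(1, 0, n)     # from (0,0) to (1,0)
    b = inverse_kasteleyn(0, 1, n)     # from (0,0) to (0,1)
    c = inverse_kasteleyn(1, -2, n)    # from (0,2) to (1,0)
    d = inverse_kasteleyn(0, -1, n)    # from (0,2) to (0,1)
    return abs(a * d - b * c)

print(abs(inverse_kasteleyn(1, 0)))            # single-edge probability
print(l_motif_probability(), 1 / (4 * math.pi))
```

The first value should be 1/4, the probability of any given edge on the grid, and the second pair of numbers should agree closely, giving 1/(4π) ≈ 0.0796 for the probability of this configuration.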


Another computation which follows is the decay 
of the edge covariances. If $e_1, e_2$ are two edges at 
distance d, then $\Pr(e_1 \,\&\, e_2) - \Pr(e_1)\Pr(e_2)$ decays 
quadratically in 1/d, since $K^{-1}_{(0,0),(x,y)}$ decays 
like $1/(|x| + |y|)$. 


Scaling Limits 


The scaling limit of the dimer model is the limit 
when the lattice spacing tends to zero. 

Let us define the scaling limit in the following 
way. Let $\epsilon\mathbb{Z}^2$ be the square grid scaled by ε, so the 
lattice mesh size is ε. Fix a Jordan domain $U \subset \mathbb{R}^2$ 
and consider for each ε a subgraph $U_\epsilon$ of $\epsilon\mathbb{Z}^2$, 
bounded by a simple polygon, which tends to U as 
ε → 0. We are interested in limiting properties of 
random dimer coverings of $U_\epsilon$, in the limit as ε → 0, 
for example, the fluctuations of the height function 
and edge densities. 

The limit depends on the (sequence of) boundary 
conditions, that is, on the exact choice of approximating 
regions $U_\epsilon$. By changing $U_\epsilon$ one can change 
the limiting rescaled height function along the 
boundary. It is conjectured that the limit of the 
height function along the boundary of $U_\epsilon$ (scaled by 
ε, and assuming this limit exists) determines 
essentially all of the limiting behavior in the interior, 
in particular the limiting local statistics. 

Therefore, let u be a real-valued continuous 
function on the boundary of U. Consider a sequence 
of subgraphs $U_\epsilon$ of $\epsilon\mathbb{Z}^2$, as ε → 0 as above, and 
whose height function along the boundary, when 
scaled by ε, is approximating u. We discuss the limit 
of the model in this setting. 


Crystalline Surfaces 


The height function allows us to view dimer coverings 
as random surfaces in $\mathbb{R}^3$: to a dimer covering 
of G, one associates the graph of its height function, 
extended in a piecewise linear fashion over the edges 
and faces of the dual G*. These surfaces are then 
piecewise linear random surfaces, which resemble 
crystal surfaces in the sense that microscopically (on 
the scale of the lattice) they are rough, whereas their 
long-range behavior is smooth and facetted, as we 
now describe. 

In the scaling limit, boundary conditions as 
described in the last paragraph of the previous 
section are referred to as “wire-frame” boundary 
conditions, since the graph of the height function 
can be thought of as a (random) surface spanning 
the wire frame defined by its boundary values. 

In the scaling limit, there is a law of large 
numbers which says that the Gibbs measure on 
random surfaces (which is unique since we are 
dealing with a finite graph) concentrates, for fixed 
wire-frame boundary conditions, on a single surface 
$S_0$. That is, as the lattice spacing ε tends to zero, 
with probability tending to 1 the random surface lies 
close to a limiting surface $S_0$. The surface $S_0$ is the 
unique surface which minimizes the total surface 
tension, or free energy, for its fixed boundary values, 
that is, minimizes the integral over the surface of 
F(μ(s, t)), where (s, t) is the slope of the surface at 
the point being integrated over. Existence and 
uniqueness of the minimizer follow from the strict 
convexity of the free energy/surface tension as a 
function of the slope. 

At a point where the free energy has a cusp, the 
crystal surface $S_0$ will in general have a facet, that is, 
a region on which it is linear. Outside of the facets, 
one expects that $S_0$ is analytic, since the free energy 
is analytic outside the cusps. 


Fluctuations 


While the scaled height function εh in the scaling 
limit converges to its mean value $h_0$ (whose graph is 
the surface $S_0$), the fluctuations of the unrescaled 
height function $h - (1/\epsilon) h_0$ will converge in law to a 
random process on U. 

In the simplest setting, that of honeycomb dimers 
with $\mathcal{E} = 0$, and in the absence of facets, the height 
fluctuations converge to a continuous Gaussian 
process, the image of the Gaussian free field on the 
unit disk D under a certain diffeomorphism Φ 
(depending on $h_0$) of D to U. 

In the particular case $h_0 = 0$, Φ is the Riemann map 
from D to U and the law of the height fluctuations 
is just the Gaussian free field on U (defined to be 
the Gaussian process whose covariance kernel is 
the Dirichlet Green's function). The conformal 
invariance of the Gaussian free field is the basis for 
a number of conformal invariance properties of the 
honeycomb dimer model. 


Densities of Motifs 


Another observable of interest is the density field of a 
motif. A motif is a finite collection of edges, taken up 
to translation. For example, consider, for the square 
grid, the "L" motif consisting of a horizontal domino 
and a vertical domino aligned to form an "L," which 
we showed above to have a density 1/(4π) in the 
thermodynamic limit. The probability of seeing this 
motif at any given place is 1/(4π). However, in the 
scaling limit one can ask about the fluctuations of the 
occurrences of this motif: in a large ball around a 
point x, what is the distribution of $N_L - A/(4\pi)$, where 
$N_L$ is the number of occurrences of the motif, and A is 
the area of the ball? These fluctuations form a 
random field, since there is a long-range correlation 
between occurrences of the motif. 

It is known that on Z?, for the minimal free energy 
ergodic Gibbs measure, the rescaled density field 


(med) 


converges as ε → 0 weakly to a Gaussian random field 
which is a linear combination of a directional 
derivative of the Gaussian free field and an independent 
white noise. A similar result holds for other motifs. 

The joint distribution of densities of several motifs 
can also be shown to be Gaussian. 


See also: Combinatorics: Overview; Determinantal 
Random Fields; Growth Processes in Random Matrix 
Theory; Statistical Mechanics and Combinatorial 
Problems; Statistical Mechanics of Interfaces. 


Introduction 


In this article we describe some recent results (Finster 
et al. 1999a,b, 2000a-c, 2002a) concerning the 
existence of both particle-like and black hole 
solutions of the coupled Einstein-Dirac-Yang-Mills 
(EDYM) equations. We show that there are stable 
globally defined static, spherically symmetric solutions. 
We also show that for static black hole 
solutions, the Dirac wave function must vanish 
identically outside the event horizon. The latter result 
indicates that the Dirac particle (fermion) must either 
enter the black hole or tend to infinity. 

The plan of the article is as follows. The next 
section describes the background material. It is 
followed by a discussion of the coupled EDYM 
equations for static, spherically symmetric particle- 
like and black hole solutions. The final section of 
the article is devoted to a discussion of these results. 


Background Material 
Einstein's Equations 


We begin by describing the Einstein equation for the 
gravitational field (for more details, see, e.g., Adler 
et al. (1975)). We first note Einstein's hypotheses of 
general relativity (GR): 


(E1) The gravitational field is the metric $g_{ij}$ in 3 + 1 
spacetime dimensions. The metric is assumed to 
be symmetric. 

(E2) At each point in spacetime, the metric can be 
diagonalized as diag(—1,1,1,1). 

(E3) The equations which describe the gravitational 
field should be covariant; that is, independent 
of the choice of coordinate system. 


The hypothesis (E1) is Einstein's brilliant insight, 
whereby he "geometrizes" the gravitational field. 
(E2) means that there are inertial frames at each 
point (but not globally), and guarantees that special 
relativity (SR) is included in GR, while (E3) implies 
that the gravitational field equations must be tensor 
equations; that is, coordinates are an artifact, and 
physics should not depend on the choice of 
coordinates. 


Einstein's Equations of GR 


The metric $g_{ij} = g_{ij}(x)$, $i, j = 0, 1, 2, 3$, $x = (x^0, x^1, x^2, x^3)$,
$x^0 = ct$ ($c$ = speed of light, $t$ = time), is the metric tensor
defined on four-dimensional spacetime. Einstein's
equations are ten (tensor) equations for the unknown
metric $g_{ij}$ (gravitational field), and take the form

$$R_{ij} - \frac{1}{2} R g_{ij} = \sigma T_{ij}$$   [1]

where the left-hand side $G_{ij} = R_{ij} - \frac{1}{2} R g_{ij}$ is the
Einstein tensor and depends only on the geometry,
$\sigma = 8\pi G/c^4$, where $G$ is Newton's gravitational
constant, while $T_{ij}$, the energy-momentum tensor,
represents the source of the gravitational field, and
encodes the distribution of matter. (The word
"matter" in GR refers to everything which can
produce a gravitational field, including elementary
particles, electromagnetic or Yang-Mills (YM) fields.)
From the Bianchi identities in geometry (cf. Adler
et al. (1975)), the (covariant) divergence of the
Einstein tensor, $G_{ij}$, vanishes identically, namely

$$G_i{}^j{}_{;j} = 0$$

so, on solutions of Einstein's equations,

$$T_i{}^j{}_{;j} = 0$$


and this in turn expresses the conservation of energy 
and momentum. The quantities which comprise the 
Einstein tensor are given as follows: first, from
the metric tensor $g_{ij}$, we form the Levi-Civita
connection $\Gamma^k_{ij}$ defined by:

$$\Gamma^k_{ij} = \frac{1}{2} g^{k\ell} \left( \frac{\partial g_{\ell j}}{\partial x^i} + \frac{\partial g_{i\ell}}{\partial x^j} - \frac{\partial g_{ij}}{\partial x^\ell} \right)$$

where (4 x 4 matrix) $[g^{k\ell}] = [g_{k\ell}]^{-1}$, and summation
convention is employed; namely, an index which
appears as both a subscript and a superscript is to
be summed from 0 to 3. With the aid of $\Gamma^k_{ij}$, we can
construct the celebrated Riemann curvature tensor

$$R^i_{qk\ell} = \frac{\partial \Gamma^i_{q\ell}}{\partial x^k} - \frac{\partial \Gamma^i_{qk}}{\partial x^\ell} + \Gamma^i_{pk}\Gamma^p_{q\ell} - \Gamma^i_{p\ell}\Gamma^p_{qk}$$

Finally, the terms $R_{ij}$ and $R$ which appear in the
Einstein tensor $G_{ij}$ are given by

$$R_{ij} = R^s_{isj}$$

(the Ricci tensor), and

$$R = g^{ij} R_{ij}$$

is the scalar curvature.

From the above definitions, one sees at once the 
enormous complexity of the Einstein equations. For 
this reason, one usually seeks solutions which have a 
high degree of symmetry, and in what follows, in this 
section, we shall only consider static, spherically 
symmetric solutions; that is, solutions which depend 


only on $r = |x| = \sqrt{(x^1)^2 + (x^2)^2 + (x^3)^2}$. In this case,
the metric $g_{ij}$ takes the form

$$ds^2 = -T(r)^{-2}\, dt^2 + A(r)^{-1}\, dr^2 + r^2\, d\Omega^2$$   [2]

where $d\Omega^2 = d\theta^2 + \sin^2\theta\, d\varphi^2$ is the standard metric
on the unit 2-sphere, $r, \theta, \varphi$ are the usual spherical
coordinates, and $t$ denotes time.


Black Hole Solutions 


Consider the problem of finding the gravitational 
field outside a ball of mass $M$ in $R^3$; that is, there is
no matter exterior to the ball. Solving Einstein's
equations $G_{ij} = 0$ gives the famous Schwarzschild
solution (1916):

$$ds^2 = -\left(1 - \frac{2m}{r}\right) c^2\, dt^2 + \left(1 - \frac{2m}{r}\right)^{-1} dr^2 + r^2\, d\Omega^2$$   [3]

where $m = GM/c^2$. Since $2m$ has the dimensions of
length, it is called the Schwarzschild radius. Observe
that when $r = 2m$, the metric is singular; namely,
$g_{tt} = 0$ and $g_{rr} = \infty$. By transforming the metric [3] to
the so-called Kruskal coordinates (cf. Adler et al.
(1975)), one observes that the Schwarzschild sphere
$r = 2m$ has the physical characteristics of a black hole:
light and nearby particles can enter the region $r < 2m$,
nothing can exit this region, and there is an intrinsic
(nonremovable) singularity at the center $r = 0$.

For the general metric [2], we define a black hole
solution of Einstein's equations to be a solution
which satisfies, for some $\rho > 0$,

$$A(\rho) = 0, \quad A(r) > 0 \ \text{ if } r > \rho$$

$\rho$ is called the radius of the black hole, or the event
horizon.
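As a small numerical illustration of the definition above, the following sketch evaluates the Schwarzschild radius $2m = 2GM/c^2$ and the coefficient $A(r) = 1 - 2m/r$ obtained by comparing [3] with [2]. The constants are rounded standard values and the choice of one solar mass is purely illustrative; the function names are ours, not part of the article.

```python
# Physical constants in SI units (rounded standard values).
G = 6.67430e-11        # Newton's gravitational constant, m^3 kg^-1 s^-2
C = 2.99792458e8       # speed of light, m/s
M_SUN = 1.98847e30     # one solar mass in kg (illustrative choice of M)


def schwarzschild_radius(mass_kg):
    """Return 2m = 2GM/c^2, the Schwarzschild radius, in metres."""
    return 2.0 * G * mass_kg / C**2


def A_schwarzschild(r, mass_kg):
    """Metric coefficient A(r) = 1 - 2m/r of the Schwarzschild solution."""
    return 1.0 - schwarzschild_radius(mass_kg) / r


rho = schwarzschild_radius(M_SUN)
print(f"Schwarzschild radius of one solar mass: {rho:.1f} m")
print("A at the horizon:", A_schwarzschild(rho, M_SUN))
print("A at twice the horizon radius:", A_schwarzschild(2 * rho, M_SUN))
```

With these values $A$ vanishes at $r = \rho$ and is positive for $r > \rho$, which is exactly the black hole condition stated for the general metric [2].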


Yang-Mills Equations 


The YM equations generalize Maxwell's equations. 
To see how this comes about, we first write 
Maxwell's equations in an invariant way. Thus, let 
$A$ denote a scalar-valued 1-form:

$$A = A_i\, dx^i, \quad A_i \in R$$

which is called the electromagnetic potential (by
physicists), or a connection (by geometers). The
electromagnetic field (curvature) is the 2-form

$$F = dA$$

In local coordinates,

$$F = \frac{1}{2} F_{ij}\, dx^i \wedge dx^j, \quad F_{ij} = \frac{\partial A_j}{\partial x^i} - \frac{\partial A_i}{\partial x^j}$$

In this framework, Maxwell's equations are given by

$$d{*}F = 0, \quad dF = 0$$   [4]

where $*$ is the Hodge star operator, mapping 2-forms
to 2-forms (in $R^4$), and is defined by

$$({*}F)_{k\ell} = \frac{1}{2} \sqrt{|g|}\, \varepsilon_{ijk\ell} F^{ij}$$

where $g = \det(g_{ij})$ and $\varepsilon_{ijk\ell}$ is the completely anti-
symmetric symbol defined by $\varepsilon_{ijk\ell} = \mathrm{sgn}(ijk\ell)$. As
usual, indices are raised (or lowered) via the metric,
so that, for example,

$$F^{ij} = g^{i\ell} g^{jm} F_{\ell m}$$

It is important to notice that ${*}F$ depends on the
metric. Note also that Maxwell's equations are
linear equations for the $A_i$'s.

The YM equations generalize Maxwell's equations 
and can be described as follows. With each YM field 
(described below) is associated a compact Lie group 
G called the gauge group. For such G, we denote its 
Lie algebra by $\mathfrak{g}$, defined to be the tangent space at the
identity of $G$. Now let $A$ be a $\mathfrak{g}$-valued 1-form

$$A = A_i\, dx^i$$

where each $A_i$ is in $\mathfrak{g}$. In this case, the curvature 2-form
is defined by

$$F = dA + A \wedge A$$

or, in local coordinates,

$$F_{ij} = \frac{\partial A_j}{\partial x^i} - \frac{\partial A_i}{\partial x^j} + [A_i, A_j]$$

The commutator $[A_i, A_j] = 0$ if $G$ is an abelian
group, but is generally nonzero if $G$ is a nonabelian matrix
group. In this framework, the YM equations can be
written in the form $d{*}F = 0$, where now $d$ is an
appropriately defined covariant exterior derivative.
For Maxwell's equations, the gauge group $G = U(1)$
(the circle group $\{e^{i\theta} : \theta \in R\}$) so $\mathfrak{g}$ is abelian and we
recover Maxwell's equations from the YM equa-
tions. Observe that if $G$ is nonabelian, then the YM
equations $d{*}F = 0$ are nonlinear equations for the
connection coefficients $A_i$.
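The role of the commutator term can be made concrete with a minimal sketch. It compares two elements of the Lie algebra of U(1) (purely imaginary 1 x 1 matrices) with two elements of the Lie algebra of SU(2). The basis $\tau_k = -\frac{i}{2}\sigma_k$ built from the Pauli matrices is one common convention and is an assumption made here, as are all function names.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(ab, ba)]


def close(a, b, tol=1e-12):
    """Entrywise comparison of two matrices up to a tolerance."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Pauli matrices and the su(2) basis tau_k = -(i/2) sigma_k (assumed convention).
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
TAU = [[[-0.5j * x for x in row] for row in s] for s in SIGMA]

# Two elements of the Lie algebra of U(1): 1x1 matrices of the form i*theta.
u1_a, u1_b = [[0.3j]], [[-1.7j]]

print("u(1) commutator vanishes:", close(commutator(u1_a, u1_b), [[0]]))
print("[tau_1, tau_2] equals tau_3:", close(commutator(TAU[0], TAU[1]), TAU[2]))
```

In the abelian case the term $[A_i, A_j]$ drops out and $F$ is linear in $A$; in the SU(2) case it survives, which is the source of the nonlinearity of the YM equations noted above.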


The Dirac Equation in Curved Spacetime 


The Dirac equation is a generalization of Schrödinger's
equation, in a relativistic setting (Bjorken and
Drell 1964). It thus combines quantum mechanics
with the theory of relativity. In addition, the Dirac
equation also describes the intrinsic "spin" of fermions
and, for this reason, solutions of the Dirac equation are
often called spinors.
The Dirac equation can be written as

$$(G - m)\Psi = 0$$   [5]

where $G$ is the Dirac operator, $m$ is the (rest) mass of the
Dirac particle (fermion), and $\Psi$ is a complex-valued
4-vector called the wave function, or spinor. The
Dirac operator $G$ is of the form

$$G = i G^j(x) \frac{\partial}{\partial x^j} + B(x)$$   [6]

where the $G^j$ as well as $B$ are 4 x 4 matrices
and $i = \sqrt{-1}$. The Dirac
equation is thus a linear equation for the spinors.
The $G^j$ (called Dirac matrices) and the Lorentzian
metric $g_{ij}$ are related by

$$g^{jk} I = \frac{1}{2} \{G^j, G^k\}$$   [7]

where $\{G^j, G^k\}$ is the anticommutator

$$\{G^j, G^k\} = G^j G^k + G^k G^j$$

Thus, the Dirac matrices depend on the underlying
metric in four-dimensional spacetime.

Suppose that $\mathcal{H}$ is a spacelike hypersurface in $R^4$,
with future-directed normal vector $\nu = \nu(x)$, and let
$d\mu$ be the invariant measure on $\mathcal{H}$ induced by the
metric $g_{ij}$. We define a scalar product on solutions
$\Psi$, $\Phi$ of the Dirac equation by

$$\langle \Psi | \Phi \rangle = \int_{\mathcal{H}} \bar{\Psi} G^j \Phi\, \nu_j\, d\mu$$   [8]

This scalar product is positive definite, and because
of current conservation (cf. Finster (1988))

$$\nabla_j \bar{\Psi} G^j \Phi = 0$$

it is also independent of $\mathcal{H}$. By generalizing
the expression (due to Dirac), $\bar{\Psi}\gamma^0\Psi = |\Psi|^2$, in
Minkowski space, where $\gamma^0$ and $\bar{\Psi}$, the adjoint
spinor, are defined by

$$\gamma^0 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad \bar{\Psi} = \Psi^* \gamma^0$$

where $*$ denotes complex conjugation, and $1$ is the
2 x 2 identity matrix, the quantity $\bar{\Psi} G^j \Psi\, \nu_j$ is
interpreted as the probability density of the Dirac
particle. We normalize solutions of the Dirac
equation by requiring

$$\langle \Psi | \Psi \rangle = 1$$   [9]


Spherically Symmetric EDYM Equations 


In the remainder of this article we assume that all 
fields are spherically symmetric, so they depend 
only on the variable $r = |x|$. In this case, the
Lorentzian metric in polar coordinates $(t, r, \theta, \varphi)$
takes the form [2]. The Dirac wave function can be
described (Finster et al. 2000b) by two real
functions, $(\alpha(r), \beta(r))$, and the potential $w(r)$ corre-
sponds to the magnetic component of an SU(2) YM
field. As shown in Finster et al. (2000b), the EDYM
equations are

$$\sqrt{A}\, \alpha' = \frac{w}{r}\alpha - (m + \omega T)\beta$$   [10]

$$\sqrt{A}\, \beta' = (-m + \omega T)\alpha - \frac{w}{r}\beta$$   [11]

$$r A' = 1 - A - \frac{1}{e^2}\frac{(1 - w^2)^2}{r^2} - 2\omega T^2(\alpha^2 + \beta^2) - \frac{2}{e^2} A w'^2$$   [12]

$$2 r A \frac{T'}{T} = A - 1 + \frac{1}{e^2}\frac{(1 - w^2)^2}{r^2} + 2 m T(\alpha^2 - \beta^2) - 2\omega T^2(\alpha^2 + \beta^2) + 4\frac{T}{r} w \alpha\beta - \frac{2}{e^2} A w'^2$$   [13]

$$r^2 A w'' = -(1 - w^2) w + e^2 r T \alpha\beta - r^2 \frac{A'T - 2AT'}{2T} w'$$   [14]

Equations [10] and [11] are the Dirac equations,
[12] and [13] are the Einstein equations, and [14] is
the YM equation. The constants $m$, $\omega$, and $e$ denote,
respectively, the rest mass of the Dirac particle, its
energy, and the YM coupling constant.


Nonexistence of Black Hole Solutions 


Let the surface $r = \rho > 0$ represent a black hole event
horizon:

$$A(\rho) = 0, \quad A(r) > 0 \ \text{ if } r > \rho$$   [15]

In this case, the normalization condition [9] is
replaced by

$$\int_{r_0}^{\infty} (\alpha^2 + \beta^2) \frac{\sqrt{T}}{A}\, dr < \infty \quad \text{for every } r_0 > \rho$$   [16]

In addition, we assume that the following global
conditions hold:

$$\lim_{r \to \infty} r(1 - A(r)) = M < \infty$$   [17]

(finite mass),

$$\lim_{r \to \infty} T(r) = 1$$   [18]

(gravitational field is asymptotically flat Minkowskian), and

$$\lim_{r \to \infty} (w(r)^2, w'(r)) = (1, 0)$$   [19]

(the YM field is well behaved).
Concerning the event horizon $r = \rho$, we make the
following regularity assumptions:

1. The volume element $\sqrt{|\det g_{ij}|} = |\sin\theta|\, r^2 A^{-1/2} T^{-1}$
is smooth and nonzero on the horizon; that is,
$T^{-2}A^{-1},\ T^2 A \in C^1([\rho, \infty))$.

2. The strength of the YM field $F_{ij}$ is given by

$$\mathrm{tr}(F_{ij}F^{ij}) = \frac{2 A w'^2}{r^2} + \frac{(1 - w^2)^2}{r^4}$$

(cf. Bartnik and McKinnon 1988). We assume that
this scalar is bounded near the horizon; that is,
outside the event horizon and near $r = \rho$, assume
that

$$w \ \text{ and } \ A w'^2 \ \text{ are bounded}$$   [20]

3. The function $A(r)$ is monotone increasing outside
of and near the event horizon.

As discussed in Finster et al. (1999a), if assumption
1 or 2 were violated, then an observer freely falling
into the black hole would feel strong forces when
crossing the horizon. Assumption 3 is considerably
weaker than the corresponding assumption in
Finster et al. (1999b), where, indeed, it was assumed
that the function $A(r)$ obeyed a power law
$A(r) = c(r - \rho)^s + O((r - \rho)^{s+1})$, with positive con-
stants $c$ and $s$, for $r > \rho$.

The main result in this subsection is the following 
theorem: 


Theorem 1 Every black hole solution of the
EDYM equations [10]-[14] satisfying the regularity
conditions 1-3 cannot be normalized and coincides
with a Bartnik-McKinnon (BM) black hole of the
corresponding Einstein-Yang-Mills (EYM) equa-
tions; that is, the spinors $\alpha$ and $\beta$ must vanish
identically outside the event horizon.


Remark Smoller and Wasserman (1998) proved 
that any black hole solution of the EYM equations 
that has finite mass (i.e., that satisfies [17]) must be 
one of the BM black hole solutions (Bartnik and 
McKinnon 1988) whose existence was first demon- 
strated in Smoller et al. (1993). Thus, amending the 
EYM equations by taking quantum-mechanical 
effects into account — in the sense that both the 
gravitational and YM fields can interact with Dirac 
particles — does not yield any new types of black 
hole solutions. 


The present strategy in proving this theorem is to 
assume that we have a black hole solution of the 
EDYM equations [10]-[18] satisfying assumptions 
1-3, where the spinors do not vanish identically 
outside of the black hole. We shall show that this 
leads to a contradiction. The proof is broken up 
into two cases: either $A^{-1/2}$ is integrable or
nonintegrable near the event horizon. We shall
only discuss the proof for the case when $A^{-1/2}$ is
integrable near the event horizon, leaving the
alternate case for the reader to view in Finster
et al. (2000a).

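If $A^{-1/2}$ is integrable, then one shows that there
are positive constants $c$, $\varepsilon$ such that

$$c \le \alpha^2(r) + \beta^2(r) \le \frac{1}{c} \quad \text{if } \rho < r < \rho + \varepsilon$$   [21]

Indeed, multiplying [10] by $\alpha$, and [11] by $\beta$ and
adding gives an estimate of the form

$$\left| \sqrt{A}\, (\alpha^2 + \beta^2)' \right| \le k (\alpha^2 + \beta^2)$$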
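The step from the Dirac equations to this estimate can be written out as follows. This is a sketch that assumes the Dirac equations in the form $\sqrt{A}\,\alpha' = \frac{w}{r}\alpha - (m + \omega T)\beta$ and $\sqrt{A}\,\beta' = (-m + \omega T)\alpha - \frac{w}{r}\beta$; the explicit constant $k$ is our choice and is not taken from the article.

```latex
\begin{align*}
\alpha\sqrt{A}\,\alpha' + \beta\sqrt{A}\,\beta'
  &= \frac{w}{r}\,\alpha^2 - (m+\omega T)\,\alpha\beta
     + (-m+\omega T)\,\alpha\beta - \frac{w}{r}\,\beta^2 \\
\tfrac{1}{2}\sqrt{A}\,(\alpha^2+\beta^2)'
  &= \frac{w}{r}\,(\alpha^2-\beta^2) - 2m\,\alpha\beta \\
\left|\sqrt{A}\,(\alpha^2+\beta^2)'\right|
  &\le 2\left(\frac{|w|}{r} + m\right)(\alpha^2+\beta^2)
\end{align*}
```

The terms containing $\omega T$ cancel, so the unbounded function $T$ does not enter. The last line uses $|\alpha^2 - \beta^2| \le \alpha^2 + \beta^2$ and $2|\alpha\beta| \le \alpha^2 + \beta^2$. Since $w$ is bounded near the horizon by assumption 2 and $r \ge \rho > 0$, one may take $k = 2(\sup|w|/\rho + m)$.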


Dividing by $\sqrt{A}\,(\alpha^2 + \beta^2)$ and integrating from
$r > \rho$ to $\rho + \varepsilon$ gives

$$\left| \log(\alpha^2 + \beta^2)(\rho + \varepsilon) - \log(\alpha^2 + \beta^2)(r) \right| \le \text{const.}$$

from which the desired result follows. Next, from
[12] and [13],

$$r (A T^2)' = -4\omega T^4 (\alpha^2 + \beta^2) + T^3 \left[ 2m(\alpha^2 - \beta^2) + \frac{4w}{r}\alpha\beta \right] - \frac{4}{e^2} (A w'^2)\, T^2$$   [22]
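That [22] is an algebraic consequence of the two Einstein equations can be checked numerically. The sketch below assumes [12] in the form $rA' = 1 - A - \frac{1}{e^2}\frac{(1-w^2)^2}{r^2} - 2\omega T^2(\alpha^2+\beta^2) - \frac{2}{e^2}Aw'^2$ and [13] in the form $2rA\,T'/T = A - 1 + \frac{1}{e^2}\frac{(1-w^2)^2}{r^2} + 2mT(\alpha^2-\beta^2) - 2\omega T^2(\alpha^2+\beta^2) + 4\frac{T}{r}w\alpha\beta - \frac{2}{e^2}Aw'^2$, and uses the product rule $r(AT^2)' = T^2\,(rA') + T^2\,(2rA\,T'/T)$. Random positive values stand in for the fields at a single point; all function names are ours.

```python
import math
import random


def r_A_prime(r, A, T, w, dw, a, b, m, om, e2):
    """r*A' as given by the Einstein equation [12] (a = alpha, b = beta)."""
    return (1 - A - (1 - w**2) ** 2 / (e2 * r**2)
            - 2 * om * T**2 * (a**2 + b**2)
            - 2 * A * dw**2 / e2)


def two_r_A_T_prime_over_T(r, A, T, w, dw, a, b, m, om, e2):
    """2*r*A*T'/T as given by the Einstein equation [13]."""
    return (A - 1 + (1 - w**2) ** 2 / (e2 * r**2)
            + 2 * m * T * (a**2 - b**2)
            - 2 * om * T**2 * (a**2 + b**2)
            + 4 * (T / r) * w * a * b
            - 2 * A * dw**2 / e2)


def lhs_22(*v):
    """r*(A*T^2)' assembled from [12] and [13] via the product rule."""
    T = v[2]
    return T**2 * (r_A_prime(*v) + two_r_A_T_prime_over_T(*v))


def rhs_22(r, A, T, w, dw, a, b, m, om, e2):
    """Right-hand side of [22]."""
    return (-4 * om * T**4 * (a**2 + b**2)
            + T**3 * (2 * m * (a**2 - b**2) + 4 * w * a * b / r)
            - 4 * A * dw**2 * T**2 / e2)


def identity_22_holds(trials=1000, seed=0):
    """Compare both sides of [22] at randomly chosen field values."""
    rng = random.Random(seed)
    for _ in range(trials):
        v = [rng.uniform(0.5, 2.0) for _ in range(10)]
        if not math.isclose(lhs_22(*v), rhs_22(*v), rel_tol=1e-9, abs_tol=1e-9):
            return False
    return True


print("identity [22] holds at all sampled points:", identity_22_holds())
```

The terms $1 - A$ and $(1-w^2)^2/(e^2 r^2)$ cancel between the two equations, which is why only powers $T^4$, $T^3$ and $T^2$ survive on the right-hand side.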

Using assumption 2 together with the estimate [21],
we see that the coefficients of $T^4$, $T^3$, and $T^2$ on the
right-hand side of [22] are bounded near $\rho$, and from
assumption 1 the left-hand side of [22] is bounded
near $\rho$. Since assumption 1 implies $T(r) \to \infty$ as
$r \searrow \rho$, we see that $\omega = 0$. Since $\omega = 0$, the Dirac
equations simplify and we can show that $\alpha\beta$ is a
positive decreasing function which tends to 0 as
$r \to \infty$. Then the YM equation can be written in the
form

$$r^2 (A w')' = -w(1 - w^2) + e^2\, \frac{r (T\sqrt{A})\, \alpha\beta}{\sqrt{A}} + r^2\, \frac{(A T^2)'}{2 A T^2}\, (A w')$$

From assumption 2, $A w'^2$ is bounded so $A^2 w'^2 \to 0$ as
$r \searrow \rho$. Thus, from [22] we can write, for $r$ near $\rho$,


Te (Aw) [23] 


C2 
A(r) 


(Aw) (r) > cl 十 


where $c_1$ and $c_2$ are positive constants. Using this
inequality, we can show that for $r$ near $\rho$,

$$A(r) = (r - \rho) B(r)$$

where $0 < \lim_{r \searrow \rho} B(r) < \infty$. It follows that $A(\rho) = 0$
and $A'(\rho) > 0$. Thus, the Einstein metric has the
same qualitative features as the Schwarzschild 
metric near the event horizon. Hence, the metric 
singularity can be removed via a Kruskal transfor- 
mation (Adler et al. 1975). In these Kruskal 
coordinates, the YM potential is continuous and 
bounded (as is easily verified). As a consequence, the 
arguments in Finster et al. (2000c) go through and 
show that the spinors must vanish identically outside 
the horizon. For this, one must note that continuous 
zero-order terms in the Dirac operator are irrelevant 
for the derivation of the matching conditions in 
Finster et al. (2000c, section 2.4). Thus, the matching
conditions (equations (2.31), (2.34) of Finster et al. 
(2000c)) are valid without changes in the presence 
of our YM field. Using conservation of the (electro- 
magnetic) Dirac current and its positivity in timelike 
directions, the arguments in Finster et al. (2000c, 
section 4) all carry over. This completes the proof. 
We have thus proved that the only black hole 
solutions of our EDYM equations are the BM black 


holes; that is, the spinors must vanish identically. In 
other words, the EDYM equations do not admit 
normalizable black hole solutions. Thus, in the 
presence of quantum-mechanical Dirac particles, static 
and spherically symmetric black hole solutions do not 
exist. Another interpretation of this result is that
Dirac particles can only either disappear into the black 
hole or escape to infinity. These results were proved 
under very weak regularity assumptions on the form of 


the event horizon (see assumptions 1—3). 


Particle-Like Solutions 


By a particle-like (bound state) solution of the (SU(2))
EDYM equations, we mean a smooth solution of
eqns [10]-[14], which is defined for all $r > 0$, and
satisfies condition [9], which explicitly becomes

$$\int_0^{\infty} (\alpha^2 + \beta^2) \frac{\sqrt{T}}{A}\, dr = 1$$   [24]

In addition, we demand that [17]-[19] also hold. It
is easily shown that, near $r = 0$, we must have

$$w(r) = 1 - \frac{\lambda}{2} r^2 + O(r^3)$$   [25]

where $\lambda$ is a real parameter. From this, via a Taylor
expansion, one finds that

$$\alpha(r) = \alpha_1 r + O(r^3), \quad \beta(r) = O(r^2)$$   [26]

$$A(r) = 1 + O(r^2), \quad T(r) = T_0 + O(r^2)$$   [27]

with two parameters $\alpha_1$ and $T_0 > 0$. Using linearity of
the Dirac equation, we can always assume that $\alpha_1 > 0$.

Under all realistic conditions, the coupling of 
Dirac particles to the YM field (describing the weak 
or strong interactions) is much stronger than the 
coupling to the gravitational field. Thus, we are 
particularly interested in the case of weak gravita-
tional coupling. As shown in Finster et al. (2000b),
the gravitational field is essential for the formation 
of bound states. However, for arbitrarily weak 
gravitational coupling, we can hope to find bound 
states. It is even conceivable that these bound-state 
solutions might have a well-defined limit when the 
gravitational coupling tends to zero, if we let the 
YM coupling go to infinity at the same time. Our 
idea is that this limiting case might yield a system of 
equations which is simpler than the full EDYM 
system, and can thus serve as a physically interesting 
starting point for the analysis of the coupled
interactions described by the EDYM equations.
Expressed in dimensionless quantities, we shall thus
consider the limits

$$m^2 \kappa \to 0 \quad \text{and} \quad e^2 \to \infty$$   [28]


That is, we ask whether weak gravitational coupling 
can give rise to bound states. Using numerical methods, 
we find particle-like solutions which are stable, even 
for arbitrarily weak gravitational coupling. 

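Now assuming that [28] holds (weak gravitational
coupling), so that $(A, T) \approx (1, 1)$, then we find that
the Dirac equations have a meaningful limit only
under the assumptions that $\alpha$ converges and that

$$m\,\beta(r) \to \beta(r), \quad m\,(T(r) - 1) \to \varphi(r), \quad m\,(\omega - m) \to E$$   [29]

with two real functions $\beta$, $\varphi$ and a real parameter $E$
(the rescaled limit of $m\beta$ is again denoted by $\beta$).
Multiplying [11] with $m$ and taking the limits [28]
as well as $A, T \to 1$, the Dirac equations become

$$\alpha' = \frac{w}{r}\alpha - 2\beta$$   [30]

$$\beta' = (E + \varphi)\alpha - \frac{w}{r}\beta$$   [31]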


We next consider the YM equation [14]. The last
term in [14] drops out in the limit of weak
gravitational coupling [28]. The second summand
converges only under the assumption that

$$\frac{e^2}{m} \to q$$   [32]

with $q$ a real parameter, playing the role of an
"effective" coupling constant. Together with [28],
this implies that $m \to \infty$. The YM equations thus
have the limit

$$r^2 w'' = -(1 - w^2) w + q\, r\, \alpha\beta$$   [33]


In order to get a well-defined and nontrivial limit of
the Einstein equations [12] and [13], we need to
assume that the parameter $m^3\kappa$ has a finite, nonzero
limit. Since this parameter has the dimension of
inverse length, we can arrange by a scaling of our
coordinates that

$$m^3 \kappa \to 1$$   [34]

We differentiate the $T$-equation [13] with respect to $r$
and substitute [12]. Taking the limits [28] and [34], a
straightforward calculation yields the equation

$$r^2 \Delta\varphi = -\alpha^2$$   [35]

where $\Delta = r^{-2}\partial_r (r^2 \partial_r)$ is the radial Laplacian in
Euclidean $R^3$. Indeed, this equation can be
regarded as Newton's equation with the Newtonian
potential $\varphi$. Thus, the limiting case [34] for
the gravitational field corresponds to taking the
Newtonian limit. Finally, the normalization con-
dition [24] reduces to

$$\int_0^{\infty} \alpha(r)^2\, dr = 1$$   [36]


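The boundary conditions [17]-[19], [25]-[27] are
transformed into

$$w(r) = 1 - \frac{\lambda}{2} r^2 + O(r^3), \quad \lim_{r \to \infty} w(r) = \pm 1$$   [37]

$$\alpha(r) = \alpha_1 r + O(r^3), \quad \beta(r) = O(r^2)$$   [38]

$$\varphi(r) = \varphi_0 + O(r^2), \quad \lim_{r \to \infty} \varphi(r) < \infty$$   [39]

with the three parameters $\lambda$, $\alpha_1$, and $\varphi_0$. We point
out that the limiting system contains only one
coupling constant $q$. According to [32] and [34],
$q$ is in dimensionless form given by

$$e^2 m^2 \kappa \to q$$   [40]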


Hence, in dimensionless quantities, the limit [28]
describes the situation where the gravitational cou-
pling goes to zero, while the YM coupling constant
goes to infinity like $e^2 \sim (m^2\kappa)^{-1}$. Therefore, this
limiting case is called the reciprocal coupling limit
(RCL). The reciprocal coupling system is given by
eqns [30], [31], [33], and [35] together with the
normalization condition [36] and the boundary
conditions [37]-[39]. According to [29], the para-
meter $E$ coincides up to a scaling factor with $\omega - m$,
and thus has the interpretation as the (properly
scaled) energy of the Dirac particle. As in Newtonian
mechanics, the potential $\varphi$ is determined only up to a
constant $\mu \in R$; namely, the reciprocal limit equa-
tions are invariant under the transformation

$$\varphi \to \varphi + \mu, \quad E \to E - \mu$$   [41]

To simplify the connection between the EDYM
equations and the RCL equations, we introduce a
parameter $\varepsilon$ in such a way that as $\varepsilon \to 0$, EDYM $\to$
RCL; namely,


E€ = 


Notice that $\varepsilon$ describes the relative strength of gravity
versus the YM interaction. For realistic physical
situations, the gravitational coupling is weak;
namely, $m^2\kappa \ll 1$, but the YM coupling constant is
of order 1: $e^2 \sim 1$. So we investigate the parameter
range $\varepsilon \ll 1$, $q \sim 0$. These form the starting points for
the numerics below.

We seek stable bound states for weak gravita-
tional coupling. For this purpose, we consider the
total binding energy

$$B = M - m$$   [42]

where $M$ is the ADM mass defined by [17] and $m$ is the
rest mass of the Dirac particle. $B$ is thus the amount of
energy set free when the binding is broken. If $B < 0$,
then energy is needed to break up the binding.
According to Lee (1987), a solution is stable if $B < 0$.
In order to find solutions of the RCL equations with
$B < 0$, Lee's treatment and a new two-parameter
shooting method (Finster et al. 2000b) can be used.
Stable solutions of these RCL equations then follow
(see Finster et al. (2000b) for details).
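To indicate what one step of such a shooting procedure involves, here is a minimal sketch that integrates the RCL equations outward from a small radius. It is not the method of Finster et al. (2000b). It assumes the RCL system in the form $\alpha' = \frac{w}{r}\alpha - 2\beta$, $\beta' = (E + \varphi)\alpha - \frac{w}{r}\beta$, $r^2 w'' = -(1 - w^2)w + q r \alpha\beta$, $r^2\Delta\varphi = -\alpha^2$, and starts on the leading-order expansions $w \approx 1 - \frac{\lambda}{2}r^2$, $\alpha \approx \alpha_1 r$, $\varphi \approx \varphi_0$. The leading coefficients used for $\beta$ and $\varphi'$ are derived here from those equations, and the parameter values, step size and the classical Runge-Kutta scheme are illustrative assumptions.

```python
def rcl_rhs(r, y, E, q):
    """Right-hand side of the RCL system as a first-order ODE.

    State vector y = [alpha, beta, w, w', phi, phi'].
    """
    alpha, beta, w, dw, phi, dphi = y
    return [
        (w / r) * alpha - 2.0 * beta,                        # Dirac equation for alpha
        (E + phi) * alpha - (w / r) * beta,                  # Dirac equation for beta
        dw,
        (-(1.0 - w * w) * w + q * r * alpha * beta) / r**2,  # YM equation solved for w''
        dphi,
        -(alpha / r) ** 2 - 2.0 * dphi / r,                  # Newton equation solved for phi''
    ]


def rk4_step(r, y, h, E, q):
    """One classical fourth-order Runge-Kutta step of size h."""
    def shifted(k, s):
        return [yi + s * ki for yi, ki in zip(y, k)]

    k1 = rcl_rhs(r, y, E, q)
    k2 = rcl_rhs(r + h / 2, shifted(k1, h / 2), E, q)
    k3 = rcl_rhs(r + h / 2, shifted(k2, h / 2), E, q)
    k4 = rcl_rhs(r + h, shifted(k3, h), E, q)
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]


def integrate_rcl(lam, alpha1, phi0, E, q, r0=1e-3, r_end=0.1, h=1e-4):
    """Integrate from r0 to r_end, starting on the small-r expansions."""
    y = [
        alpha1 * r0,                          # alpha ~ alpha_1 r
        (E + phi0) * alpha1 * r0**2 / 3.0,    # beta  ~ (E + phi_0) alpha_1 r^2 / 3
        1.0 - 0.5 * lam * r0**2,              # w     ~ 1 - lambda r^2 / 2
        -lam * r0,                            # w'
        phi0 - alpha1**2 * r0**2 / 6.0,       # phi   ~ phi_0 - alpha_1^2 r^2 / 6
        -alpha1**2 * r0 / 3.0,                # phi'
    ]
    r = r0
    for _ in range(round((r_end - r0) / h)):
        y = rk4_step(r, y, h, E, q)
        r += h
    return r, y


r_final, state = integrate_rcl(lam=1.0, alpha1=1.0, phi0=0.0, E=-0.5, q=1.0)
print("r =", r_final)
print("alpha, beta, w, w', phi, phi' =", state)
```

A shooting method in the sense described above would wrap `integrate_rcl` in a search over the free parameters, integrating to large $r$ and adjusting them until the conditions at infinity and the normalization are met. That outer loop is omitted here.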

We now turn to the full EDYM equations. Here 
are the key steps of our method: 


1. Find solutions which are small perturbations of 
the limiting (RCL) solutions. 

2. Trace these solutions by gradually changing the 
coupling constants. 

3. This should yield a one-parameter family of 
solutions which are “far” from the known limit- 
ing solutions. 


The point is that we use the RCL solutions as a 
starting point for numerics, and we "continue" these
solutions to solutions of the full EDYM equations. 

To be somewhat more specific, we see that if we
fix $e$ and $q$, we have two parameters:

$$\alpha_1 = \alpha'(0) \quad \text{and} \quad E = \omega - m$$

and two conditions at $\infty$:

$$\alpha^2 + \beta^2 \to 0, \quad w^2 \to 1$$


We consider the EDYM equations with weaker 
side conditions 


We | (o2 +P) ax co 
Jo A 
0 < r= lim T(r) < oo 


lim z^ (r) = 1 
r—o00 


p = lim r(1 — A(r)) < oo 


Then we rescale these solutions to obtain the true 
side conditions via the transformations 


à(r) = YTA ^ a(A 7r) 
B(r) = VTA BAT) 
A(r) = A(A?r), T(r) 2 1! T(A?r) 


Discussion 


In this article we have considered the SU(2) EDYM 
equations. Our first result shows that the only black 
hole solutions of these equations are the BM black 
holes; that is, the spinors must vanish identically outside 
of the black hole. In other words, the EDYM equations 
do not admit normalizable black hole solutions. Thus, 
as mentioned earlier, this result indicates that the Dirac 
particle either enters the black hole or escapes to 
infinity. In two recent publications (Finster et al. 2002a,b),
we consider the Cauchy problem for a massive Dirac
equation in a charged, rotating-black-hole geometry 
(the non-extreme Kerr-Newman black hole), with 
compactly supported initial data outside the black 
hole. We prove that, in this case, the probability that the 
Dirac particle lies in any compact set tends to zero as 
t — oo. This means that the Dirac particle indeed either 
enters the black hole or tends to infinity. We also show 
that the wave function decays at a rate $t^{-5/6}$ on any
compact set outside of the event horizon. 

For particle-like solutions of the SU(2) EDYM 
equations, we find stable bound states for arbitrarily 
weak gravitational coupling. This shows that as weak 
as the gravitational interaction is, it has a regularizing 
effect on the equations. The stability of particle-like 
solutions of the EDYM equations is in sharp contrast 
to the EYM equations, where the particle-like solu- 
tions are all unstable (Straumann and Zhou 1990).


Introduction


The Dirac equation arose in the early days of 
quantum mechanics, inspired by the problem of 
taking special relativity into account in the quantum 
mechanical description of a freely moving electron. 
From the outset, however, Dirac looked for an 
equation that also accommodated the electron spin
and that could be modified to include interaction 
with an external electromagnetic field. The equation 
he discovered satisfies all of these requirements. On 
the other hand, when it is rewritten in Hamiltonian 
form, the spectrum of the resulting Dirac operator 
includes not only the desired interval $[mc^2, \infty)$
(where $m$ is the electron mass and $c$ the speed of
light), but also an interval $(-\infty, -mc^2]$.

Dirac himself already considered this negative 
part of the spectrum as unphysical, since no such 
negative energies had been observed and their 
presence would entail instability of the electron. 
This physical flaw of the “first-quantized” descrip- 
tion of a relativistic electron led to the introduction 
of “second quantization," as encoded in quantum 
field theory. In the field-theoretic version of the 
Dirac theory, the unphysical negative energies are 
obviated by a prescription that originated in Dirac's 
hole theory. 

Specifically, Dirac postulated that the negative- 
energy states of his equation were occupied by a sea 
of unobservable particles, the Pauli principle
forbidding an occupancy greater than one. In this
heuristic picture, the annihilation of a negative- 
energy electron yields a hole in the sea, observable 
as a new type of positive-energy particle with the 
same mass, but opposite charge. This led Dirac to 
predict that the electron should have an oppositely 
charged partner. 

His prediction was soon confirmed experimen- 
tally, the partner of the negatively charged electron 
showing up as the positively charged positron. More 
generally, all electrically charged particles (not only 
spin-1/2 particles described by the Dirac equation) 
have turned out to have oppositely charged anti- 
particles. Furthermore, some electrically neutral 
particles also have distinct antiparticles. 

Returning to the second-quantized Dirac theory, 
this involves a Dirac quantum field in which the 
creation/annihilation operators of negative-energy 
states are replaced by annihilation/creation opera- 
tors of positive-energy holes, resp. The hole theory 
substitution therefore leads to a Hilbert space (called 
Fock space) that accommodates an arbitrary number
of particles and antiparticles with the same mass and 
opposite charge. 

Soon after the introduction of the Dirac equation 
(which dates from 1928), it turned out that the 
number of particles and antiparticles is not con- 
served in a high-energy collision. Such creation and 
annihilation processes admit a natural description in 
the Fock spaces associated with relativistic quantum 
field theories. The very comprehensive mathematical 
description of real-world elementary particle phe- 
nomena that is now called the standard model arose 
some 30 years ago, and has been abundantly 
confirmed by experiment ever since. It involves 


various relativistic quantum fields with nonlinear 
interactions. The Dirac quantum field is an essential 
ingredient, inasmuch as it is used to describe all 
spin-1/2 particles and antiparticles in the model 
(including quarks, electrons, neutrinos etc.). 

After this survey (which is not only very brief, 
but also biased toward the physical concepts at 
issue), the contents of this article will be sketched. 
The free Dirac equation associated with the 
physical Minkowski spacetime $R^4$ is first detailed.
The exposition and notation are slightly unconven- 
tional in some respects. This is because we are partly 
preparing the ground for a mathematically precise 
account of the second-quantized version of the free 
Dirac theory. For example, momentum space (as 
opposed to position space) is emphasized, since the 
variable x in the Dirac equation does not have a 
clear physical significance and should be discarded 
in the Hilbert space formulation of the second- 
quantized Dirac field. The latter acts on a Fock 
space of multi-particle and -antiparticle wave 
functions depending on momentum and spin vari- 
ables, and the spacetime dependence of the Dirac 
field is solely a consequence of relativistic covar- 
iance. (In particular, the variable x in the Dirac 
field $\Psi(t, x)$ should not be viewed as the position of
particles and antiparticles created and annihilated 
by the field.) 

To be sure, there is much more to the Dirac 
theory than its free first- and second-quantized 
versions for Minkowski spacetime $R^4$. The primary
purpose here is, however, to present these founda- 
tional versions in some detail. A much more 
sketchy account of further developments can be 
found in subsequent sections. First, the one-particle 
theory is reconsidered. Generalizations of the free 
theory to arbitrary dimensions and Euclidean 
settings are sketched and interactions with external 
fields are described, touching on various aspects 
and applications. 

The next focus is on relations with index theory 
that arise when the massless Euclidean Dirac 
operator is generalized to geometric settings, namely 
$l$-dimensional Riemannian manifolds allowing a spin
structure. We illustrate the general Atiyah-Singer
index theory for the Dirac framework with some
simple examples for $l = 1$ (Toeplitz operators) and
$l = 2$ (the manifold $S^1 \times S^1$).

More information on the many-particle Dirac 
theory appears in the final section. Brief remarks 
on the Dirac field in interaction with other 
quantized fields are followed by an elaboration 
of the far simpler situation of the Dirac field 
interacting with external fields. Among the S- 
operators corresponding to such fields there is a
special class of unitary matrix multipliers; the
external field then vanishes for t <0 and equals 
the pure gauge field corresponding to the unitary 
matrix for t > 0. Specializing to an even spacetime 
dimension and choosing special “kink” type 
unitaries, the associated Fock-space quadratic 
forms can be made to converge to the free Dirac 
field. 

As mentioned already, Dirac's second quantization 
procedure was invented to get rid of the unphysical 
negative energies of the first-quantized (one-particle) 
theory. It is an amazing fact that the resulting 
formalism for the simplest case (namely the massless 
Dirac operator in a two-dimensional spacetime) can 
be exploited for quite different purposes. In particu- 
lar, this setting can be tied in with various soliton 
equations and the representation theory of certain 
infinite-dimensional groups and Lie algebras. In 
conclusion, some of these applications are briefly 
sketched, namely the construction of special solutions 
to the Kadomtsev-Petviashvili (KP) equation (including
the KP solitons and finite-gap solutions) and
special representations of Kac-Moody and Virasoro
algebras.


The Free One-Particle Dirac 
Equation in $R^4$


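The free time-dependent Dirac equation is a linear
hyperbolic evolution equation for a function $\Psi(t, x)$
on spacetime $R^4$ with values in $C^4$. It involves four
4 x 4 matrices $\gamma^\mu$, $\mu = 0, 1, 2, 3$, satisfying the
$\gamma$-algebra

$$\gamma^\mu \gamma^\nu + \gamma^\nu \gamma^\mu = 2 g^{\mu\nu} 1_4, \quad g = \mathrm{diag}(1, -1, -1, -1)$$   [1]

Using the Pauli matrices

$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$   [2]

one can choose for example

$$\gamma^0 = \begin{pmatrix} 0 & 1_2 \\ 1_2 & 0 \end{pmatrix}, \quad \gamma^k = \begin{pmatrix} 0 & -\sigma_k \\ \sigma_k & 0 \end{pmatrix}, \quad k = 1, 2, 3$$   [3]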
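The $\gamma$-algebra can be checked directly for an explicit choice of matrices. The sketch below assumes the chiral form $\gamma^0 = \begin{pmatrix} 0 & 1_2 \\ 1_2 & 0 \end{pmatrix}$, $\gamma^k = \begin{pmatrix} 0 & -\sigma_k \\ \sigma_k & 0 \end{pmatrix}$ with the standard Pauli matrices. The helper names are ours and plain nested lists are used so that no external library is needed.

```python
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def scale(s, m):
    """Multiply every entry of the matrix m by the scalar s."""
    return [[s * x for x in row] for row in m]


def add(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def matmul(a, b):
    """Matrix product of nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def close(a, b, tol=1e-12):
    """Entrywise comparison up to a tolerance."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


I4 = block(I2, Z2, Z2, I2)
GAMMAS = [block(Z2, I2, I2, Z2)] + [block(Z2, scale(-1, s), s, Z2) for s in SIGMA]
METRIC = [1, -1, -1, -1]   # diagonal of g = diag(1, -1, -1, -1)


def gamma_algebra_holds():
    """Check gamma^mu gamma^nu + gamma^nu gamma^mu = 2 g^{mu nu} 1_4."""
    for mu in range(4):
        for nu in range(4):
            anti = add(matmul(GAMMAS[mu], GAMMAS[nu]),
                       matmul(GAMMAS[nu], GAMMAS[mu]))
            target = scale(2 * METRIC[mu] if mu == nu else 0, I4)
            if not close(anti, target):
                return False
    return True


print("gamma-algebra satisfied:", gamma_algebra_holds())
```

Flipping the sign of all three $\gamma^k$ gives another valid choice, since the algebra only involves products of two matrices.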


Now the free Dirac equation reads

$$\left( i\hbar\, \gamma^0 \partial_t + i\hbar c\, \gamma \cdot \nabla - m c^2 1_4 \right) \Psi(t, x) = 0$$   [4]

where $\hbar$ is Planck's constant, $c$ the speed of light,
and $m$ the particle mass. Using from now on units so
that $\hbar = c = 1$, this can be abbreviated as

$$\left( i \sum_{\mu=0}^{3} \gamma^\mu \partial_\mu - m \right) \Psi(x) = 0, \quad x = (x^0, x^1, x^2, x^3), \quad \partial_\mu = \partial/\partial x^\mu, \ \mu = 0, 1, 2, 3$$   [5]

The relativistic invariance of this equation can be
understood as follows. First, since the equation
does not explicitly involve the spacetime coordi-
nates, it is invariant under spacetime translations.
(If $\Psi(t, x)$ solves [5], then also $\Psi(t - a_0, x - a)$ is a
solution for all $(a_0, a) \in R^4$.) Second, it is invariant
under Lorentz transformations (rotations and
boosts). Indeed, if $\Psi(x)$ is a solution and $L \in
SO(1,3)$, then $S(L)\Psi(L^{-1}x)$ solves [5] too, where
$S(L)$ denotes a (suitably normalized) matrix
satisfying

$$S(L)^{-1} \gamma^\mu S(L) = \sum_{\nu=0}^{3} L^\mu{}_\nu \gamma^\nu$$   [6]

(The matrices on the right-hand side of [6] satisfy
the $\gamma$-algebra [1]. From this, the existence of a
representation $S(L)$ of SO(1, 3) satisfying [6] is
readily deduced.)

As a consequence, the Poincaré group (inhomo-
geneous Lorentz group) acts in a natural way on the
space of solutions to the time-dependent Dirac
equation, expressing its independence of the choice
of inertial frame. For quantum mechanical purposes,
however, one needs to choose a frame and use the
associated time variable to rewrite the equation as a
Hilbert space evolution equation.

The relevant Hilbert space H is the space of four- 
component functions that are square integrable over 
space, 


H = L?(R?, dx) @ C [7] 


To obtain a self-adjoint Hamiltonian on 7, one multi- 
plies [5] by 4? and introduces the Hermitian matrices 


B=", ata k-1,2,3 [8] 


Then, one obtains the Schródinger type equation 


.d 


ty) Hv [9] 


where H is the Dirac operator, 


H = -ia - V + Bm [10] 


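Under Fourier transformation,

$$\mathcal{F}: \check{\mathcal{H}}\to L^2(\mathbb{R}^3, dp)\otimes\mathbb{C}^4,\qquad \psi(x)\mapsto\hat\psi(p) = (2\pi)^{-3/2}\int_{\mathbb{R}^3} dx\,\exp(-ix\cdot p)\,\psi(x)$$ [11]

eqn [9] turns into

$$i\frac{d\hat\psi}{dt} = D(p)\hat\psi,\qquad D(p) = \alpha\cdot p + \beta m$$ [12]

The matrix D(p) is Hermitian and has square $E_p^2 1_4$,
where $E_p$ is the relativistic energy,

$$E_p = (p\cdot p + m^2)^{1/2}$$ [13]

corresponding to a momentum p. Now, we have

$$U_C\,\overline{D(-p)} = -D(p)\,U_C$$ [14]

where $U_C$ is the charge conjugation matrix,

$$U_C = i\gamma^2$$ [15]

Hence, the four eigenvalues of D(p) are given by
$E_p$, $E_p$, $-E_p$, and $-E_p$. Therefore, the matrices

$$P_\pm(p) = \frac{1}{2}\Bigl(1_4 \pm \frac{D(p)}{E_p}\Bigr)$$ [16]

are projections on the positive and negative spectral
subspaces of D(p).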
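The algebraic statements about the matrix $D(p) = \alpha\cdot p + \beta m$ can be checked numerically. The sketch below assumes a Weyl-type representation of the gamma matrices and reads the relation [14] with a complex conjugation on D(-p); both the explicit matrices and the helper names (`weyl_gammas`, `dirac_symbol`, `spectral_projections`) are assumptions made for illustration only.

```python
import numpy as np

def weyl_gammas():
    """Gamma matrices gamma^0..gamma^3 in an assumed Weyl (chiral) representation."""
    sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
             np.array([[0, -1j], [1j, 0]], dtype=complex),
             np.array([[1, 0], [0, -1]], dtype=complex)]
    one, zero = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex)
    g0 = np.block([[zero, one], [one, zero]])
    return [g0] + [np.block([[zero, -s], [s, zero]]) for s in sigma]

def dirac_symbol(p, m):
    """D(p) = alpha . p + beta m, with alpha^k = gamma^0 gamma^k and beta = gamma^0."""
    g = weyl_gammas()
    return sum(pk * (g[0] @ g[k + 1]) for k, pk in enumerate(p)) + m * g[0]

def energy(p, m):
    """Relativistic energy E_p = (p.p + m^2)^(1/2)."""
    return float(np.sqrt(np.dot(p, p) + m ** 2))

def spectral_projections(p, m):
    """P_+(p) and P_-(p) as (1 +/- D(p)/E_p)/2."""
    D, E = dirac_symbol(p, m), energy(p, m)
    return 0.5 * (np.eye(4) + D / E), 0.5 * (np.eye(4) - D / E)

# Sample momentum and mass (arbitrary illustrative values).
p, m = np.array([0.3, -1.2, 0.5]), 0.8
D, E = dirac_symbol(p, m), energy(p, m)
U_C = 1j * weyl_gammas()[2]            # charge conjugation matrix i gamma^2
P_plus, P_minus = spectral_projections(p, m)

# The spectrum of D(p)/E_p consists of -1 and +1, each with multiplicity two.
print(np.linalg.eigvalsh(D) / E)
```

In this representation the intertwining relation only holds when D(-p) is complex conjugated, which is why the conjugation is made explicit in the check.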

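As orthonormal base for the positive-energy subspace,
we can now choose

$$w_{+,j}(p) = \Bigl(\frac{2E_p}{E_p + m}\Bigr)^{1/2} P_+(p)\,b_j,\qquad j = 1, 2$$ [17]

where

$$b_1 = \frac{1}{\sqrt{2}}\begin{pmatrix}1\\0\\1\\0\end{pmatrix},\qquad b_2 = \frac{1}{\sqrt{2}}\begin{pmatrix}0\\1\\0\\1\end{pmatrix}$$ [18]

Next, setting

$$w_{-,j}(p) = U_C\,\overline{w_{+,j}(p)},\qquad j = 1, 2$$ [19]

an orthonormal base $w_{-,1}(-p)$, $w_{-,2}(-p)$ for the
negative-energy subspace of D(p) is obtained; cf. [14].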
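The following sketch builds the four spinors numerically and checks that they form an orthonormal eigenbasis of D(p). It reads [17]-[19] as $w_{+,j}(p) = (2E_p/(E_p+m))^{1/2}P_+(p)b_j$ with $b_1 \propto (1,0,1,0)^t$, $b_2 \propto (0,1,0,1)^t$, and $w_{-,j}(p) = U_C\overline{w_{+,j}(p)}$; this reading, the Weyl-type gamma matrices and all function names are assumptions for illustration.

```python
import numpy as np

def weyl_gammas():
    """Gamma matrices gamma^0..gamma^3 in an assumed Weyl (chiral) representation."""
    sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
             np.array([[0, -1j], [1j, 0]], dtype=complex),
             np.array([[1, 0], [0, -1]], dtype=complex)]
    one, zero = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex)
    g0 = np.block([[zero, one], [one, zero]])
    return [g0] + [np.block([[zero, -s], [s, zero]]) for s in sigma]

def dirac_symbol(p, m):
    """D(p) = alpha . p + beta m."""
    g = weyl_gammas()
    return sum(pk * (g[0] @ g[k + 1]) for k, pk in enumerate(p)) + m * g[0]

def w_plus(p, m):
    """4x2 matrix whose columns are w_{+,1}(p) and w_{+,2}(p); requires m > 0."""
    E = np.sqrt(np.dot(p, p) + m ** 2)
    P_plus = 0.5 * (np.eye(4) + dirac_symbol(p, m) / E)
    # Columns are b_1 = (1,0,1,0)/sqrt(2) and b_2 = (0,1,0,1)/sqrt(2).
    b = np.array([[1, 0], [0, 1], [1, 0], [0, 1]], dtype=complex) / np.sqrt(2)
    return np.sqrt(2 * E / (E + m)) * (P_plus @ b)

def w_minus(p, m):
    """4x2 matrix whose columns are w_{-,j}(p) = U_C conj(w_{+,j}(p))."""
    U_C = 1j * weyl_gammas()[2]
    return U_C @ np.conj(w_plus(p, m))

def eigenbasis(p, m):
    """Columns: w_{+,1}(p), w_{+,2}(p), w_{-,1}(-p), w_{-,2}(-p)."""
    p = np.asarray(p, dtype=float)
    return np.hstack([w_plus(p, m), w_minus(-p, m)])
```

Note that the negative-energy vectors enter with argument -p, in line with the statement that $w_{-,j}(-p)$ spans the negative-energy subspace of D(p).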

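The upshot is that the time-independent Dirac
equation

$$\check{H}\psi = E\psi$$ [20]

gives rise to bounded eigenfunctions

$$e_{+,j}(x, p) = (2\pi)^{-3/2}\exp(ix\cdot p)\,w_{+,j}(p),\qquad e_{-,j}(x, p) = (2\pi)^{-3/2}\exp(-ix\cdot p)\,w_{-,j}(p),\qquad j = 1, 2$$ [21]

with eigenvalues $E = E_p$ and $E = -E_p$, resp. Clearly,
they are not square-integrable, but they can be used
as the kernel of a unitary transformation between
$\check{\mathcal{H}}$ [7] and the Hilbert space

$$\mathcal{H} = \mathcal{H}_+\oplus\mathcal{H}_- = P_+\mathcal{H}\oplus P_-\mathcal{H},\qquad \mathcal{H}_+, \mathcal{H}_- = L^2(\mathbb{R}^3, dp)\otimes\mathbb{C}^2$$ [22]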


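Specifically, we have

$$W: \mathcal{H}\to\check{\mathcal{H}},\qquad f(p) = (f_+(p), f_-(p))\mapsto\psi(x) = \sum_{\delta=+,-}\ \sum_{j=1,2}\int_{\mathbb{R}^3} dp\, e_{\delta,j}(x, p)\,f_{\delta,j}(p)$$ [23]

which entails

$$(W^{-1}\psi)_{\delta,j}(p) = \int_{\mathbb{R}^3} dx\,\overline{e_{\delta,j}(x, p)}\cdot\psi(x)$$ [24]

(Here and throughout this article, a bar denotes
complex conjugation.)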

From the above, it is clear that the Dirac
Hamiltonian $\check{H}$ acting on the Hilbert space $\check{\mathcal{H}}$ is
unitarily equivalent to the multiplication operator H on
$\mathcal{H}$ [22] given by

$$(Hf)_\delta(p) = \delta E_p f_\delta(p),\qquad \delta = +,-$$ [25]

Indeed, W is a diagonalizing transformation for $\check{H}$,
the relation

$$H = W^{-1}\check{H}W$$ [26]

yielding an explicit realization of the spectral
theorem.

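Using the same notational convention, the
momentum, charge conjugation, parity, and time-reversal
operators on $\check{\mathcal{H}}$, given by

$$(\check{P}_k\psi)(x) = -i\partial_k\psi(x)$$ [27]
$$(\check{C}\psi)(x) = U_C\overline{\psi(x)}$$ [28]
$$(\check{P}\psi)(x) = U_P\psi(-x),\qquad U_P = \gamma^0$$ [29]
$$(\check{T}\psi)(x) = U_T\overline{\psi(x)},\qquad U_T = \gamma^1\gamma^3$$ [30]

transform into the operators

$$(P_kf)_\delta(p) = \delta p_k f_\delta(p),\qquad \delta = +,-$$ [31]
$$(Cf)_\delta(p) = \overline{f_{-\delta}(p)},\qquad \delta = +,-$$ [32]
$$(Pf)_\delta(p) = \delta f_\delta(-p),\qquad \delta = +,-$$ [33]
$$(Tf)_\delta(p) = i\sigma_2\overline{f_\delta(-p)},\qquad \delta = +,-$$ [34]

Note that $P_k$, P, and T leave the positive- and
negative-energy subspaces $\mathcal{H}_+$ and $\mathcal{H}_-$ invariant,
whereas C interchanges them.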

To conclude this section, we describe some salient
features of the unitary representation of the (identity
component of the) Poincaré group on $\mathcal{H}$, which
follows from the representation on solutions to [5]
already sketched. The spacetime translations over
$a\in\mathbb{R}^4$ are represented by the unitary operator
$\exp(-ia_0H + ia\cdot P)$; explicitly,

$$(\exp(-ia_0H + ia\cdot P)f)_\delta(p) = \exp(-i\delta(a_0E_p - a\cdot p))\,f_\delta(p),\qquad \delta = +,-$$ [35]

The representation of the Lorentz group involves
unitary 2 × 2 matrices U(k, A), where k is an
arbitrary 4-vector satisfying $k^\mu k_\mu = 1$ and A the
matrix in SL(2, C) representing $L\in SO(1, 3)$. (Recall
that SL(2, C) can be viewed as a 2-fold cover of
SO(1, 3).) In particular, U(k, A) does not depend on
k for rotations,

$$U(k, A) = A^*,\qquad \forall A\in SU(2)$$ [36]

(Here and henceforth, we use * to denote the
Hermitian adjoint of matrices and operators.) For
boosts, however, there is dependence on the vector
k, which is the image of the vector (1, 0) under the
boost. We refrain from a more detailed description
of U(k, A), as this would carry us too far afield.
The unitary SO(1, 3) representation leaves the
decomposition $\mathcal{H} = P_+\mathcal{H}\oplus P_-\mathcal{H}$ invariant. On the
positive-energy subspace $\mathcal{H}_+$, it is given by

$$(U(L)f)_+(p) = \Bigl(\frac{E_{p'}}{E_p}\Bigr)^{1/2} U(p/m, A)\,f_+(p')$$ [37]

where

$$p = (E_p, p),\qquad p' = L^{-1}p$$ [38]

On $\mathcal{H}_-$, it is given by the complex-conjugate
representation,

$$(U(L)f)_-(p) = \Bigl(\frac{E_{p'}}{E_p}\Bigr)^{1/2} U(p/m, A)^{*t}\,f_-(p')$$ [39]

just as for the spacetime translations, cf. [35]. (The
superscript t is used to denote the transpose matrix.)
This feature is crucial for the second-quantized
Dirac theory, which is discussed next.


The Free Dirac Field in $\mathbb{R}^4$


The free Dirac field is an operator-valued distribution
on a Fock space that describes an arbitrary number of
spin-1/2 particles and antiparticles in terms of momentum
space wave functions. Since spin-1/2 particles are
fermions (which encodes the Pauli exclusion principle),
an M-particle wave function $F_{j_1,\ldots,j_M}(p_1,\ldots,p_M)$ (where
$j_i\in\{1, 2\}$ is the spin index) is antisymmetric under any
interchange of a pair $(j_i, p_i)$ and $(j_k, p_k)$. Likewise,
N-antiparticle wave functions $F_{j_1,\ldots,j_N}(q_1,\ldots,q_N)$
are antisymmetric. But a wave function $F_{j,k}(p, q)$
describing a particle-antiparticle pair need not have
any symmetry property, since a particle and an
antiparticle can be distinguished by their charge.

The relevant Fock space is therefore the tensor
product of two antisymmetric Fock spaces built over
the one-particle and one-antiparticle spaces
$L^2(\mathbb{R}^3, dp)\otimes\mathbb{C}^2$. For later purposes, it is important
to view these spaces as the summands $\mathcal{H}_+$ and $\mathcal{H}_-$ of
the space $\mathcal{H}$ from the previous section. Thus, the
arena for the free Dirac field is the Hilbert space

$$\mathcal{F}_a(\mathcal{H})\simeq\mathcal{F}_a(\mathcal{H}_+)\otimes\mathcal{F}_a(\mathcal{H}_-)$$ [40]

where, for example,

$$\mathcal{F}_a(\mathcal{H}) = \overline{(\mathbb{C}\oplus\mathcal{H}\oplus(\mathcal{H}\otimes\mathcal{H})_a\oplus\cdots)}$$ [41]

where the bar denotes the completion of the infinite
direct sum in the obvious inner product. The tensor
(1, 0, 0, ...) is viewed as the vacuum (the "filled
Dirac sea") and denoted by $\Omega$.

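To get around in Fock space, one employs the
creation and annihilation operators $c^{(*)}(f)$, $f\in\mathcal{H}$.
The creation operator $c^*(f)$, $f\in\mathcal{H}$, is defined by
linear and continuous extension of its action on the
vacuum $\Omega$ and on elementary antisymmetric tensors,
recursively given by

$$c^*(f)\Omega = f,\qquad c^*(f)f_1 = f\wedge f_1,\ \ldots,\qquad c^*(f)\,f_1\wedge\cdots\wedge f_N = f\wedge f_1\wedge\cdots\wedge f_N,\ \ldots$$ [42]

Its adjoint, the annihilation operator c(f), satisfies

$$c(f)\Omega = 0,\qquad c(f)f_1 = (f, f_1)\Omega,\qquad c(f)\,f_1\wedge\cdots\wedge f_N = \sum_{j=1}^{N}(-)^{j+1}(f, f_j)\,f_1\wedge\cdots\wedge f_{j-1}\wedge f_{j+1}\wedge\cdots\wedge f_N,\ \ldots$$ [43]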


Accordingly, the operators $c^{(*)}(f)$ satisfy the canonical
anticommutation relations (CARs) over $\mathcal{H}$,

$$\{c(f), c(g)\} = 0,\qquad \{c(f), c^*(g)\} = (f, g),\qquad \forall f, g\in\mathcal{H}$$ [44]

where {A, B} denotes the anticommutator AB + BA.
(From this, one readily deduces that $c^{(*)}(f)$ is
bounded with norm ||f||.)

Next, recalling the direct sum decomposition [22],
a notation change

$$c^{(*)}(P_+f)\to a^{(*)}(P_+f),\qquad c^{(*)}(P_-f)\to b^{(*)}(P_-f)$$ [45]

is made, thus indicating that $a^{(*)}$ and $b^{(*)}$ should be
viewed as the creation/annihilation operators of
particles and antiparticles, resp. Since $\mathcal{H}_+$ and $\mathcal{H}_-$
are copies of $L^2(\mathbb{R}^3, dp)\otimes\mathbb{C}^2$, a given function
$(f_1(p), f_2(p))$ in the latter space can occur both as an
argument of $a^{(*)}(\cdot)$ and of $b^{(*)}(\cdot)$; it can also be
viewed as a smearing function for unsmeared
quantities $a_j^{(*)}(p)$ and $b_j^{(*)}(p)$, j = 1, 2, that are often
referred to as operators as well (even though they
are only quadratic forms). Thus, one has, for
example,

$$a^*(f) = \sum_{j=1,2}\int_{\mathbb{R}^3} dp\, a_j^*(p)f_j(p),\qquad b(f) = \sum_{j=1,2}\int_{\mathbb{R}^3} dp\, b_j(p)\overline{f_j(p)}$$ [46]

As explained shortly, the smeared time-zero Dirac
field takes the form

$$\Phi(f) = a(P_+f) + b^*(KP_-f),\qquad f\in\mathcal{H}$$ [47]

Here and below, K denotes complex conjugation on
$\mathcal{H}$, $\mathcal{H}_+$, and $\mathcal{H}_-$. Just as the operators $c^{(*)}(f)$, the
operators $\Phi^{(*)}(f)$ satisfy the CARs over $\mathcal{H}$,

$$\{\Phi(f), \Phi(g)\} = 0,\qquad \{\Phi(f), \Phi^*(g)\} = (f, g),\qquad \forall f, g\in\mathcal{H}$$ [48]

as is readily verified using [44]-[45]. But this
$\Phi$-representation is not unitarily equivalent to the
c-representation [44]. This becomes clear in particular
from the consideration of a crucial type of
CAR automorphism that is considered next.

To this end, we fix a unitary operator U on $\mathcal{H}$.
Then it is plain that the operators

$$\tilde{c}^{(*)}(f) = c^{(*)}(Uf)$$ [49]
$$\tilde{\Phi}^{(*)}(f) = \Phi^{(*)}(Uf)$$ [50]

also satisfy the CARs. The CAR-algebra automorphism
$c^{(*)}(f)\to\tilde{c}^{(*)}(f)$ can be unitarily implemented in
$\mathcal{F}_a(\mathcal{H})$, since one has

$$\tilde{c}^{(*)}(f) = \Gamma(U)\,c^{(*)}(f)\,\Gamma(U^*)$$ [51]

where $\Gamma(U)$ denotes the Fock-space product operator
corresponding to U. Thus, for example,

$$\Gamma(U)\Omega = \Omega,\qquad \Gamma(U)f_1 = Uf_1,\ \ldots,\qquad \Gamma(U)\,f_1\wedge\cdots\wedge f_N = Uf_1\wedge\cdots\wedge Uf_N,\ \ldots$$ [52]

For the CAR automorphism $\Phi^{(*)}(f)\to\tilde{\Phi}^{(*)}(f)$ this is
not true, however. Rewriting it in terms of the
annihilation and creation operators $a^{(*)}$ and $b^{(*)}$ via
[47], it amounts to a linear transformation (Bogoliubov
transformation), whose unitary implementability was
clarified several decades ago. To be specific, the
necessary and sufficient condition for unitary
implementability is that the off-diagonal parts

$$U_{+-} = P_+UP_-,\qquad U_{-+} = P_-UP_+$$ [53]

in the 2 × 2 matrix decomposition of operators on
$\mathcal{H}$ be Hilbert-Schmidt operators. Therefore, no
problem arises when U is diagonal with respect to
this decomposition. Indeed, in that case one can
choose as unitary implementer the product operator

$$\tilde{\Gamma}(U) = \Gamma(U_{++})\otimes\Gamma(KU_{--}K)$$ [54]

(cf. the tensor product structure [40] of $\mathcal{F}_a(\mathcal{H})$).
In particular, the automorphism

$$\Phi^{(*)}(f)\to\Phi^{(*)}(e^{itH}f)$$ [55]

where H is the free diagonalized Dirac Hamiltonian
[25], is implemented by the operator

$$\tilde{\Gamma}(e^{itH}) = \Gamma(e^{itE})\otimes\Gamma(e^{itE})$$ [56]

where E denotes multiplication by $E_p$ on $\mathcal{H}_+$ and $\mathcal{H}_-$.
The change of CAR representation, therefore, entails
that the unphysical negative energies of the one-particle
theory are replaced by positive energies of
antiparticles. Hence, we obtain a mathematically
precise version of Dirac's hole theory substitution
$b_j(p)\to b_j^*(p)$, $b_j^*(p)\to b_j(p)$.

More generally, if one chooses for U the Poincaré
group representation (given by [35] and [37]-[39]),
then the Fock-space implementer [54] is the tensor
product of two product operators with the same
action on $\mathcal{F}_a(L^2(\mathbb{R}^3, dp)\otimes\mathbb{C}^2)$. Observe that this is
also true for the Fock-space version $\mathcal{T} = \tilde{\Gamma}(T)$ of
the time-reversal operator [34]. By contrast, the
Fock-space parity operator $\mathcal{P} = \tilde{\Gamma}(P)$ gives rise to
two product operators with slightly different
actions, cf. [33]. Accordingly, particles and antiparticles
have opposite parity.


The map

$$\Phi(f)\to\Phi^*(Cf)$$ [57]


also yields a CAR automorphism. It is unitarily 
implemented by the Fock-space charge-conjugation 


operator 
ZUM 


which interchanges particles and antiparticles.
Notice that $\mathcal{C}$ is unitary, whereas C is antiunitary.

It remains to establish the precise relation of the
above to the customary free Dirac field $\Psi(t, x)$. This
is a quadratic form on $\mathcal{F}_a(\mathcal{H})$ given by

$$\Psi(t, x) = (2\pi)^{-3/2}\int_{\mathbb{R}^3} dp\sum_{j=1,2}\Bigl(a_j(p)\,w_{+,j}(p)\,e^{-itE_p + ix\cdot p} + b_j^*(p)\,w_{-,j}(p)\,e^{itE_p - ix\cdot p}\Bigr)$$ [59]

(Its expectation $(F_1, \Psi(t, x)F_2)$ is, for example, well
defined for $F_1$, $F_2$ in the dense subspace of $\mathcal{F}_a(\mathcal{H})$
that consists of vectors with finitely many particles
and antiparticles and wave functions in Schwartz
space.) It satisfies the time-dependent Dirac equation

$$i\partial_t\Psi = (-i\alpha\cdot\nabla + \beta m)\Psi$$ [60]

in the sense of quadratic forms. Furthermore,
smearing it with a function $\psi(x)$ in the Hilbert
space $\check{\mathcal{H}}$ [7], we obtain

$$\int_{\mathbb{R}^3} dx\,\overline{\psi(x)}\cdot\Psi(t, x) = \Phi(e^{itH}W^{-1}\psi) = \tilde{\Gamma}(e^{itH})\,\Phi(W^{-1}\psi)\,\tilde{\Gamma}(e^{-itH}),\qquad \psi\in\check{\mathcal{H}}$$ [61]

As announced, the time evolution of the free Dirac
field is, therefore, given by the unitary one-parameter
group [56], whose generator (the second-quantized
Dirac Hamiltonian) has spectrum $\{0\}\cup[m, \infty)$.
The Dirac field $\Psi(t, x)$ can also be smeared with a
test function F(t, x) in the Schwartz space $S(\mathbb{R}^4)^4$,
yielding a bounded operator

$$\Psi(F) = \int_{\mathbb{R}^4} dx\,\overline{F(x)}\cdot\Psi(x)$$ [62]

Then one obtains the relativistic covariance relation

$$\tilde{\Gamma}(U(a, L))\,\Psi(F)\,\tilde{\Gamma}(U(a, L))^* = \Psi(F_{a,L})$$ [63]

where

$$F_{a,L}(x) = S(L^{-1})^*F(L^{-1}(x - a))$$ [64]

and U(a, L) denotes the Poincaré group representation
on $\mathcal{H}$, cf. [35] and [37]-[39]. Likewise, one gets
the inversion formulas


$$\tilde{\Gamma}(I)\,\Psi(F)\,\tilde{\Gamma}(I)^* = \Psi(F_I),\qquad I = P, T$$ [65]

with

$$F_P(t, x) = U_PF(t, -x),\qquad F_T(t, x) = U_T\overline{F(-t, x)}$$ [66]


while the Fock-space charge-conjugation operator
[58] transforms the Dirac field as

$$\mathcal{C}\,\Psi(F)\,\mathcal{C}^* = \Psi^*(F_C)$$ [67]

with

$$F_C(x) = U_C\overline{F(x)}$$ [68]
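Finally, let us consider the global U(1) gauge
transformations $f\to e^{i\phi}f$, where $\phi\in\mathbb{R}$ and $f\in\mathcal{H}$.
They can be implemented by

$$\tilde{\Gamma}(e^{i\phi}) = \Gamma(e^{i\phi})\otimes\Gamma(e^{-i\phi})$$ [69]

and one has

$$\tilde{\Gamma}(e^{i\phi})\,\Psi(F)\,\tilde{\Gamma}(e^{i\phi})^* = \Psi(F_\phi)$$ [70]

with

$$F_\phi(x) = e^{i\phi}F(x)$$ [71]

The generator Q of the one-parameter group
$\phi\mapsto\tilde{\Gamma}(e^{i\phi})$ is the charge operator: on wave functions
describing $N_+$ particles and $N_-$ antiparticles, it has
eigenvalue $N_+ - N_-$.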


More on the One-Particle Dirac Theory 


Even for the free one-particle setting, the account 
given earlier is far from complete. To begin with, 
the free Dirac equation admits a specialization to 
massless particles. In the Weyl representation of
the $\gamma$-algebra adopted above, the choice m = 0
entails that the p-space equation [12] decouples
into two 2 × 2 equations for spinors that can be
labeled by their chirality ("handedness"). This
refers to their eigenvalue with respect to the
chirality matrix

$$\gamma^5 = i\gamma^0\gamma^1\gamma^2\gamma^3 = \begin{pmatrix}1_2 & 0\\ 0 & -1_2\end{pmatrix}$$ [72]

and this notion derives from the noninvariance of
the separate 2 × 2 equations under parity. (A
positive-chirality spinor is mapped to a negative-chirality
spinor under the parity operator P [33] and
vice versa.) Since the weak interaction breaks parity
symmetry, the two 2 × 2 equations (often called
Weyl equations) do have physical relevance. Indeed,
the associated quantum fields are a crucial ingredient
of the standard model.
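The decoupling at m = 0 can be made concrete in a few lines. The sketch below assumes a Weyl-type representation in which $\gamma^5$ comes out as diag(1, 1, -1, -1); the explicit matrices and the names `massless_symbol` and `chirality_blocks` are illustrative assumptions.

```python
import numpy as np

def weyl_gammas():
    """Gamma matrices gamma^0..gamma^3 in an assumed Weyl (chiral) representation."""
    sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
             np.array([[0, -1j], [1j, 0]], dtype=complex),
             np.array([[1, 0], [0, -1]], dtype=complex)]
    one, zero = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex)
    g0 = np.block([[zero, one], [one, zero]])
    return [g0] + [np.block([[zero, -s], [s, zero]]) for s in sigma]

g = weyl_gammas()
gamma5 = 1j * g[0] @ g[1] @ g[2] @ g[3]   # chirality matrix

def massless_symbol(p):
    """D(p) = alpha . p at m = 0."""
    return sum(pk * (g[0] @ g[k + 1]) for k, pk in enumerate(p))

def chirality_blocks(p):
    """The two 2x2 blocks of the massless D(p): positive and negative chirality."""
    D = massless_symbol(p)
    return D[:2, :2], D[2:, 2:]
```

In this representation the two blocks are $\sigma\cdot p$ and $-\sigma\cdot p$, and conjugation with the parity matrix $\gamma^0$ flips the sign of $\gamma^5$, which is the matrix form of the statement that parity exchanges the two chiralities.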

Next, we point out that it is possible to switch to 
a representation in which the gamma matrices are 
real. This so-called Majorana representation is 
convenient (but not indispensable) in the description 
of neutral spin-1/2 particles. By definition, such 
particles are equal to their antiparticles, so that the 
second-quantized formalism of the previous section 
must be adapted: one needs the neutral CAR algebra 
over H (also known as self-dual CAR). 

For various purposes, it is important to formulate
the free Dirac equation for a spacetime whose
spatial dimension is arbitrary. Then one needs, first
of all, gamma matrices satisfying the (Minkowski)
Clifford algebra relations

$$\gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu = 2g^{\mu\nu}1_\Delta,\qquad g = \mathrm{diag}(1, -1_n)$$ [73]

where n is the space dimension and the minimal size
$\Delta\times\Delta$ of the gamma matrices is to be determined.

Clearly, for n = 1 and n = 2, one can take $\Delta = 2$,
choosing, for example,

$$\gamma^0 = \begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix},\qquad \gamma^1 = \begin{pmatrix}0 & 1\\ -1 & 0\end{pmatrix},\qquad \gamma^2 = \begin{pmatrix}i & 0\\ 0 & -i\end{pmatrix}$$ [74]

to fulfill [73]. For n = 4, one can take $\Delta = 4$, just
as for n = 3, supplementing [1] with the matrix $i\gamma^5$,
cf. [72].

More generally, for n = 2N - 1 and n = 2N, one
can take $\Delta = 2^N$ in [73]. Indeed, a representation on
the $2^N$-dimensional fermion Fock space $\mathcal{F}_a(\mathbb{C}^N)$
(cf. [41]) is readily constructed using the creation
and annihilation operators described in the previous
section. Once this has been taken care of, most
of the discussion on the free one-particle Dirac
equation in $\mathbb{R}^4$ can be easily generalized. Of special
importance in this regard is the straightforward
adaptation of the formulas [7]-[26], which form
the foundation for the second-quantized version.
Indeed, the discussion of the last section applies
nearly verbatim for arbitrary spacetime dimension.

In several applications, the so-called Euclidean
version of the free Dirac theory in spacetime
dimension n + 1 is important. Basically, this version
is obtained upon replacing $i\partial_0$ by $\partial_{n+1}$ in the Dirac
equation, a substitution that changes the character
of the equation from hyperbolic to elliptic. Provided
that the mass vanishes, the Euclidean Dirac
equation admits a reinterpretation as a time-independent
zero-eigenvalue Weyl equation in a
Minkowski spacetime of dimension n + 2. (This
equation is often called the zero-mode equation.)
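The Fock-space construction of gamma matrices of size $2^N$ for n = 2N - 1 and n = 2N, mentioned above, can be sketched as follows. The particular combinations of creation and annihilation operators used here (and the Jordan-Wigner matrices realizing them) are one possible choice made for illustration; the article does not spell out its construction, and all function names are hypothetical.

```python
import numpy as np
from functools import reduce

def mode_annihilators(N):
    """Jordan-Wigner annihilation operators c_1..c_N on the 2^N-dimensional Fock space."""
    a = np.array([[0, 1], [0, 0]], dtype=complex)
    z = np.diag([1.0, -1.0]).astype(complex)
    one = np.eye(2, dtype=complex)
    return [reduce(np.kron, [z] * k + [a] + [one] * (N - k - 1)) for k in range(N)]

def clifford_generators(n):
    """Return gamma^0, ..., gamma^n of size 2^N, where n = 2N - 1 or n = 2N."""
    N = (n + 1) // 2
    z = np.diag([1.0, -1.0]).astype(complex)
    hermitian = []                         # mutually anticommuting, each squaring to 1
    for ck in mode_annihilators(N):
        hermitian.append(ck + ck.conj().T)
        hermitian.append(1j * (ck - ck.conj().T))
    if n == 2 * N:
        # Fermion parity anticommutes with all c_k and c_k^*, giving one more generator.
        hermitian.append(reduce(np.kron, [z] * N))
    # gamma^0 squares to +1; the spatial gammas get a factor i so that they square to -1.
    return [hermitian[0]] + [1j * h for h in hermitian[1:]]

def satisfies_clifford(gammas):
    """Check gamma^a gamma^b + gamma^b gamma^a = 2 g^{ab} 1 with g = diag(1, -1, ..., -1)."""
    n = len(gammas) - 1
    metric = np.diag([1.0] + [-1.0] * n)
    identity = np.eye(gammas[0].shape[0])
    return all(
        np.allclose(gammas[a] @ gammas[b] + gammas[b] @ gammas[a],
                    2 * metric[a, b] * identity)
        for a in range(n + 1) for b in range(n + 1)
    )
```

For example, `clifford_generators(3)` returns four 4 × 4 matrices and `clifford_generators(4)` returns five 4 × 4 matrices, in line with the statement that the matrix size 4 suffices for both n = 3 and n = 4.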


Let us now turn to the description of the
interaction with an external electromagnetic potential
$A_\mu(t, x)$. This can be taken into account via the
minimal substitution,

$$\partial_\mu\to\partial_\mu + ieA_\mu$$ [75]

also known as the covariant derivative, in the time-dependent
Dirac equation [5].

For the electron in the Coulomb field of a nucleus
of charge Ze, one has

$$A_k = 0,\quad k = 1, 2, 3,\qquad A_0 = -\frac{Ze}{4\pi|x|}$$ [76]

and the time-independent equation

$$\Bigl(-i\alpha\cdot\nabla + \beta m - \frac{Ze^2}{4\pi|x|}\Bigr)\psi = E\psi$$ [77]


can be solved explicitly. This leads to a bound-state 
spectrum that is more accurate than its nonrelativis- 
tic counterpart. In particular, one finds that energy 
levels that are degenerate in the nonrelativistic 
theory split up into slightly different levels. The 
resulting fine structure of the Dirac levels can be 
understood as a consequence of the coupling 
between the spin of the electron and its orbital 
motion. 
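To make the size of the fine-structure splitting concrete, the sketch below evaluates the standard closed-form bound-state energies of the Dirac-Coulomb problem. The formula is the well-known textbook result and is not written out in the article; the identification of the coupling with the fine-structure constant, the numerical constants, the neglect of the nuclear motion, and the function names are assumptions of this example.

```python
import math

ALPHA = 1 / 137.035999084        # fine-structure constant (dimensionless)
ELECTRON_MASS_EV = 510998.95     # electron rest energy in eV

def dirac_coulomb_energy(n, j, Z=1, mass=ELECTRON_MASS_EV, alpha=ALPHA):
    """Energy (rest energy included) of the level with principal quantum
    number n and total angular momentum j, for a point nucleus of charge Z."""
    kappa = j + 0.5
    if not (1 <= kappa <= n):
        raise ValueError("need 1/2 <= j <= n - 1/2")
    gamma = math.sqrt(kappa ** 2 - (Z * alpha) ** 2)
    return mass / math.sqrt(1 + (Z * alpha / (n - kappa + gamma)) ** 2)

def binding_energy(n, j, Z=1):
    """Energy relative to the rest energy, in eV (negative for bound states)."""
    return dirac_coulomb_energy(n, j, Z) - ELECTRON_MASS_EV

def fine_structure_splitting(n, j_low, j_high, Z=1):
    """Energy difference between two levels with the same n and different j, in eV."""
    return dirac_coulomb_energy(n, j_high, Z) - dirac_coulomb_energy(n, j_low, Z)
```

For hydrogen, `binding_energy(1, 0.5)` is close to the nonrelativistic ground-state value, while `fine_structure_splitting(2, 0.5, 1.5)` gives the small positive separation of the two n = 2 levels that coincide in the nonrelativistic theory.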

In spite of this better agreement with the 
experimental levels, the physical interpretation of 
the Dirac electron in a Coulomb field is enigmatic. 
This is not only because of the persistence of the 
negative-energy states of the free theory (which 
turn into scattering states), but also because of 
unphysical properties of the position operator. 
More general time-independent external fields
(such as step potentials $A_0(x)$ with a step height
larger than 2m) can cause transitions between
positive- and negative-energy states (Klein paradox).
This phenomenon is enhanced when time
dependence is allowed. In particular, any external
field that is given by functions in $C_0^\infty(\mathbb{R}^4)$ leads to
a scattering operator S on the one-particle space $\mathcal{H}$
[22] that has nonzero off-diagonal parts $S_{\pm\mp}$.
Hence, a positive-energy wave packet scattering
at such a time- and space-localized field has a
nonzero probability to show up as a negative-energy
wave packet.

When one tensors the one-particle space $\mathcal{H}$ with
an internal symmetry space $\mathbb{C}^k$, one can also
couple external Yang-Mills fields $A_\mu$ taking values
in the k × k matrices via the substitution [75].
(From a geometric viewpoint, this can be
rephrased as tensoring the spinor bundle with a
vector bundle equipped with a connection A.) The
generalization of this external gauge field coupling
to a Minkowski spacetime or Euclidean space of
arbitrary dimension is straightforward. An adaptation
of the resulting interacting one-particle Dirac
theory in arbitrary dimension to quite general
geometric settings also yields a crucial starting
point for index theory.

Before turning to the latter area, we conclude this 
section with another striking application of the one- 
particle framework, namely the massless Dirac 
equation in two spacetime dimensions with special 
external fields. Specifically, the relevant Dirac 
operator is of the form 


$$\begin{pmatrix} i\,d/dx & -iq(x)\\ ir(x) & -i\,d/dx\end{pmatrix}$$ [78]


where r(x) and q(x) are not necessarily real valued. 
(Note that this operator is in general not self- 
adjoint.) With suitable restrictions on r and q, 
the direct and inverse scattering theory associa- 
ted with the Dirac operator [78] can be applied 
to various nonlinear PDEs in two spacetime 
dimensions to solve their Cauchy problems in 
considerable detail. As a crucial special case, 
initial conditions yielding vanishing reflection 
give rise to soliton solutions for the pertinent 
equation. 

The first example in this framework was found by 
Zakharov and Shabat (the nonlinear Schrödinger
equation); with other choices of r and q several other 
soliton PDEs (including the sine-Gordon and mod- 
ified Korteweg-de Vries equations) were handled by 
Ablowitz, Kaup, Newell, and Segur, who studied a 
quite general class of external fields r and q. 


The Dirac Operator and Index Theory 


Thus far, we have considered various versions
of the Dirac operator associated with the spaces
$\mathbb{R}^l$ for some $l\geq 1$. For applications in the area
of index theory, however, one needs to generalize
this base manifold. Indeed, one can define a Dirac
operator for any l-dimensional oriented Riemannian
manifold M that admits a spin structure.
This is a lifting of the transition functions of the
tangent bundle TM (which may be assumed to
take values in SO(l)) to the simply connected
twofold cover Spin(l) (taking $l\geq 3$).

Choosing first l = 2N + 1, the spin group has a
faithful irreducible representation on $\mathbb{C}^{2^N}$. Hence,
one obtains a $\mathbb{C}^{2^N}$-bundle over M, the spinor
bundle. The Levi-Civita connection on M derived
from the metric can now be lifted to a connection
on the spinor bundle. From the covariant derivative
corresponding to the spin connection and the
Clifford algebra generators $\gamma^1,\ldots,\gamma^l$, one can
then construct a first-order elliptic differential
operator that acts on sections of the spinor
bundle. (For the case $M = \mathbb{R}^{2N+1}$ with its Euclidean
metric, this construction yields the massless
positive-chirality Dirac operator acting on wave
functions with $2^N$ components, as considered
above.)

The massless Dirac operator thus obtained is
self-adjoint as an operator on the $L^2$-space $\mathcal{H}$
associated with the spinor bundle, and it has
infinite-dimensional positive and negative spectral
subspaces $\mathcal{H}_+$ and $\mathcal{H}_-$. (In this section the check
accent on position-space quantities is omitted.)
Specializing to the case of compact M, a continuous
map from M to $\mathbb{C}^*$ gives rise to a Fredholm
operator on $\mathcal{H}_+$, and more generally a continuous
map from M to GL(k, C) yields a Fredholm
operator on $\mathcal{H}_+\otimes\mathbb{C}^k$.

For a smooth map, the Fredholm index of this 
operator can be written in terms of an integral over 
M involving certain closed differential forms. The 
value of this integral does not change when exact 
forms are added, since M has no boundary. Hence, 
one is dealing with de Rham cohomology classes. In 
this context, the class involved (“characteristic 
class") is determined by the Riemann curvature 
tensor of M and the topological (“winding”) 
characteristics of the map. 

The simplest example of this state of affairs arises
for l = 1 and $M = S^1$ with its obvious spin structure
(periodic boundary conditions). Writing $\psi\in\mathcal{H} = L^2(S^1)$ as

$$\psi(z) = \sum_{n\in\mathbb{Z}} a_nz^n,\qquad z\in S^1$$ [79]

the Dirac operator H on $\mathcal{H}$ reads

$$H = z\frac{d}{dz}$$ [80]

It has eigenfunctions $z^n$, $n\in\mathbb{Z}$. Thus, we may
choose

$$(P_+\psi)(z) = \sum_{n\geq 0} a_nz^n,\qquad (P_-\psi)(z) = \sum_{n<0} a_nz^n$$ [81]

As a consequence, the functions in $\mathcal{H}_+$ ($\mathcal{H}_-$) are
$L^2$-boundary values of holomorphic functions in
|z| < 1 (|z| > 1). Operators of the form

$$T_\phi = P_+M_\phi P_+$$ [82]


where $\phi$ is a continuous function on $S^1$ and $M_\phi$
denotes multiplication by $\phi$, are called Toeplitz
operators. It is not hard to see that they are Fredholm
(viewed as operators on $\mathcal{H}_+$), provided that $\phi$ does not
vanish on $S^1$. (Recall a bounded operator B is
Fredholm if it has finite-dimensional kernel K and
cokernel C. Its Fredholm index is given by

$$\mathrm{index}(B) = \dim K - \dim C$$ [83]

and is norm continuous and invariant under addition
of a compact operator.) Assuming $\phi(S^1)\subset\mathbb{C}^*$
from now on, the curve $\phi(S^1)$ has a well-defined
winding number $w(\phi)$ with respect to the origin. The
equality

$$\mathrm{index}(T_\phi) = -w(\phi)$$ [84]

between objects from the area of analysis on the
left-hand side and from the areas of topology and
geometry on the right-hand side is the simplest
example of an Atiyah-Singer type index formula.
When $\phi$ is not only continuous but also smooth,
the index formula can be rewritten as

$$\mathrm{index}(T_\phi) = -\frac{1}{2\pi i}\int_{S^1}\frac{d\phi}{\phi}$$ [85]

yielding a characteristic class version.

It should be noted that the operator $M_\phi$ on $\mathcal{H}$
has a bounded inverse $M_{1/\phi}$ when $0\notin\phi(S^1)$, hence a
trivially vanishing index. Therefore, the compression
[82] involving the spectral projection of the
Dirac operator is needed to get a nonzero index.
Observe also that the equality [84] is quite easily
verified for the case $\phi(z) = z^n$, since $T_\phi$ yields a
power of the right (n > 0) or left (n < 0) shift on
$\mathcal{H}_+\simeq\ell^2(\mathbb{N})$.
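The right-hand side of the index formula [84] is easy to evaluate numerically for a given symbol. The sketch below computes the winding number of a nonvanishing function on the unit circle by summing phase increments along a fine sampling of the circle, and returns minus that number as the predicted index of the Toeplitz operator. The function names and the sampling resolution are assumptions; the code evaluates only the topological side of the formula and does not compute a Fredholm index directly.

```python
import numpy as np

def winding_number(phi, samples=4096):
    """Winding number around 0 of the closed curve phi(S^1).

    phi must accept a NumPy array of points on the unit circle and must not
    vanish there; samples should be large enough that consecutive values of
    phi differ in phase by less than pi.
    """
    theta = 2 * np.pi * np.arange(samples) / samples
    values = phi(np.exp(1j * theta))
    if np.any(np.isclose(values, 0)):
        raise ValueError("phi vanishes (numerically) on the unit circle")
    # Phase increment between consecutive sample points, each in (-pi, pi].
    steps = np.angle(np.roll(values, -1) / values)
    return int(np.rint(steps.sum() / (2 * np.pi)))

def toeplitz_index(phi, samples=4096):
    """Index of the Toeplitz operator with symbol phi, as predicted by [84]."""
    return -winding_number(phi, samples)
```

For the monomial symbols discussed above, `toeplitz_index(lambda z: z**n)` reproduces the value -n expected from the shift picture, and a symbol such as `lambda z: (z - 0.5) * (z - 2)` has winding number 1 because only the zero at 0.5 lies inside the unit disk.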

We proceed to the case of even-dimensional
manifolds, l = 2N. Then the fiber $\mathbb{C}^{2^N}$ of the spinor
bundle splits into a direct sum of even and odd
spinors, corresponding to two distinct representations
of Spin(2N) on $\mathbb{C}^{2^{N-1}}$. (Here it is assumed
that N > 3; recall the Lie algebra isomorphisms
$so(4)\simeq so(3)\oplus so(3)$ and $so(6)\simeq su(4)$.) With respect
to this decomposition, the Dirac operator can be
written as

$$H = \begin{pmatrix}0 & D^*\\ D & 0\end{pmatrix}$$ [86]

where D and D* are again first-order elliptic
differential operators expressed in terms of Clifford
algebra generators and the spin connection. Tensoring
the spinor bundle with a vector bundle equipped
with a connection A, one can define a Dirac
operator on the tensor product which involves A
and takes the form

$$H_A = \begin{pmatrix}0 & D_A^*\\ D_A & 0\end{pmatrix}$$ [87]

with respect to the even/odd spinor decomposition.
Once more, the index of $D_A$ (viewed as a Fredholm
operator between two different Hilbert spaces) can
be expressed as an integral over M involving
characteristic classes that depend on the curvatures
of the two connections.

Probably the simplest example of the constructions
just sketched is given by the torus $M = S^1\times S^1$
with its flat metric. Employing the above coordinate
and spin structure on $S^1$, one can take

H=L7(S'xS')@C’, pid d 188| 

O21 Oz» 

Since the curvature vanishes, the index theorem for
this situation implies index(D) = 0. (Note that this is
also plain from [88]: both kernel and cokernel of D
are spanned by the constant sections.) On the other
hand, when one tensors the spinor bundle with a line
bundle with connection A, the index formula reads


1 


index(D,) = E= | 
S! xS 


F [89] 
where F is the curvature 2-form corresponding to A. 

The Atiyah-Singer index theorem for Dirac 
operators has far-reaching applications. It can be 
used to derive other results in this area, such as the 
Gauss-Bonnet-Chern theorem, the Hirzebruch signature
theorem, and (when M is a Kähler manifold)
Riemann-Roch type theorems. From this, one can 
obtain information on various questions, such as the 
existence of positive scalar curvature metrics or 
zeros of vector fields on M. Other applications 
include insights on topological invariants of mani- 
folds obtained from “simple” manifolds (such as 
spheres and tori) by glueing or covering operations. 
This hinges on the additive properties of the index 
that are clear from its being given by an integral 
over the manifold. Conversely, the integrality of 
Fredholm indices can be used to deduce that certain 
rational cohomology classes are actually integral on 
manifolds that admit the structure that is required 
for the pertinent index theorem to apply, that 
certain manifolds do not admit such structures, 
since one knows that the relevant class is not 
integral, etc. 


More on the Dirac Field 


As mentioned earlier, the free-field formalism can be
easily generalized to an arbitrary spacetime dimension
d. For d > 4, however, no renormalizable
interacting quantum field models involving the
Dirac field are known. For the physical case d = 4
the standard model involves various Dirac fields
interacting with quantized gauge fields and Klein-Gordon
fields. Although its perturbation theory is
renormalizable, its mathematical existence is to date
wide open.

It is far beyond the scope of this article to
elaborate on the analytical difficulties of relativistic
quantum field theories, let alone those associated
with the standard model. Even for d = 2 and 3, a
nonperturbative construction of interacting quantum
field models involving the Dirac field is an
extremely difficult enterprise. Apart from some
rigorous results on certain self-interacting Dirac
field models, the only interacting model that is
reasonably well understood from the constructive
field theory viewpoint is the Yukawa model for
d = 2 and 3. This describes the interaction between
the Dirac field $\Psi$ and a Klein-Gordon field $\phi$, the
interaction term being formally given by $g(\Psi^*\gamma^0\Psi)\phi$.

On the other hand, the interaction of the quantized 
Dirac field with external classical fields is much more 
easily understood and analytically controlled. As a 
bonus, within this context, one can make contact 
with various issues of physical and mathematical 
relevance. We now proceed to sketch the external- 
field framework and some of its applications. 

Let us first consider the addition of an external
field term gV(t, x) to the free Dirac operator $\check{H}$ on

$$\check{\mathcal{H}} = L^2(\mathbb{R}^n, dx)\otimes\mathbb{C}^\Delta\otimes\mathbb{C}^k$$ [90]

We assume from now on that the coupling g is real
and that V is a self-adjoint $k\Delta\times k\Delta$ matrix-valued
function on spacetime $\mathbb{R}^{n+1}$ with matrix elements
that are in $C_0^\infty(\mathbb{R}^{n+1})$. Then the (interaction picture)
scattering operator S exists. It is unitary and has
off-diagonal Hilbert-Schmidt parts $S_{\pm\mp}$, so that a
unitary Fock-space S-operator $\tilde{\Gamma}(S)$ implementing
the Bogoliubov transformation generated by S
exists:

$$\tilde{\Gamma}(S)\,\Phi(f)\,\tilde{\Gamma}(S)^* = \Phi(Sf),\qquad \forall f\in\mathcal{H}$$ [91]

The arbitrary phase in $\tilde{\Gamma}(S)$ can be fixed by requiring
that the vacuum expectation value of $\tilde{\Gamma}(S)$ be
positive. More precisely, this number is generically
nonzero and satisfies

$$|(\Omega, \tilde{\Gamma}(S)\Omega)| = \det(1 + T_S)^{-1/2}$$ [92]

where $T_S$ is a positive trace class operator determined
by S.

The vector $\tilde{\Gamma}(S)\Omega$ is a superposition of wave
functions with an equal and arbitrary number of
particles and antiparticles. More generally, the
Fock-space S-operator $\tilde{\Gamma}(S)$ leaves the subspaces of
$\mathcal{F}_a(\mathcal{H})$ with a fixed eigenvalue $q\in\mathbb{Z}$ of the charge
operator Q invariant, and can create and
annihilate an arbitrary number of particle-antiparticle
pairs.

The unitary propagator $U(T_1, T_2)$ corresponding
to V(t, x) does not have Hilbert-Schmidt off-diagonal
parts (unless the spacetime dimension is
sufficiently small and special external fields are
chosen). Even so, the diagonal parts are Fredholm
with vanishing index, and the off-diagonal parts are
compact. Omitting the ill-defined determinantal
factor, these properties imply that one obtains a
renormalized quadratic form $\tilde{\Gamma}_{ren}(U(T_1, T_2))$ satisfying
the implementing relation

$$\tilde{\Gamma}_{ren}(U(T_1, T_2))\,\Phi(f) = \Phi(U(T_1, T_2)f)\,\tilde{\Gamma}_{ren}(U(T_1, T_2)),\qquad \forall f\in\mathcal{H}$$ [93]

in the quadratic form sense.

The above unitary operators on $\mathcal{H}$ yield Fredholm
diagonal parts whose indices vanish. (They are norm
continuous in g and reduce to the identity for g = 0.)
This is why their Fock-space implementers leave the
charge sectors invariant. Indeed, for a unitary
operator U on $\mathcal{H}$ with compact off-diagonal parts
the implementer maps the charge-q sector to the
charge-(q + q(U)) sector, where

$$q(U) = \mathrm{index}(U_{--})$$ [94]
Specializing to the case 


n-2N-1, A=2), A22"! [9.5] 


a unitary (kA x kA)-matrix multiplier U on 77 does 
not have compact off-diagonal parts in general. But 
when it is of the form 


Ü [P 0 


E 0 1, @u_(x) 6 


with respect to the chiral decomposition (the
generalization of the $\gamma^5$-decomposition [72] to even
spacetime dimension), then it suffices for compactness
of the off-diagonal parts that the matrices
$u_\pm(x)\in U(k)$ are continuous and converge to $1_k$ for
$|x|\to\infty$.

Viewing $\mathbb{R}^{2N-1}$ as arising from $S^{2N-1}$ via stereographic
projection, the latter unitaries can be viewed
as continuous maps from $S^{2N-1}$ to U(k), reducing to
$1_k$ at the north pole. As such, they yield elements of
the homotopy group $\pi_{2N-1}(U(k))$. By virtue of Bott's
periodicity theorem, the latter group equals $\mathbb{Z}$ for
$k\geq N$. Thus, the maps $u_\pm$ have a well-defined
"winding number" $w(u_\pm)\in\mathbb{Z}$ for $k\geq N$. From the
index formula

$$\mathrm{index}(U_{--}) = w(u_+) - w(u_-)$$ [97]

and [94] one now deduces that one can obtain
implementers $\tilde{\Gamma}_{ren}(U)$ effecting a nonzero charge
change from unitary maps with nonzero winding
number.

In particular, choosing k —A—2^^! > N, there 
exist quite special *kink maps" 


ueg(x)€U(A), e»0,aeR?"" [98] 


with winding number 1 and such that the quadratic 
form implementers of the unitary multiplication 
operators 


, 1, Q ue a(x) 0 
U Lea m oem 
0 1, 6 1) 


à 1, € 1, 0 
U ra —=t=s | 
0 1) & Ue a(—X) 


converge to (a linear combination of the chiral 
components of) the free Dirac field Y(0, a) as the 
kink size parameter e goes to 0. 

For the special case N = 1, one can take 


[99] 


x = d = 1E 


Ue q(x) = [100] 


ah le 
and the off-diagonal parts of Ux. are actually 
Hilbert-Schmidt. Thus, the implementers can be 
chosen to be unitary operators. But to get con- 
vergence to the Dirac field components WV(0, a), as 
c — 0, the unitary implementers ['(U...,) should be 
renormalized by a multiplicative factor. 

For the N=1 case, the unitary multipliers [96] 
give rise to loop groups. Indeed, requiring 


\[ \lim_{x \to \pm\infty} u_\delta(x) = 1_k, \qquad \delta = +, - \]   [101]


we are dealing with continuous maps S! — U(k). 
From the viewpoint of the Dirac theory, these 
groups are local gauge groups. The convergence to 
the Dirac field just sketched can be used to great 
advantage to clarify the structure of the correspond- 
ing Fock-space gauge groups. Their Lie algebras 
yield representations of Kac-Moody algebras, a 
topic which is considered shortly. 
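
For N = 1 the winding number of a continuous loop S^1 → U(k) coincides with the winding number of its determinant around the origin. The following Python sketch illustrates this numerically. The function names `winding_number` and `det2`, the number of sample points, and the two sample loops (both equal to the identity at θ = 0) are illustrative assumptions, not taken from the text.

```python
import cmath
import math

def winding_number(loop, samples=4096):
    """Winding number around 0 of the closed curve theta -> loop(theta).

    Phase increments between neighbouring sample points are summed; this
    is reliable as long as each increment stays well below pi.
    """
    total = 0.0
    prev = loop(0.0)
    for j in range(1, samples + 1):
        cur = loop(2.0 * math.pi * j / samples)
        total += cmath.phase(cur / prev)
        prev = cur
    return round(total / (2.0 * math.pi))

def det2(m):
    """Determinant of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    return a * d - b * c

# Assumed sample loop in U(1): theta -> exp(3 i theta).
u1_loop = lambda theta: cmath.exp(3j * theta)

# Assumed sample loop in U(2): diag(exp(i theta), exp(-2 i theta)).
u2_loop = lambda theta: ((cmath.exp(1j * theta), 0.0),
                         (0.0, cmath.exp(-2j * theta)))

print(winding_number(u1_loop))                        # 3
print(winding_number(lambda t: det2(u2_loop(t))))     # -1
```

The determinant of the second loop is exp(−iθ), so its winding number is the sum 1 − 2 of the windings of the diagonal entries.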

Before doing so, it should be pointed out that 
under some mild smoothness assumptions all of 
the above unitary matrix multipliers can also be 
viewed as S-operators associated with very special 
external fields. Indeed, the gauge-transformed Dirac 
operator 


w 


Hy =U AU [102] 


is of the form 


103] 


where V(x) is a self-adjoint kA × kA matrix on
R^{2N-1} (a “pure gauge” field). If one now defines a
time-dependent external field by


V(t, x) = | "an 


then U equals the S-operator for V(t, x). (Equivalently,
U is the t → ∞ wave operator for the time-independent
external field V(x).)

To conclude this section we sketch some applications
of the second-quantized Dirac formalism for
the special case N = 1, m = 0, and positive chirality.
Even though we could stick to the massless positive-chirality
Dirac operator −i d/dx on the line, it is
simpler and more natural to start from its counterpart
on the circle already considered in the last
section, cf. [80]. (Under the Cayley transform, the
positive- and negative-energy subspaces of −i d/dx
on L^2(R) correspond to those of z d/dz on L^2(S^1),
given by [81].) Letting z = e^{iθ}, we then obtain


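\[ H = -i\, d/d\theta, \qquad \mathcal{H} = L^2([0, 2\pi], d\theta) \]
\[ \mathcal{H} = \ell^2(\mathbb{Z}), \qquad \mathcal{H}_+ = \ell^2(\mathbb{N}), \qquad \mathcal{H}_- = \ell^2(\mathbb{Z}_-) \]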


t>0 


ner: [104] 


(105] 


and a corresponding Dirac field 


V (ft, 0) — (22) '? » ge tind 4 ‘> ien 


n=0 n=1 


(t,0) € R x [0,27] [106] 


where 


aj—c(e), 120, b,=c(e),1<0 


and {e_j}_{j∈Z} is the canonical basis of ℓ^2(Z).

Consider now the group GL(H) of bounded
operators on H with bounded inverses. The
transformation

\[ \Phi^*(f) \mapsto \Phi^*(Gf) \]   [107]

\[ \Phi(f) \mapsto \Phi(G^{*-1} f), \qquad f \in H,\ G \in GL(H) \]   [108]


leaves the CAR [48] invariant. Provided that G 
belongs to the subgroup 


\[ G_2(H) = \{ G \in GL(H) \mid G_{\pm\mp}\ \text{Hilbert-Schmidt} \} \]   [109]

there exists an implementer Γ(G) on F_a(H):

\[ \Gamma(G)\Phi^*(f) = \Phi^*(Gf)\Gamma(G), \qquad \Gamma(G)\Phi(f) = \Phi(G^{*-1}f)\Gamma(G), \qquad \forall f \in H \]   [110]

In particular, the multiplication operator

\[ \exp(h(x)), \qquad h(x) = \sum_{k=1}^{\infty} x_k z^k, \qquad z = e^{i\theta} \]   [111]

belongs to G_2(H) provided the sequence x_k vanishes
sufficiently fast as k → ∞. Thus, one obtains an
implementer Γ(e^{h(x)}), the so-called KP evolution
operator. This designation is justified by the vacuum
expectation value


\[ \tau(x) = (\Omega, \Gamma(e^{h(x)})\Gamma(G)\Omega), \qquad G \in G_2(H) \]   [112]


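being a tau-function solving the hierarchy of KP
evolution equations in Hirota bilinear form, as first
shown by Sato and his Kyoto school. For example,
the KP equation itself,

\[ u_{yy} = \partial_x \left( \tfrac{4}{3} u_t - 2 u u_x - \tfrac{1}{3} u_{xxx} \right) \]   [113]

has the bilinear form

\[ \left( \partial_{y_1}^4 + 3\,\partial_{y_2}^2 - 4\,\partial_{y_1}\partial_{y_3} \right) \tau(x+y)\,\tau(x-y) \Big|_{y=0} = 0 \]   [114]

the relation being given by

\[ x_1 = x, \quad x_2 = y, \quad x_3 = t, \qquad u = 2\,\partial_x^2 \ln \tau \]   [115]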


The class of solutions to [113] thus obtained
includes not only the rational and soliton solutions
(which correspond to choosing G as multiplication by
a rational function of z = e^{iθ} that does not vanish on
S^1), but also the finite-gap solutions associated with
compact Riemann surfaces. Moreover, for suitable
subgroups of G2(H), one obtains tau-functions for 
related soliton hierarchies, including the Korteweg-de 
Vries, Boussinesq and Hirota—Satsuma hierarchies. 
Even though the class of solutions associated with 
G2(H) via the Dirac formalism is large, it should be 
noted that from the perspective of the Cauchy problem 
for the pertinent evolution equations the solutions are 
nongeneric, inasmuch as the initial data are real- 
analytic functions. 

Finally, we consider Lie algebra representations 
related to the above special starting point [105] for 
the second-quantized Dirac framework. Assume that 
exp(tA) is a one-parameter group of bounded 
operators on H with generator A in the Lie algebra 
of G2(H), 


\[ g_2(H) = \{ A\ \text{bounded} \mid A_{\pm\mp}\ \text{Hilbert-Schmidt} \} \]   [116]

Then one can take

\[ \Gamma(\exp(tA)) = \exp(t\, d\Gamma(A)) \]   [117]

where dΓ(A) is the Fock-space operator uniquely
determined up to an additive constant by its
commutation relation

\[ [d\Gamma(A), \Phi^*(f)] = \Phi^*(Af), \qquad \forall f \in H \]   [118]

with the smeared Dirac field Φ^*(f). Fixing the
constant by requiring

\[ (\Omega, d\Gamma(A)\Omega) = 0 \]   [119]

the map A ↦ dΓ(A) satisfies the Lie algebra relations

\[ [d\Gamma(A), d\Gamma(B)] = d\Gamma([A, B]) + C(A, B)\mathbf{1} \]   [120]

so that the term

\[ C(A, B) = \mathrm{tr}(A_{-+}B_{+-} - B_{-+}A_{+-}) \]   [121]

encodes a central extension of the Lie algebra g_2(H)
[116].

The developments sketched in the previous
paragraph are in fact independent of the specific
form of the Hilbert space H and its H_+/H_-
decomposition. But the special feature of the choice
[105] and its S^1 → R analog is that the smeared
Dirac current

\[ \int_0^{2\pi} d\theta\, \psi(\theta) :\!\Psi^*(0,\theta)\Psi(0,\theta)\!:, \qquad \psi \in C^\infty_{\mathbb{R}}(S^1) \]   [122]

(where the double dots denote normal ordering: the
replacement of terms involving b_j b_k^* by −b_k^* b_j) is of
the form dΓ(A_ψ) with A_ψ ∈ g_2(H) determined by ψ.
(For spacetime dimension d > 2, this is no longer
true, as the Hilbert-Schmidt condition is violated.)
Moreover, [120] reduces to

\[ [d\Gamma(A_\psi), d\Gamma(A_\phi)] = C(A_\psi, A_\phi)\mathbf{1} \]   [123]

with the central extension explicitly given by

\[ C(A_\psi, A_\phi) = \frac{i}{2\pi} \int_0^{2\pi} d\theta\, \psi'(\theta)\phi(\theta) \]   [124]

We have just sketched the details of the (simplest 
version of the) Dirac current algebra: the term [124] 
is commonly known as the Schwinger term, so that 
the central extension featuring in [120]-[121] may 
be viewed as a generalization. The above setup can 
also be slightly generalized so as to obtain representations
of the Virasoro algebra, which is a central
extension of the Lie algebra of polynomial vector
fields on S^1. The general framework has a quite
similar version for the neutral Dirac field (Majorana 
field), described in terms of the self-dual CAR 
algebra. In the neutral setting, one can construct 
the Neveu-Schwarz and Ramond representations of 
the Virasoro algebra, which are crucial in string 
theory. 

Tensoring H with an internal symmetry space C^k
and starting from the Lie algebra of rational maps
S^1 → sl(k, C), z ↦ M(z), with poles occurring solely
at z = 0 and z = ∞ (regarded as multiplication
operators on L^2(S^1)^k), the Fock-space counterparts
obtained via the dΓ-operation yield representations
of the Kac-Moody Lie algebra A_{k-1}^{(1)}. Specifically, on
the charge-0 sector of F_a(H), one obtains the so-called
basic representation, whereas the charge-q
sectors with q = 1, ..., k − 1 yield the fundamental
representations. Using the neutral version of Dirac's
second quantization, one can also obtain the
basic and a fundamental representation of the
Kac-Moody algebras B_l^{(1)} (for k = 2l + 1) and D_l^{(1)}
(for k = 2l).


See also: Bosons and Fermions in External Fields; 
Clifford Algebras and Their Representations; Current 
Algebra; Dirac Fields in Gravitation and Nonabelian 
Gauge Theory; Gerbes in Quantum Field Theory; 
Holonomic Quantum Fields; Index Theorems; Quantum 
Field Theory in Curved Spacetime; Quantum 
Chromodynamics; Random Walks in Random 
Environments; Relativistic Wave Equations Including 
Higher Spin Fields; Solitons and Kac—Moody Lie 
Algebras; Spinors and Spin Coefficients; Symmetry 
Classes in Random Matrix Theory. 


Further Reading 


Bjorken JD and Drell SD (1964) Relativistic Quantum Mechanics. 
New York: McGraw-Hill. 

Carey AL and Ruijsenaars SNM (1987) On fermion gauge 
groups, current algebras and Kac-Moody algebras. Acta 
Applicandae Mathematicae 10: 1-86. 

Date E, Jimbo M, Kashiwara M, and Miwa T (1983) Transfor- 
mation groups for soliton equations. In: Jimbo M and Miwa T 
(eds.) Proceedings of RIMS Symposium, Nonlinear Integrable 
Systems — Classical Theory and Quantum Theory, pp. 39-119. 
Singapore: World Scientific. 

Dirac PAM (1928) The quantum theory of the electron. 
Proceedings of the Royal Society of London. Series A 117:
610—624. 

Dirac PAM (1928) The quantum theory of the electron, II. 
Proceedings of the Royal Society of London. Series A 118:
351-361. 

Glimm J and Jaffe A (1981) Quantum Physics. New York: 
Springer. 

Itzykson C and Zuber JB (1980) Quantum Field Theory. New York: 
McGraw-Hill. 

Pressley A and Segal G (1986) Loop Groups. Oxford: Clarendon. 

Rose ME (1961) Relativistic Electron Theory. New York: Wiley. 

Ruijsenaars SNM (1989) Index formulas for generalized Wiener- 
Hopf operators and boson-fermion correspondence in 2N 
dimensions. Communications in Mathematical Physics 124: 
553-593. 

Schweber SS (1961) An Introduction to Relativistic Quantum 
Field Theory. Evanston, IL: Row-Peterson. 

Streater RF and Wightman AS (1964) PCT, Spin and Statistics, 
and All That. New York: Benjamin.

Thaller B (1992) The Dirac Equation. New York: Springer.


Dispersion Relations 


J Bros, CEA/DSM/SPhT, CEA/Saclay, Gif-sur-Yvette, 
France 


© 2006 Elsevier Ltd. All rights reserved. 


Introduction 


Dispersion relations constitute a basic chapter of 
mathematical physics which covers various types of 
classical and quantum scattering phenomena and 
illustrates in a typical way the importance of general 
principles in theoretical physics, among which 
causality plays a major role. Each such phenomenon 
is described in terms of a scattering amplitude F(w), 
which is a complex-valued function of a frequency 
variable w; in quantum physics, this variable
becomes an energy variable called E (or s in particle
physics), as follows from the fundamental de
Broglie relation E = ħw. The real and imaginary
parts of F(w), which are called respectively the
dispersive part D(w) and the absorptive part A(w) of
F, have well-defined physical interpretations for all
these phenomena; they represent quantities which
are essentially accessible to measurements. The term
dispersion relations refers to linear integral equations
which relate the functions D(w) and A(w); such
integral equations are always closely related to the
Cauchy integral representation of an underlying holomorphic
function F(w^(c)) of the complexified frequency
(or energy) variable w^(c). F(w^(c)) is called the
holomorphic scattering function or in short the
scattering function, and the scattering amplitude
appears as the boundary value of the latter, taken at
positive real values of w from the upper half-plane of
w^(c), namely


\[ F(w) = \lim_{\varepsilon \to 0,\ \varepsilon > 0} F(w + i\varepsilon) \]


Historically, the first relations of that type to be
obtained were the Kramers-Kronig relations (1926),
which concern the propagation of light in a
dielectric medium. In this basic example, F(w)
represents the complex refractive index of the
medium n'(w) = n(w) + iκ(w) for a monochromatic
wave with frequency w. The dispersive part D(w) is
the real refractive index n(w), which is the inverse
ratio of the phase velocity of the wave in the
medium to its velocity c in the vacuum: the fact that
it depends on the frequency w corresponds precisely
to the phenomenon of dispersion of light in a
dielectric medium. A slab of the latter thus appears
as a prototype of a macroscopic scatterer. The
absorptive part A(w) is the rate of exponential
damping κ(w) of the wave, caused by the absorption
of energy in the medium.

It has appeared much later that for many 
scattering phenomena, dispersion relations can be 
derived from an appropriate set of general physical 
principles. This means that inside a certain axio- 
matic framework these relations are model indepen- 
dent with respect to the detailed structure of the 
scatterer or to the detailed type of particle interac- 
tion in the quantum case. 

In a very short and oversimplified way, the
following logical scheme holds. At first, one can say
that any mathematical formulation of a physical
principle of causality results in support-type properties
with respect to a time variable t of an
appropriate “causal structural function” R(t) of the
physical system considered: typically, such a causal
function should vanish for negative values of t. It
follows that its Fourier transform R̃ admits an
analytic continuation R̃^(c) in the upper half-plane
of the corresponding conjugate variable, interpreted
as a frequency (or an energy in the quantum case):
this is the general reason for the occurrence of
complex frequencies and of holomorphic functions
of such variables. In fact, the relevant holomorphic
scattering function F(w^(c)) always appears as generated
by R via some (more or less sophisticated)
procedure: in the simplest case, F coincides with R̃^(c)
itself, but this is not so in general. Finally, the
derivation of suitable analyticity and boundedness
properties of F(w^(c)) in a domain whose typical form
is the upper half-plane allows one to apply a
Cauchy-type integral representation to this function;
the dispersion relations directly follow from the
latter.

The first part of this article aims to describe the 
most typical dispersion relations and their link 
with the Cauchy integral. It then presents two 
basic illustrations of these relations, which are: (1) 
in classical physics, the Kramers-Kronig relations
mentioned above, and (2) in quantum physics, the 
dispersion relations for the forward scattering of 
equal-mass particles. The aim of the subsequent 
parts is to give as complete as possible accounts of 
the derivation of the relevant analyticity domains 
inside appropriate axiomatic frameworks which, 
respectively, contain the previous two examples. 
The simplest axiomatic framework is the one 
which governs all the phenomena of linear 
response: in the latter, the proof of analyticity 
and dispersion relations most easily follows the 
logical line sketched above. It will be presented 
together with its application to the derivation of 
the Kramers-Kronig relations. The rest of the
article is devoted to the derivation of the so-called 
crossing analyticity domains which are the relevant 
background of dispersion relations for the two- 
particle scattering (or collision) amplitudes in 
particle physics. This derivation relies on the 
general axiomatic framework of relativistic quan- 
tum field theory (QFT) (see Axiomatic Quantum 
Field Theory) and more specifically on the “analy- 
tic program in complex momentum space" of the 
latter. This framework, whose rigorous mathema- 
tical form has been settled around 1960, represents 
the safest conceptual approach for describing the 
particle collision processes in a range of energies 
which covers by far all those that can be produced 
and will be produced in the accelerators for 
several decades. A simple account of the field- 
theoretical axiomatic framework and of the logical 
line of the derivation of dispersion relations will 
be presented here for the simplest kinematical 
situations. A broader presentation of the analytic 
program including an extended class of analyticity 
properties for the general structure functions and 
(two-particle and multiparticle) collision ampli- 
tudes in QFT can also be found in this encyclope- 
dia (see Scattering in Relativistic Quantum Field 
Theory: The Analytic Program). For brevity, we 
shall not treat here the derivation of dispersion 
relations in the framework of nonrelativistic 
potential theory. Concerning the latter, the interested
reader can refer to the book by Nussenzveig
(1972). A collection of old basic papers on field- 
theoretical dispersion relations can be found in the 
review book edited by Klein (1961). For a recent 
and well-documented review of the multiplicity of 
versions and applications of dispersion relations 
and their experimental checking, the reader can 
consult the article by Vernov (1996). 


Typical Dispersion Relations 


The possibility of defining the scattering function
F(w^(c)) in the full upper half-plane and of exploiting
the corresponding boundary value F(w) on the
negative part as well as on the positive part of the
real axis will depend on the class of phenomena
considered. For the moment, we do not consider the
more general situations which also occur in particle
physics and will be described later (“crossing
domains” and “quasi-dispersion-relations”).

In the simplest cases, the real and imaginary parts
D and A of F are extended to negative values of the
variable w via additional symmetry relations resulting
from appropriate “reality conditions.” As a
typical and basic example, there occurs the
symmetry relation $F(w^{(c)}) = \overline{F(-\overline{w^{(c)}})}$
(with w^(c) and $-\overline{w^{(c)}}$ in the upper half-plane)
and correspondingly
D(w) = D(−w), A(w) = −A(−w) on the reals; we shall
call (S) this symmetry relation.

The simplest case of dispersion relations is then 
obtained when D and A are linked by the reciprocal 
Hilbert transformations: 


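\[ D(w) = \frac{1}{\pi}\, P\!\int_{-\infty}^{+\infty} \frac{A(w')}{w' - w}\, dw' \]   [1a]

\[ A(w) = -\frac{1}{\pi}\, P\!\int_{-\infty}^{+\infty} \frac{D(w')}{w' - w}\, dw' \]   [1b]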

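where P denotes Cauchy’s principal value, defined
for any differentiable function φ(x) (sufficiently
regular at infinity) by

\[ P\!\int_{-\infty}^{+\infty} \frac{\varphi(x)}{x}\, dx = \lim_{\varepsilon \to 0} \left[ \int_{-\infty}^{-\varepsilon} \frac{\varphi(x)}{x}\, dx + \int_{\varepsilon}^{+\infty} \frac{\varphi(x)}{x}\, dx \right] \]   [2]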


As a matter of fact, the pair of equations [1a], [1b] is
equivalent to the following relation for F = D + iA:

\[ F(w) = \frac{1}{i\pi}\, P\!\int_{-\infty}^{+\infty} \frac{F(w')}{w' - w}\, dw' \]   [3]

The latter is obtained as a limiting case of the
Cauchy formula

\[ F(w^{(c)}) = \frac{1}{2i\pi} \int_{-\infty}^{+\infty} \frac{F(w')}{w' - w^{(c)}}\, dw' \]   [4]

expressing the fact that F is holomorphic and
sufficiently decreasing at infinity in the upper half-plane
I_+ of the complex variable w^(c) and that F(w)
is the boundary value of F(w^(c)) on all the reals.

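Finally, one checks that in view of the symmetry
relation (S), the Hilbert integral relations between D
and A given above reduce to the following dispersion
relations:

\[ D(w) = \frac{2}{\pi}\, P\!\int_{0}^{+\infty} \frac{w' A(w')}{w'^2 - w^2}\, dw' \]   [5a]

\[ A(w) = -\frac{2w}{\pi}\, P\!\int_{0}^{+\infty} \frac{D(w')}{w'^2 - w^2}\, dw' \]   [5b]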
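
As a numerical illustration of the first of these relations, D(w) = (2/π) P∫_0^∞ w' A(w') / (w'^2 − w^2) dw', the following Python sketch uses a model amplitude that is holomorphic in the upper half-plane and satisfies (S). The model itself (a damped-oscillator-type response with assumed parameters `w0` and `gamma`), the helper names and the quadrature scheme are illustrative assumptions. The principal value is handled by subtracting the value of the numerator at w' = w, which is legitimate because P∫_0^∞ dw' / (w'^2 − w^2) = 0 for w > 0.

```python
import math

def model_amplitude(w, w0=1.0, gamma=0.2):
    """Assumed model F(w) = -1 / (w^2 + 2 i gamma w - w0^2).

    Its poles lie in the lower half-plane, and on the reals F(w) is the
    complex conjugate of F(-w), so the symmetry relation (S) holds.
    """
    return -1.0 / (w * w + 2j * gamma * w - w0 * w0)

def dispersive_part(w, absorptive, n=20000, scale=1.0):
    """Evaluate (2/pi) P int_0^inf w' A(w') / (w'^2 - w^2) dw' for w > 0."""
    ref = w * absorptive(w)          # subtraction term w A(w)
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * h            # midpoint rule on (0, 1)
        wp = scale * s / (1.0 - s)   # maps (0, 1) onto (0, infinity)
        jac = scale / (1.0 - s) ** 2
        den = wp * wp - w * w
        if den == 0.0:
            continue                 # removable point of the integrand
        total += (wp * absorptive(wp) - ref) / den * jac
    return 2.0 / math.pi * total * h

A = lambda w: model_amplitude(w).imag
w = 0.7
# The two printed numbers should agree to within the quadrature error.
print(dispersive_part(w, A), model_amplitude(w).real)
```

The subtraction turns the singular integrand into a smooth one, so a plain midpoint rule suffices; a dedicated principal-value quadrature would be an alternative design.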


Two Basic Examples 


1. The Kramers-Kronig relation in classical optics
It will be shown in the next part that the complex
refractive index n'(w) = n(w) + iκ(w) of a dielectric
medium is the boundary value of a holomorphic
function n'(w^(c)) in I_+, satisfying the symmetry relation
(S), and such that the integral ∫_{-∞}^{+∞} |n'(w + iη) − 1|^2 dw
is uniformly bounded for all η > 0.

It follows that all the previous relations are
satisfied by the function F(w^(c)) = n'(w^(c)) − 1.
In particular, the real refractive index n(w) and the
“extinction coefficient” β(w) = 2wκ(w)/c (c being
the velocity of light in the vacuum) are linked by
the following Kramers-Kronig dispersion relation
(corresponding to eqn [5a]):

\[ n(w) - 1 = \frac{c}{\pi}\, P\!\int_{0}^{+\infty} \frac{\beta(w')}{w'^2 - w^2}\, dw' \]   [6]
2. Dispersion relation for the forward two-particle
scattering amplitude in relativistic quantum physics
One considers the following collision phenomenon in
particle physics. A particle Π with mass m, called the
target and sitting at rest in the laboratory, is struck
by an identical particle Π_1 with relativistic energy w
larger than m (= mc^2; in high-energy physics, one
usually chooses units such that c = 1). After the
collision, the particle Π_1 is scattered in all possible
directions θ of space, according to a certain
quantum scattering amplitude T_θ(w), whose modulus
is essentially the rate of probability for detecting Π_1 in
the direction θ. The forward scattering amplitude
T_0(w) corresponds to the detection of Π_1 in the
forward longitudinal direction with respect to its
incidence direction towards the target. Let us also
assume that the particles carry no charge of any kind,
so that each particle coincides with its “antiparticle.”
In that case, T_0(w) is shown to be the boundary value
of a scattering function T_0(w^(c)) enjoying the following
properties:
1. it is a holomorphic function in I_+, satisfying the
symmetry relation (S);
2. its behavior at infinity in I_+ is such that the
integral

\[ \int_{-\infty}^{+\infty} \left| \frac{T_0(w + i\eta)}{(w + i\eta)^2} \right|^2 dw \]

is uniformly bounded for all η > 0; and

3. under more specific assumptions on the mass
spectrum of the underlying theory, the “absorptive
part” A(w) = Im T_0(w) vanishes for |w| < m.


Then by applying eqn [5a] to the function D(w) =
Re[(T_0(w) − T_0(0))/w^2] (regular at w = 0), one obtains
the following dispersion relation:

\[ \mathrm{Re}\, T_0(w) = T_0(0) + \frac{2w^2}{\pi}\, P\!\int_{m}^{+\infty} \frac{\mathrm{Im}\, T_0(w')}{w'(w'^2 - w^2)}\, dw' \]   [7]
Remark In view of (3), the scattering function
T_0(w^(c)) admits an analytic continuation as an even
function of w^(c) (still called T_0) in the cut-plane
C^(cut) = C \ {w ∈ R; |w| ≥ m}. In fact, in view of (S)
and (3), the boundary value T_0(w) of T_0(w^(c)) satisfies the
relation T_0(w) = T_0(−w) in the real interval δ_m =
{w ∈ R; −m < w < m}. Let us then introduce the
function T_0^-(w^(c)) = T_0(−w^(c)) as a holomorphic
function of w^(c) in I_-: one sees that the boundary
values of T_0 and T_0^- from the respective domains
I_+ and I_- coincide on δ_m and therefore admit a
common analytic continuation throughout this real
interval (in view of “Painlevé’s lemma” or the “one-dimensional
edge-of-the-wedge theorem”). One
also notes that in view of (S) the extended function
T_0 satisfies the “reality condition”
$T_0(\overline{w^{(c)}}) = \overline{T_0(w^{(c)})}$ in C^(cut). The fact that T_0 is well
defined as an even holomorphic function in the cut-plane
C^(cut) has been established in the general
framework of QFT, as explained in the last part of
this article.


Phenomena of Linear Response: 
Causality and Dispersion Relations 
in the Classical Domain 


The subsequent axiomatic framework and results
(due to J S Toll (1952, 1956)) concern any physical
system which exhibits the following type of phenomena:
whenever it receives some excitation signal,
called the input and represented by a real-valued
function of time f_in(t) with compact support, the
system emits a response signal, called the output and
represented by a corresponding real-valued function
f_out(t), in such a way that the following postulates
are satisfied:


(P1) Linearity. To every linear combination of
inputs a_1 f_in,1 + a_2 f_in,2, there corresponds the
output a_1 f_out,1 + a_2 f_out,2.

(P2) Reproducibility or time-translation invariance.
Let τ be a time-translation parameter taking
arbitrary real values; to every “time-translated
input” f_in^(τ)(t) = f_in(t − τ), there corresponds
the output f_out^(τ)(t) = f_out(t − τ).

(P3) Causality. The effect cannot precede the cause,
namely if t_in and t_out denote respectively the
lower bounds of the supports of f_in(t) and
f_out(t), then there always holds the inequality
t_in ≤ t_out.

(P4) Continuity of the response. There exists
some continuity inequality which expresses
the fact that a certain norm of the output is
majorized by a corresponding norm of the
input. The case of an L^2-norm inequality of the
form ‖f_out‖ ≤ ‖f_in‖ is particularly significant:
when the norm ‖f‖ = [∫ |f(t)|^2 dt]^{1/2} is interpretable
as an energy (for the output as well as for
the input), it acquires the meaning of a
“dissipation” property of the system.


The postulate of linear dependence (P1) of f_out
with respect to f_in is obviously satisfied if the
response is described by any general kernel K(t, t')
such that the following formula makes sense:

\[ f_{\mathrm{out}}(t) = \int_{-\infty}^{+\infty} K(t, t')\, f_{\mathrm{in}}(t')\, dt' \]   [8]

Conversely, the existence of a distribution kernel K
can be established rigorously under the continuity
assumption postulated in (P4) by using the Schwartz
nuclear theorem. In full generality (see our comment
in the next paragraph), the kernel K(t, t') appears to
be a tempered distribution in the pair of variables
(t, t') and the previous integral formula holds in the
sense of distributions, which means that both sides
of eqn [8] must be considered as tempered distributions
(in t) acting on any smooth test-function g(t) in
the Schwartz space S. (Note, for instance, that the
trivial linear application f_out = f_in is represented by
the kernel K(t, t') = δ(t − t').)

From the reproducibility postulate (P2), it follows
that the distribution K can be identified with a
distribution of the single variable τ = t − t', namely
K(t, t') = R(t − t'). Moreover, the real-valuedness
condition imposed on the pairs (f_in, f_out) entails that
R is real. Finally, the causality postulate (P3) implies
that the support of the distribution R is contained in
the positive real axis, so that one can write, in the
sense of distributions,

\[ f_{\mathrm{out}}(t) = \int_{-\infty}^{t} R(t - t')\, f_{\mathrm{in}}(t')\, dt' \]   [9]

The convolution kernel R(t − t') is typically what
one calls in physics a “retarded kernel.”
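
The content of the retarded convolution formula can be illustrated on a uniform time grid. The following Python sketch is a discretized analogue under stated assumptions: signals are finite lists of samples, the retarded kernel is a list `kernel` whose index j ≥ 0 stands for the delay t − t', and the function name `causal_response` and the sample data are hypothetical. It checks causality (P3) and time-translation invariance (P2) on one example; linearity (P1) holds by construction.

```python
def causal_response(kernel, f_in, dt=1.0):
    """Discrete analogue of f_out(t) = int_{-inf}^{t} R(t - t') f_in(t') dt'.

    kernel[j] samples R at the delay j*dt >= 0, so only past and present
    input samples contribute to each output sample.
    """
    out = []
    for i in range(len(f_in)):
        acc = 0.0
        for j in range(min(i, len(kernel) - 1) + 1):
            acc += kernel[j] * f_in[i - j]
        out.append(acc * dt)
    return out

kernel = [0.5 ** j for j in range(8)]        # assumed decaying retarded kernel
pulse = [0.0, 0.0, 1.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
shifted = [0.0, 0.0, 0.0] + pulse[:-3]       # same pulse delayed by 3 steps

out = causal_response(kernel, pulse)
out_shifted = causal_response(kernel, shifted)

# (P3): no output before the input starts (the pulse starts at index 2).
print(out[:2])
# (P2): delaying the input delays the output by the same amount.
print(out_shifted[3:] == out[:-3])
```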

If we now introduce the frequency variable w,
which is the conjugate of the time variable t, by the
Fourier transformation

\[ \tilde f(w) = \int_{-\infty}^{+\infty} f(t)\, e^{i w t}\, dt \]

we see that the convolution equation [9] is equivalent
to the following one:

\[ \tilde f_{\mathrm{out}}(w) = \tilde R(w)\, \tilde f_{\mathrm{in}}(w) \]   [10]


In the latter, the Fourier transform R̃(w) of R is a
tempered distribution, which is the boundary value
from the upper half-plane I_+ of a holomorphic
function R̃^(c)(w^(c)), called the Fourier-Laplace transform
of R. R̃^(c) is defined for all w^(c) = w + iη, with
η > 0, by the following formula in which the
exponential is a good test-function for the distribution
R (since it is exponentially decreasing for t → +∞):

\[ \tilde R^{(c)}(w^{(c)}) = \int_{0}^{+\infty} R(t)\, e^{i w^{(c)} t}\, dt \]   [11]
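
A simple instance of the Fourier-Laplace transform is the retarded kernel R(t) = e^{−t} for t ≥ 0 (an assumption chosen purely for illustration), for which the integral of R(t) e^{i w^(c) t} over t ≥ 0 equals 1/(1 − i w^(c)), a function holomorphic for Im w^(c) > −1 and hence in the whole upper half-plane. The Python sketch below compares a numerical quadrature with this closed form; the function name, the truncation time `t_max` and the number of steps are hypothetical choices.

```python
import cmath
import math

def fourier_laplace(kernel, w_c, t_max=60.0, n=60000):
    """Midpoint-rule approximation of int_0^t_max kernel(t) exp(i w_c t) dt."""
    h = t_max / n
    total = 0.0 + 0.0j
    for k in range(n):
        t = (k + 0.5) * h
        total += kernel(t) * cmath.exp(1j * w_c * t)
    return total * h

kernel = lambda t: math.exp(-t)      # assumed kernel R(t) = exp(-t), t >= 0
w_c = 0.8 + 0.5j                     # a point in the upper half-plane
numeric = fourier_laplace(kernel, w_c)
exact = 1.0 / (1.0 - 1j * w_c)
# The difference is limited by the quadrature step and the truncation at t_max.
print(abs(numeric - exact))
```

For Im w^(c) > 0 the factor e^{i w^(c) t} decays as t grows, which is why truncating the integral at a finite `t_max` is harmless here.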


More precisely, the tempered-distribution character of
R̃ is strictly equivalent to the fact that R̃^(c) is of
moderate growth both at infinity and near the reals in
I_+, namely that it satisfies a majorization of the
following form for some real positive numbers p and q:

\[ |\tilde R^{(c)}(w + i\eta)| \le C\, \frac{(1 + |w + i\eta|)^q}{\eta^p} \]   [12]

We thus conclude from eqn [10] that each phenomenon
of linear response is represented very simply in
the frequency variable by the multiplicative operator
R̃(w), whose analytic continuation R̃^(c)(w^(c)) is called
the (causal) response function.


A Typical Illustration: The Damped Harmonic 
Oscillator 


We consider the motion x = x(t) of a damped
harmonic oscillator of mass m submitted to an
external force F(t). The force is the input (f_in = F) and
the resulting motion is the output, namely f_out(t) =
x(t). All the previous general postulates (P1)-(P4) are
then satisfied, but this particular model is, of course,
governed by its dynamical equation

\[ x''(t) + 2\gamma x'(t) + w_0^2 x(t) = \frac{F(t)}{m} \]   [13]


where w_0 is the eigenfrequency of the oscillator and
γ is the damping constant (γ > 0). The relevant
solution of this second-order differential equation
with constant coefficients is readily obtained in
terms of the Fourier transforms x̃(w) of x(t) and F̃(w)
of F(t). One can in fact replace eqn [13] by the
equivalent equation

\[ (-w^2 - 2i\gamma w + w_0^2)\, \tilde x(w) = \frac{\tilde F(w)}{m} \]   [14]

whose solution is of the form [10], namely x̃(w) =
R̃(w)F̃(w), with

\[ \tilde R(w) = \frac{\tilde x(w)}{\tilde F(w)} = \frac{-1}{m(w^2 + 2i\gamma w - w_0^2)} = \frac{-1}{m(w - w_1)(w - w_2)} \]   [15a]

\[ w_{1,2} = \pm (w_0^2 - \gamma^2)^{1/2} - i\gamma \]   [15b]


It is clear that the rational function defined by eqns
[15] admits an analytic continuation in the full
complex plane of w^(c) minus the pair of simple poles
(w_1, w_2), which lie in the lower half-plane. In
particular, it is holomorphic (and decreasing at
infinity) in I_+, as expected from the previous general
result. Moreover, this example suggests that for any
particular phenomenon of linear response, the details
of the dynamics are encoded in the singularities of the
holomorphic scattering function R̃^(c)(w^(c)), which all
lie in the lower half-plane. The validity of a
dispersion relation only expresses the analyticity
(and decrease at infinity) of that function in the
upper half-plane, which is model independent.
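
The pole structure described above can be checked directly. The following Python sketch, with assumed parameter values and hypothetical function names, computes the two roots of w^2 + 2iγw − w_0^2 = 0, verifies that both lie in the lower half-plane for an underdamped and an overdamped case, and confirms that the factorized and unfactorized forms of the response function −1/(m(w^2 + 2iγw − w_0^2)) agree at a point of the upper half-plane.

```python
import cmath

def poles(w0, gamma):
    """Roots of w^2 + 2 i gamma w - w0^2 = 0: +-sqrt(w0^2 - gamma^2) - i gamma."""
    root = cmath.sqrt(w0 * w0 - gamma * gamma)
    return root - 1j * gamma, -root - 1j * gamma

def response(w, m, w0, gamma):
    """Response function -1 / (m (w^2 + 2 i gamma w - w0^2))."""
    return -1.0 / (m * (w * w + 2j * gamma * w - w0 * w0))

m, w0 = 1.0, 2.0                         # assumed sample parameters
for gamma in (0.5, 3.0):                 # underdamped and overdamped cases
    w1, w2 = poles(w0, gamma)
    w = 0.3 + 0.7j                       # test point in the upper half-plane
    factorized = -1.0 / (m * (w - w1) * (w - w2))
    # Both imaginary parts are negative; the last number is a rounding-level
    # difference between the two forms of the response function.
    print(gamma, w1, w2, abs(factorized - response(w, m, w0, gamma)))
```

In the overdamped case the square root is purely imaginary, so both poles sit on the negative imaginary axis, still in the lower half-plane.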


Remark The same mathematical analysis applies to
any electric oscillatory circuit, in which the capacitance,
inductance, and resistance are involved in
place of the parameters m, w_0 and γ: f_in and f_out
correspond respectively to an external electric
potential and to the current induced in the circuit;
the response function is the admittance of the
circuit.


Application to the Kramers-Kronig Relation


The background of the Kramers-Kronig relation [6],
namely the analyticity and boundedness properties
of the complex refractive index n'(w^(c)) in I_+, is
provided by the previous axiomatic framework.
However, it is not the quantity n'(w^(c)) itself but
appropriate functions of the latter which play the
role of causal response functions; two phenomena
can in fact be exhibited, which both contribute to
proving the relevant properties of n'(w^(c)).


1. Propagation of light in a dielectric slab with
thickness δ. One considers the wave front f_in(t) of an
incoming wave normally incident upon the slab,
with Fourier decomposition

\[ f_{\mathrm{in}}(t) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \tilde f_{\mathrm{in}}(w)\, e^{-i w t}\, dw \]   [16]

After having traveled through the medium, it gives
rise to an outgoing wave f_out(t) on the exit face of
the slab, whose Fourier decomposition can be
written as follows (provided the thickness δ of the
slab is very small):

\[ f_{\mathrm{out}}(t) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \tilde f_{\mathrm{in}}(w)\, e^{-i w (t - n'(w)\delta/c)}\, dw \]   [17]

In the latter, the real part of n'(w)/c is the inverse of
the light velocity in the medium, while its imaginary
part takes into account the exponential damping of
the wave. The output f_out thus appears as a causal
linear response with respect to f_in (since f_out “starts
after” f_in). According to the general formula [10],
the corresponding response function R̃^(c) can be
directly computed from eqns [16] and [17], which
yields:

\[ \tilde R^{(c)}(w^{(c)}) = e^{i w^{(c)} n'(w^{(c)}) \delta / c} \]   [18]

In view of the previous axiomatic analysis, R̃^(c) has
to be holomorphic and of moderate growth in I_+,
and since this holds for all δ’s sufficiently small, it
can be shown that the function n'(w^(c)) itself is
holomorphic and of moderate growth in I_+ (no
logarithmic singularity can be produced).

2. Polarization of the medium produced by an
electric field. The dielectric polarization signal P(t)
produced at a point of a medium by an external
electric field E(t) is also a phenomenon of linear
response which obeys the postulates (P1)-(P4); the
corresponding formula [10] reads

\[ \tilde P(w) = \chi'(w)\, \tilde E(w) \]   [19a]

where χ' is the complex dielectric susceptibility of
the medium, which is related to n' by Maxwell’s
relation

\[ \chi'(w) = \frac{n'^2(w) - 1}{4\pi} \]   [19b]

One thus recovers the fact that χ' admits an analytic
continuation in I_+; one can also show by a physical
argument that χ'(w), and thereby n'(w) − 1, tends to
zero as a constant divided by w^2 when w tends
to infinity. This behavior at infinity extends to
n'(w^(c)) − 1 in I_+ in view of the Phragmén-Lindelöf
theorem, since n' is known (from (1)) to be of
moderate growth. This justifies the analytic background
of the Kramers-Kronig relation.


From Relativistic QFT to the Dispersion 
Relations of Particle Physics: Historical 
Considerations and General Survey 


In the quantum domain, the derivation of dispersion 
relations for the two-particle scattering (or collision) 
amplitudes of particle physics has represented, since 
1956 and throughout the 1960s, an important 
conceptual progress for the theoretical treatment of 
that branch of physics. These phenomena are 
described in a quantum-theoretical framework in 
which the basic kinematical variables are the 
energies and momenta of the particles involved. 
These variables play the role of the frequency of 
light in the optical scattering phenomena. Moreover, 
since large energies and momenta are involved, 
which allow the occurrence of particle creation 
according to the conservation laws of special relativity,
it is necessary to use a relativistic quantum-
mechanical framework. Around 1950, the success of 
the quantum electrodynamics formalism for comput- 
ing the electron—photon, electron-electron, and elec- 
tron-positron scattering amplitudes revealed the 
importance of the concept of relativistic quantum 
field for the understanding of particle physics. 
However, the methods of perturbation theory, 
which had ensured the success of quantum electro- 
dynamics in view of the small value of the coupling 
parameter of that theory (namely the electric charge 
of the electron), were at that time inapplicable to the 
strong nuclear interaction phenomena of high-energy 
physics. This failure motivated an important school
of mathematical physicists to work out a
model-independent axiomatic approach to relativistic QFT
(e.g., Lehmann, Symanzik, Zimmermann (1954), 
Wightman (1956), and Bogoliubov (1960); see Axio- 
matic Quantum Field Theory). Their main purpose 
was to provide a conceptually satisfactory treatment 
of relativistic quantum collisions, at least for the case 
of massive particles. Among various postulates 
expressing the invariance of the theory under the 
Poincaré group in an appropriate quantum- 
mechanical Hilbert-space framework, the approach 
basically includes a certain formulation of the 
principle of causality, called microcausality or local 
commutativity. This axiomatic approach of QFT was 
followed by a conceptually important variant, namely 
the algebraic approach to QFT (Haag, Kastler, Araki 
1960), whose most important developments are 
presented in the book by Haag (1992) (see Algebraic 
Approach to Quantum Field Theory). From the 
historical viewpoint, and in view of the analyticity 
properties that they also generate, one can say that all 
these (closely related) approaches parallel the axio- 
matic approach of linear response phenomena with, 
of course, a much higher degree of complexity. In 
particular, the characterization of scattering (or 
collision) amplitudes in terms of appropriate struc- 
ture functions of the basic quantum fields of the 
theory is a nontrivial preliminary step which was 
taken at an early stage of the theory under the name 
of “asymptotic theory and reduction formulae” 
(Lehmann, Symanzik, Zimmermann 1954-57, 
Haag-Ruelle 1962, Hepp 1965). There again, in the 
field-theoretical axiomatic framework, causality gen- 
erates analyticity through Fourier-Laplace transfor- 
mation, but several complex variables now play the 
role which was played by the complex frequency in 
the axiomatics of linear response phenomena: they 
are obtained by complexifying the relativistic energy- 
momentum variables of the (Fourier transforms of 
the) quantum fields involved in the high-energy
collision processes. In fact, the holomorphic functions
which play the role of the causal response function
$R(\omega)$ are the QFT structure functions or “Green
functions in energy-momentum space.” The study of
all possible analyticity properties of these functions 
resulting from the QFT axiomatic framework is 
called the analytic program (see Scattering in 
Relativistic Quantum Field Theory: The Analytic 
Program). The primary basic scope of the latter 
concerns the derivation of analyticity properties for 
the scattering functions of two-particle collision 
processes, which appears to be a genuine challenge 
for the following reason. The basic Einstein relation 
$E = mc^2$, which applies to all the incoming and
outgoing particles of the collisions, operates as a 
geometrical constraint on the corresponding physical 
energy-momentum vectors: according to the Min- 
kowskian geometry, the latter have to belong to mass 
hyperboloids, which define the so-called “mass shell” 
of the collision considered. It is on the corresponding 
complexified mass-shell manifold that the scattering 
functions are required to be defined as holomorphic 
functions. In the analytic program of QFT, the 
derivation of such analyticity domains and of 
corresponding dispersion relations in the complex 
plane of the squared total energy variable, s, of each 
given collision process then relies on techniques of 
complex geometry in several variables. As a matter of 
fact, the scattering amplitude is a function (or 
distribution) of two variables F(s,t), where t is a 
second important variable, called the squared 
momentum transfer, which plays the role of a fixed 
parameter for the derivation of dispersion relations in 
the variable s. The value $t = 0$ corresponds to the
special kinematical situation which has been
described above (for the case of equal-mass particles
$\Pi_1$ and $\Pi_2$) under the name of forward scattering, and
the variable s is a simple affine function of the energy
$\omega$ of the colliding particle $\Pi_1$ in the laboratory
Lorentz frame (namely $s = 2m^2 + 2m\omega$ in the equal-mass
case). It is for the corresponding scattering
amplitude $T_0(\omega) = F_0(s) = F(s,t)|_{t=0}$ that a dispersion
relation such as eqn [7] can be derived, although this
derivation is far from being as simple as for the 
phenomena of linear response in classical physics: 
even in that simplest case, it already necessitates the 
use of analytic completion techniques in several 
complex variables. The first proof of this dispersion 
relation was performed by K Symanzik in 1956. In 
the case of general kinematical situations of measure- 
ments, the direction of observation of the scattered 
particle includes a nonzero angle with the incidence 
direction, which always corresponds to a negative 
value of t. The derivation of dispersion relations at
fixed $t = t_0 < 0$, namely for the scattering amplitude
$F_{t_0}(s) = F(s,t)|_{t=t_0}$, requires further arguments of
complex geometry, and it is subject to subtle
limitations of the form $t_1 < t_0 < 0$, where $t_1$ depends
on the mass spectrum of the particles involved in the
theory. The first rigorous proof of dispersion relations
at t < 0 was performed by N N Bogoliubov in 1960. 

Three conceptually important features of the 
dispersion relations in particle physics deserve to 
be pointed out. 


1. In comparison with the dispersion relations of 
classical optics, a feature which appears to be new is 
the so-called “crossing property,” which is character- 
istic of high-energy physics since it relies basically on 
the relativistic kinematics. According to that property,
the boundary values of the analytic scattering
function $F_t(s)$ at positive and negative values of s
from the respective half-planes $\operatorname{Im} s > 0$ and $\operatorname{Im} s < 0$
are interpreted, respectively, as the scattering amplitudes
of two physically different collision processes,
which are deduced from each other by replacing the
incident particle by the corresponding antiparticle;
one also says that “these two collision processes are
related by crossing.” A typical example is provided
by the proton-proton and proton-antiproton colli- 
sions, whose scattering amplitudes are therefore 
mutually related by the property of analytic con- 
tinuation. This type of relationship between the 
values of the scattering function at positive and 
negative values of s generalizes in a nontrivial way 
the symmetry relation (S) satisfied by the forward
scattering function $T_0(\omega^{(c)})$ when each particle coincides
with its antiparticle (see the second basic
example above). No nontrivial crossing property
holds in that special case, and the fact that $T_0$ is an
even function of $\omega^{(c)}$ precisely expresses the identity
of the two collision processes related by crossing. In
the general case, for $t = 0$ as well as for $t = t_0 < 0$ for
any value of $t_0$, the analyticity domain that one
obtains for the scattering function is not the full
cut-plane of s: in its general form, a “crossing domain”
may exclude some bounded region $B_{t_0}$ from the
cut-plane, but it always contains an infinite region which
is the exterior of a circle minus cuts along the two
infinite parts of the real s-axis (Bros, Epstein, Glaser
1965): these cuts are along the physical regions of the
two collision processes related by crossing. In that
general case, the scattering function $F_{t_0}(s)$ still satisfies
what can be called a quasi-dispersion-relation, in
which the right-hand side contains an additional
Cauchy integral, taken along the boundary of $B_{t_0}$.

2. A second important feature concerns the 
behavior at large values of s of the scattering 
functions F,(s) in their analyticity domain. As 
indicated in the presentation of the second basic
example, a “precise-increase” property was
expected to be satisfied by the forward scattering
amplitude $T_0(\omega)$ for $\omega$ (or s) tending to infinity.
This “precise-increase” property implied the necessity
of writing the corresponding dispersion relation
[7] for the function $(T_0(\omega) - T_0(0))/\omega^2$: this is
what one calls a “dispersion relation with a
subtraction.” As a matter of fact, the existence of
such restrictive bounds on the total cross sections at 
high energies had been discovered in 1961 by 
M Froissart: his derivation relied basically on the 
use of the unitarity of the scattering operator 
(expressing the quantum principle of conservation 
of probabilities), but also on a strong analyticity 
postulate for the scattering function not implied by 
the general field-theoretical approach (namely the 
Mandelstam domain of “double dispersion relations”).
In the general framework of QFT, Froissart-type
bounds appeared to be closely linked to a
further nontrivial extension of the range of “admissible”
values of t for which $F_t(s)$ can be analytically
continued in a cut-plane or crossing domain. In
fact, the extension of this range to positive (i.e.,
“unphysical”) and even complex values of t, and as
a second step the proof of Froissart-type bounds in
$s(\log s)^2$ for $F_t(s)$ at all these admissible values of t,
were performed in 1966 by A Martin. They rely on
a subtle conspiracy of the analyticity properties 
deduced from the QFT axiomatic framework and of 
positivity and unitarity properties expressing the 
basic Hilbertian structure of the quantum collision 
theory. The consequence of these bounds on the
exact form of the dispersion relations is that, as in
formula [7] of the case $t = 0$, it is justified to write a
(so-called “subtracted”) dispersion relation for
$(F_t(s) - F_t(0) - sF_t'(0))/s^2$: for the general case when
the crossing property replaces the symmetry (S),
such a dispersion relation involves two subtractions
(since $F_t'(0) \neq 0$). Detailed information concerning
the interplay of analyticity and unitarity on the mass 
shell and the derivation of refined forms of disper- 
sion relations and various boundedness properties 
for the scattering functions are given in the book by 
Martin (1969). 

3. Constraints imposed by dispersion relations 
and experimental checks. The conceptual impor- 
tance of dispersion relations incorporating the 
above features (1) and (2) is displayed by such 
a spectacular application as the relationship between
the high-energy behaviors of proton-proton and 
proton-antiproton cross sections. Even though the 
closest forms of relationship between these cross 
sections (e.g., the existence of equal high-energy 
limits) necessitate for their proof some extra 
assumption concerning, for instance, the behavior 
of the ratio between the dispersive and absorptive
parts of the forward scattering amplitude, one can 
speak of an actual model-independent implication 
of general QFT that imposes nontrivial constraints 
on phenomena. Otherwise stated, checking experi- 
mentally the previous type of relationship up to the 
limits of high energies imposed by the present 
technology of accelerators constitutes an indirect, 
but important test of the validity of the general 
principles of QFT. 
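The link between the Froissart-type growth in $s(\log s)^2$ and the number of subtractions (feature 2 above) can be made tangible with a small numerical sketch. It only compares the growth envelope divided by $s$ and by $s^2$; the constant `c` and the function names are illustrative assumptions, not quantities defined in this article.

```python
import math


def froissart_envelope(s, c=1.0):
    """Growth envelope c * s * (log s)^2 of a Froissart-type bound (c is arbitrary)."""
    return c * s * math.log(s) ** 2


def ratio_after_division(s, power):
    """Envelope divided by s**power; it must tend to zero for the contour at infinity to drop out."""
    return froissart_envelope(s) / s ** power


if __name__ == "__main__":
    for s in (1e2, 1e4, 1e8):
        # power 1 keeps growing like (log s)^2, power 2 decays like (log s)^2 / s
        print(s, ratio_after_division(s, 1), ratio_after_division(s, 2))
```

Dividing by $s$ leaves a quantity that still grows like $(\log s)^2$, whereas dividing by $s^2$ gives a decaying one, which is why a relation written for $(F_t(s) - F_t(0) - sF_t'(0))/s^2$ is compatible with the bound.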


As a matter of fact, it has also appeared frequently 
in the literature of high-energy physics during the 
last 40 years that the Froissart bound by itself was 
considered as a key criterion to be satisfied by any 
sensible phenomenological model in particle physics. 
As already stated above, the Froissart bound is one 
of the deepest consequences of the analytic program 
of general QFT, since its derivation also incorpo- 
rates in the most subtle way the quantum principle 
of probability conservation. If only for the
previous basic results, the derivation of dispersion
relations (and, more generally, the results of the
analytic program) in QFT appears as an important
conceptual bridge between a fundamental theoretical
framework of relativistic quantum physics and
the phenomenology of high-energy particle physics. 


Basic Concepts and Main Steps in the 
QFT Derivation of Dispersion Relations 


The rest of this article outlines the derivation of the 
analytic background of dispersion relations for the 
forward scattering amplitudes in the framework of 
axiomatic QFT. After a brief introduction on 
relativistic scattering processes and the problematics 
of causality in particle physics, it gives an account of 
the Wightman axioms and the simplest reduction 
formula which relates the forward scattering ampli- 
tude to a retarded product of the field operators. 
Then it describes how the latter can be used for 
justifying a certain type of analyticity domain for the 
forward scattering functions, namely a crossing 
domain or in the best cases a cut-plane in the 
squared energy variable s. This is the basic result 
that allows one to write dispersion relations (or 
quasi-dispersion-relations) at t= 0; the exact form of 
the latter, including at most two subtractions, relies 
on the use of Hilbertian positivity and of the 
unitarity of the scattering operator. 


Relativistic Quantum Scattering as a Phenomenon 
of Linear Response 


Collisions of quantum particles may be seen as
phenomena of linear response, but in a way which
differs greatly from what has been previously
described.


Particles in Minkowskian geometry Each state of a
relativistic classical particle with mass m is characterized
by its energy-momentum vector or
4-momentum $p = (p_0, \vec{p})$ satisfying the mass-shell
condition $p^2 \equiv p_0^2 - \vec{p}^{\,2} = m^2$ (in units such that
$c = 1$). In view of the condition of positivity of the
energy, $p_0 > 0$, the “physical mass shell” thus coincides
with the positive sheet $H_m^+$ of the mass
hyperboloid $H_m$ with equation $p^2 = m^2$.


The set of all energy-momentum configurations
characterizing the collisions of two relativistic classical
particles with initial (resp. final) 4-momenta
$p_1, p_2$ (resp. $p_1', p_2'$) is the mass-shell manifold $\mathcal{M}$
defined by the conditions

$$p_i^2 = p_i'^2 = m^2, \quad p_{i,0} > 0, \quad p_{i,0}' > 0, \quad i = 1, 2$$

$$p_1 + p_2 = p_1' + p_2'$$

where the latter equation expresses the relativistic
law of total energy-momentum conservation. $\mathcal{M}$ is
an eight-dimensional manifold, invariant under the
(six-dimensional) Lorentz group: the orbits of this
group that constitute a foliation of $\mathcal{M}$ are parametrized
by two variables, namely the squared total
energy $s = (p_1 + p_2)^2 = (p_1' + p_2')^2$ and the squared
momentum transfer $t = (p_1 - p_1')^2 = (p_2 - p_2')^2$ (or
$u = (p_1 - p_2')^2 = 4m^2 - s - t$). In these variables,
called the Mandelstam variables, the “physical
region” $\mathcal{P}$ of the collision is represented by the set
of pairs $(s, t)$ (or triplets $(s, t, u)$ with $s + t + u = 4m^2$)
such that $t \le 0$, $u \le 0$, and therefore $s \ge 4m^2$.
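The kinematical statements above are easy to check numerically. The following sketch builds an equal-mass elastic collision and evaluates the Mandelstam variables; the choice of the centre-of-mass frame, the scattering angle `theta`, and all function names are illustrative assumptions. It also checks the laboratory-frame relation $s = 2m^2 + 2m\omega$ for a projectile of energy $\omega$ on a target at rest.

```python
import math


def minkowski_dot(a, b):
    """Minkowskian scalar product a.b = a0*b0 - (spatial dot product)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def mandelstam(p1, p2, p1_out, p2_out):
    """Return (s, t, u) for incoming p1, p2 and outgoing p1_out, p2_out."""
    total = [a + b for a, b in zip(p1, p2)]
    transfer = [a - b for a, b in zip(p1, p1_out)]
    crossed = [a - b for a, b in zip(p1, p2_out)]
    return (minkowski_dot(total, total),
            minkowski_dot(transfer, transfer),
            minkowski_dot(crossed, crossed))


def centre_of_mass_collision(m, q, theta):
    """Equal-mass elastic collision with momentum q and scattering angle theta."""
    e = math.sqrt(m * m + q * q)
    p1 = (e, 0.0, 0.0, q)
    p2 = (e, 0.0, 0.0, -q)
    p1_out = (e, q * math.sin(theta), 0.0, q * math.cos(theta))
    p2_out = (e, -q * math.sin(theta), 0.0, -q * math.cos(theta))
    return p1, p2, p1_out, p2_out


def lab_frame_s(m, omega):
    """s for a projectile of energy omega hitting an equal-mass target at rest."""
    p1 = (omega, 0.0, 0.0, math.sqrt(omega * omega - m * m))
    p2 = (m, 0.0, 0.0, 0.0)
    total = [a + b for a, b in zip(p1, p2)]
    return minkowski_dot(total, total)


if __name__ == "__main__":
    s, t, u = mandelstam(*centre_of_mass_collision(1.0, 2.0, 0.7))
    print("s, t, u =", s, t, u, " s + t + u =", s + t + u)
    print("lab frame s =", lab_frame_s(1.0, 3.0))
```

Setting `theta = 0` gives $t = 0$, which is the forward configuration discussed throughout this article.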

Correspondingly, each state of a relativistic quantum
particle with mass m is characterized by a wave
packet $\hat f(p)$ on $H_m^+$, which is an element of unit norm
of $L^2(H_m^+; \mu_m(p))$, with $\mu_m(p) = d\vec{p}/(\vec{p}^{\,2} + m^2)^{1/2}$. In
Minkowskian spacetime with coordinates $x = (x_0, \vec{x})$,
any such state is represented by a wave function $f(x)$
whose Fourier transform is the tempered distribution
(with support in $H_m^+$) $\hat f(p) \times \delta(p^2 - m^2)$: $f(x)$ is a
positive-energy solution of the Klein-Gordon equation
$(\partial^2/\partial x_0^2 - \Delta_{\vec{x}} + m^2) f(x) = 0$. A free two-particle
state is a symmetric wave packet $\hat f(p_1, p_2)$ on $H_m^+ \times H_m^+$
in the Hilbert space $L^2(H_m^+ \times H_m^+; \mu_m \otimes \mu_m)$.


Scattering kernels as response kernels: distribution
character While the input to be considered is a free
wave packet $\hat f_{in}(p_1, p_2)$ on $H_m^+ \times H_m^+$, representing the
preparation of an initial two-particle state, the output
corresponds to the detection of a final two-particle
state also characterized by a wave packet $\hat g_{out}(p_1', p_2')$
on $H_m^+ \times H_m^+$. In quantum mechanics, linearity is
linked to the “superposition principle” of states,
which allows one to state that collisions are described
by a certain bilinear form $(\hat f_{in}, \hat g_{out}) \to S(\hat f_{in}, \hat g_{out})$,
called the “scattering matrix.” This bilinear form is
bicontinuous with respect to the Hilbertian norms of
the wave packets, and it then results from the
Schwartz nuclear theorem that it is represented by a
distribution kernel $S(p_1, p_2; p_1', p_2')$, namely a tempered
distribution with support contained in $\mathcal{M}$, in
such a way that (formally)

$$S(\hat f_{in}, \hat g_{out}) = \int \hat f_{in}(p_1, p_2)\, \hat g_{out}(p_1', p_2')\, S(p_1, p_2; p_1', p_2') \times \mu_m(p_1)\, \mu_m(p_2)\, \mu_m(p_1')\, \mu_m(p_2') \qquad [20]$$

If there were no interaction, $S(\hat f_{in}, \hat g_{out})$ would reduce
to the Hilbertian scalar product $\langle \hat g_{out}, \hat f_{in} \rangle$ in
$L^2(H_m^+ \times H_m^+; \mu_m \otimes \mu_m)$ and the corresponding kernel S
would be the identity kernel

$$I(p_1, p_2; p_1', p_2') = \frac{1}{2}\left[\delta(p_1 - p_1')\,\delta(p_2 - p_2') + \delta(p_1 - p_2')\,\delta(p_2 - p_1')\right]$$

In the general case, the interaction is therefore
described by the scattering kernel $T(p_1, p_2; p_1', p_2') = S(p_1, p_2; p_1', p_2') - I(p_1, p_2; p_1', p_2')$.
The action of T as
a bilinear form (defined in the same way as the
action of S in eqn [20]) may be seen as the quantum
analog of the classical response formula [10]. Note,
however, the difference in the mathematical treatment
of the output: instead of being considered as
the direct response ($f_{out}$) to the input, it is now
explored by Hilbertian duality in terms of detection
wave packets $\hat g_{out}$, in conformity with the principles
of quantum theory. Finally, in view of the invariance
of the collision process under the Lorentz group, the
scattering kernel T is constant along the orbits of
this group in $\mathcal{M}$ and it then defines a distribution
$F(s, t) \equiv T(p_1, p_2; p_1', p_2')$ with support in the physical
region $\mathcal{P}$: this is what is called the scattering
amplitude.


What becomes of causality? One can show that the 
positive-energy solutions of the Klein-Gordon equa- 
tion cannot vanish in any open set of Minkowski 
spacetime; they necessarily spread out in the whole 
spacetime. This makes it impossible to formulate a 
causality condition comparable to eqn [9] in terms 
of the spacetime wave functions $f_{in}$ and $g_{out}$
corresponding to the input and output wave packets
$\hat f_{in}$, $\hat g_{out}$. In this connection, it is, however, appropriate
to note that (after various attempts at “weak
causality conditions”) a certain condition called
“macrocausality” (Iagolnitzer and Stapp 1969; see
the book by Iagolnitzer (1992)) has been shown to
be equivalent to some local properties of analyticity
of the scattering kernel T; but it is not our purpose
to develop that point here for two reasons: (1) the 
interpretation of that condition is rather involved, 
because it integrates a very weak form of causality 
together with the spatial short-range character of the 
strong nuclear interactions between the elementary 
particles; (2) the domains of analyticity obtained are 
far too small with respect to those necessary for
writing dispersion relations. The reason for this 
failure is that the scattering kernel only represents 
an asymptotic quantum observable, in the sense that 
it is intended to describe observations far apart from 
the extremely small spacetime region where the 
particles strongly interact, namely in regions where 
this interaction is asymptotically small. Although 
well adapted to what is actually observed in the 
detection experiments, the concept of scattering 
kernel is not sufficient for describing the funda- 
mental interactions of physics: it must be enriched 
by other theoretical concepts which might explicitly 
take into account the microscopic interactions in 
spacetime. This motivates the introduction of quan- 
tum fields as basic quantities in particle physics. 


Relativistic Quantum Fields: Microcausality and 
the Retarded and Advanced Kernels; Analyticity 
in Complex Energy-Momentum Space 


By an idealization of the concept of quantum
electromagnetic field and a generalization to all
types of microscopic interactions of matter, one
considers that all the phenomena involving such
interactions can be described by fields $\Phi_i(x)$, whose
amplitude can, in principle, be measured in arbitrarily
small regions of Minkowski spacetime. In the
quantum framework, one is thus led to the notion of
local observable $\mathcal{O}$ (emphasized as a basic concept in
the axiomatic approach of Araki, Haag, and
Kastler). In the Wightman field-theoretical framework,
a local observable corresponds to the measuring
process of a weighted average of a field $\Phi_i(x)$
of the form $\mathcal{O} = \Phi_i[f] = \int \Phi_i(x) f(x)\,dx$. In the latter,
$f(x)$ denotes a smooth real-valued test-function with
(arbitrary) compact support K in spacetime; the
observable $\mathcal{O}$ is then said to be localized in K. Each
observable $\mathcal{O} = \Phi_i[f]$ has to be a self-adjoint
(unbounded) operator acting in (a dense domain
of) the Hilbert space $\mathcal{H}$ generated by all the states of
the system of fundamental fields $\{\Phi_i\}$; therefore, the
correct mathematical concept of relativistic quantum
field $\Phi(x)$ is an “operator-valued tempered distribution
on Minkowski spacetime.” Here the additional
“temperateness assumption” is a convenient technical
assumption which in particular allows the
passage to the energy-momentum space by making
use of the Fourier transformation.
In this QFT framework, it is natural to express a
certain form of causality by assuming that two
observables $\Phi(f)$ and $\Phi(f')$ commute if the supports
of $f$ and $f'$ are spacelike-separated regions in
spacetime, which means that no signal with
velocity smaller than or equal to the velocity of light
can propagate from either one of these regions to
the other. This expresses the idea that these two
observables should be independent, that is, “compatible
as quantum observables.” This postulate is
equivalent to the following condition, called
microcausality or local commutativity, and understood
in the sense of operator-valued tempered
distributions:

$$[\Phi(x_1), \Phi(x_1')] = 0 \quad \text{for } (x_1 - x_1')^2 < 0 \qquad [21]$$

where $(x_1 - x_1')^2$ is the squared Minkowskian
pseudonorm of $x = x_1 - x_1' = (x_0, \vec{x})$, namely
$x^2 = x_0^2 - \vec{x}^{\,2}$. It follows that for every admissible
pair of states $\Psi, \Psi'$ in $\mathcal{H}$, the tempered distribution

$$C_{\Psi,\Psi'}(x_1, x_1') = \langle \Psi, [\Phi(x_1), \Phi(x_1')]\, \Psi' \rangle \qquad [22]$$

has its support contained in the union of the sets
$\mathcal{V}^+ : x_1 - x_1' \in \overline{V}^+$ and $\mathcal{V}^- : x_1 - x_1' \in \overline{V}^-$, where $\overline{V}^+$
and $\overline{V}^-$ are, respectively, the closures of the forward
and backward cones $V^+ = \{x = (x_0, \vec{x});\ x_0 > |\vec{x}|\}$,
$V^- = -V^+$ in Minkowski spacetime. It is always
possible to decompose the previous distribution as

$$C_{\Psi,\Psi'}(x_1, x_1') = R_{\Psi,\Psi'}(x_1, x_1') - A_{\Psi,\Psi'}(x_1, x_1') \qquad [23]$$

in such a way that the supports of the distributions
$R_{\Psi,\Psi'}(x_1, x_1')$ and $A_{\Psi,\Psi'}(x_1, x_1')$ belong, respectively,
to $\mathcal{V}^+$ and $\mathcal{V}^-$. $R_{\Psi,\Psi'}$ and $A_{\Psi,\Psi'}$ are called,
respectively, retarded and advanced kernels and
they are often formally expressed (for convenience)
as follows:

$$R_{\Psi,\Psi'}(x_1, x_1') = \theta(x_{1,0} - x_{1,0}')\, C_{\Psi,\Psi'}(x_1, x_1')$$

$$A_{\Psi,\Psi'}(x_1, x_1') = -\theta(x_{1,0}' - x_{1,0})\, C_{\Psi,\Psi'}(x_1, x_1')$$

in terms of the Heaviside step function $\theta(t)$ of the
time-coordinate difference $t = x_{1,0} - x_{1,0}'$. For every
pair $(\Psi, \Psi')$, $R_{\Psi,\Psi'}(x_1, x_1')$ appears as a relativistic
generalization of the retarded kernel $R(t - t')$ of eqn
[10]: its support property in spacetime, similar to the
support property of R in time, expresses a relativistic
form of causality, or “Einstein causality.”
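The formal splitting of the commutator kernel into retarded and advanced parts by means of the Heaviside function can be mimicked in a one-dimensional toy setting, keeping only the time-coordinate difference. In the sketch below `commutator_kernel` is an arbitrary smooth function chosen for illustration (an assumption, not the kernel of any field theory); true kernels are distributions, for which the product with $\theta$ is only formal.

```python
import math


def theta(t):
    """Heaviside step function; the value at t == 0 is set to 0 by convention."""
    return 1.0 if t > 0 else 0.0


def commutator_kernel(t):
    """Toy stand-in for C as a function of the time difference t (assumed form)."""
    return math.sin(t) * math.exp(-abs(t))


def retarded(t):
    """R = theta(t) * C, supported in t >= 0."""
    return theta(t) * commutator_kernel(t)


def advanced(t):
    """A = -theta(-t) * C, supported in t <= 0."""
    return -theta(-t) * commutator_kernel(t)


if __name__ == "__main__":
    for t in (-2.0, -0.5, 0.5, 2.0):
        # The last two columns agree: C = R - A.
        print(t, retarded(t), advanced(t), retarded(t) - advanced(t), commutator_kernel(t))
```

The decomposition $C = R - A$ holds pointwise here because the toy kernel vanishes at $t = 0$; in general the splitting is defined only up to distributions supported at the origin.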

There exists a several-variable extension of the
theory of Fourier-Laplace transforms of tempered
distributions which is based on a formula similar to
eqn [11]. We introduce the vector variables
$X = (x_1 + x_1')/2$, $x = x_1 - x_1'$ and a complex
4-momentum $k = p + iq = (k_0, \vec{k})$ as the conjugate
vector variable of x with respect to the Minkowskian
scalar product $k \cdot x = k_0 x_0 - \vec{k} \cdot \vec{x}$, and we define

$$\tilde R^{(c)}_{\Psi,\Psi'}(k, X) = \int R_{\Psi,\Psi'}\left(X + \frac{x}{2}, X - \frac{x}{2}\right) e^{ik \cdot x}\,dx \qquad [24]$$

Since $q \cdot x \ge 0$ for all pairs $(q, x)$ such that $q \in V^+$,
$x \in \overline{V}^+$, it follows that $\tilde R^{(c)}_{\Psi,\Psi'}(k, X)$ is holomorphic
with respect to k in the domain $\mathcal{T}^+$ containing all
$k = p + iq$ such that q belongs to $V^+$. Moreover, in
the limit $q \to 0$ this holomorphic function tends (in
the sense of distributions) to the Fourier transform
$\tilde R_{\Psi,\Psi'}(p, X)$ of $R_{\Psi,\Psi'}(X + x/2, X - x/2)$ with respect
to x. The domain $\mathcal{T}^+$, which is called the “forward
tube,” is the analog of the domain $\mathcal{I}_+$ of the $\omega$-plane;
bounds of moderate type comparable to those of [12]
apply to the holomorphic function $\tilde R^{(c)}_{\Psi,\Psi'}$.
Similarly, the advanced kernel $A_{\Psi,\Psi'}(X + x/2, X - x/2)$
admits a Fourier-Laplace transform $\tilde A^{(c)}_{\Psi,\Psi'}(k, X)$,
which is holomorphic and of moderate growth in the
“backward tube” $\mathcal{T}^-$ containing all $k = p + iq$ such
that q belongs to $V^-$. In view of [23], the Fourier
transform $\tilde C_{\Psi,\Psi'}(p, X)$ of $C_{\Psi,\Psi'}(X + x/2, X - x/2)$
then appears as the difference between the boundary
values of $\tilde R^{(c)}_{\Psi,\Psi'}$ and $\tilde A^{(c)}_{\Psi,\Psi'}$ on the reals (from the
respective domains $\mathcal{T}^+$ and $\mathcal{T}^-$).
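The geometric fact behind this holomorphy statement, namely that $q \cdot x \ge 0$ whenever q lies in the forward cone and x in its closure, so that $|e^{ik \cdot x}| = e^{-q \cdot x} \le 1$ on the support of the retarded kernel, can be checked by random sampling. The sampling scheme and function names below are illustrative assumptions.

```python
import cmath
import math
import random


def minkowski_dot(a, b):
    """Minkowskian scalar product a.b = a0*b0 - (spatial dot product)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def sample_cone_vector(rng, margin):
    """Return (x0, x1, x2, x3) with x0 = |spatial part| + margin, margin >= 0."""
    spatial = [rng.uniform(-1.0, 1.0) for _ in range(3)]
    radius = math.sqrt(sum(c * c for c in spatial))
    return (radius + margin, *spatial)


def plane_wave_modulus(p, q, x):
    """|exp(i k.x)| for k = p + i q, which equals exp(-q.x)."""
    k_dot_x = complex(minkowski_dot(p, x), minkowski_dot(q, x))
    return abs(cmath.exp(1j * k_dot_x))


if __name__ == "__main__":
    rng = random.Random(0)
    q = sample_cone_vector(rng, 0.5)   # strictly inside the forward cone
    x = sample_cone_vector(rng, 0.0)   # on the boundary of the cone
    p = (1.0, 2.0, -3.0, 0.5)          # arbitrary real part of k
    print("q.x =", minkowski_dot(q, x), " |exp(ik.x)| =", plane_wave_modulus(p, q, x))
```

A margin of zero places x on the light cone, the worst case for the inequality; the damping factor is then still at most 1, whatever the real part p.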


The Field-Theoretical Axiomatic Framework and 
the Passage from the Structure Functions of QFT 
to the Scattering Kernels (Case of Forward 
Scattering) 


The postulates (Wightman axioms) Apart from the 
causality postulate, which we have already presented 
above in view of its distinguished role for generating 
analyticity properties in complex energy-momentum 
space, the field-theoretical axiomatic approach to 
collision theory is based on the following postulates 
(for all the fundamental developments of axiomatic 
field theory, the interested reader may consult the 
books by Streater and Wightman (1980) and by Jost 
(1965); see Axiomatic Quantum Field Theory). 


1. There exists a unitary representation $g \to U(g)$ of
the Poincaré group G in the Hilbert space of
states $\mathcal{H}$; in this representation, the abelian
subgroup of translations of space and time has a
Lie algebra whose generators are interpreted as
the four self-adjoint (commuting) operators $P_\mu$ of
total energy-momentum of the system.

2. The quantum field operators $\Phi(x)$ transform
covariantly under that representation; in the
simplest case of scalar fields (considered here),
$\Phi(gx) = U(g)\Phi(x)U(g^{-1})$.

3. There exists a unique state $\Omega$, called the vacuum,
such that the action of all polynomials of field
operators on $\Omega$ generates a dense subset of $\mathcal{H}$;
moreover, $\Omega$ is assumed to be invariant under the
representation U of G, and thereby such that
$P_\mu \Omega = 0$.

4. Spectral condition or positivity of energy in all
physical states. The joint spectrum $\Sigma$ of the
operators $P_\mu$ is contained in the closed forward
cone $\overline{V}^+$ of energy-momentum space. In order to
perform the collision theory of massive particles,
one needs a more detailed “mass-gap assumption”:
$\Sigma$ is the union of the origin O, of one or
several positive sheets of hyperboloid $H^+_{m_i}$, and
of a region $V_M^+$ defined by the conditions $p^2 \ge M^2$,
$p_0 > 0$, with M larger than all the $m_i$.


The Hilbert space $\mathcal{H}$ is correspondingly decomposed
as the direct sum of the vacuum subspace (or
zero-particle subspace) generated by $\Omega$, of subspaces
of stable one-particle states with masses $m_i$ isomorphic
to $L^2(H^+_{m_i}; \mu_{m_i})$, and of a remaining subspace
$\mathcal{H}'$. As a result of the construction of
“asymptotic states,” $\mathcal{H}'$ can be shown to contain
two subspaces $\mathcal{H}'_{in}$ and $\mathcal{H}'_{out}$ generated, respectively,
by N-particle incoming states (with N arbitrary and
$\ge 2$) and by N-particle outgoing states. The collision
operator S is then defined as the partially isometric
operator from $\mathcal{H}'_{out}$ onto $\mathcal{H}'_{in}$, which maps a
reference basis of outgoing states onto the corresponding
basis of incoming states.


An independent postulate: asymptotic completeness 
(see Scattering, Asymptotic Completeness and 
Bound States and Scattering in Relativistic 
Quantum Field Theory: Fundamental Concepts 
and Tools) The theory is said to satisfy the
property of asymptotic completeness if all the states
of $\mathcal{H}$ can be interpreted as superpositions of various
N-particle states (either in the incoming or in the
outgoing state basis), namely if one has
$\mathcal{H}' = \mathcal{H}'_{in} = \mathcal{H}'_{out}$. This property is not implied by the
previous postulates on quantum fields, but its
physical interpretation and its role in the analytic
program are of primary importance (see Scattering
in Relativistic Quantum Field Theory: The Analytic
Program). Let us simply note here that asymptotic
completeness implies as a by-product the unitarity
property of the collision operator S on the full
Hilbert space $\mathcal{H}$ (i.e., $SS^* = S^*S = I$).


Connection between retarded kernel and scattering
kernel for the forward scattering case; a simple
“reduction formula” We consider the scattering of
a particle $\Pi_1$ with mass $m_1$ on a target consisting of
a particle $\Pi_2$ with mass $m_2$ and denote by
$T(p_1, p_2; p_1', p_2')$ the corresponding scattering kernel
(defined similarly as for the case of equal-mass
particles considered earlier). Equations [22]-[24] are
then applied to the case when $\Psi$ and $\Psi'$ coincide
with a one-particle state of $\Pi_2$ at rest, namely with
4-momentum $p_2 = p_2'$ along the time axis:
$p_2 = ((p_2)_0, \vec{0})$, $(p_2)_0 = m_2$. This describes in a simple way
the case of forward scattering, since in view of the
energy-momentum conservation law $p_1 + p_2 = p_1' + p_2'$,
the choice $p_2 = p_2'$ also implies that $p_1 = p_1'$.
(The possibility of restricting the distribution
$T(p_1, p_2; p_1', p_2')$ to such fixed values of the
energy-momenta is shown to be mathematically well
justified.) The advantage of this simple case is that
the corresponding kernels [22], [23] of $(x_1, x_1')$ are
invariant under spacetime translations and therefore
depend only on x (and not on X). We can thus
rewrite eqns [22], [23] with simplified notations as
follows:


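$$C_{p_2}(x) = \left\langle p_2, \left[\Phi\left(\frac{x}{2}\right), \Phi\left(-\frac{x}{2}\right)\right] p_2 \right\rangle = R_{p_2}(x) - A_{p_2}(x) \qquad [25]$$

which can be shown to give correspondingly by
Fourier transformation

$$\tilde C_{p_2}(p) = \langle p_2, \tilde\Phi(p)\tilde\Phi(-p)\, p_2 \rangle - \langle p_2, \tilde\Phi(-p)\tilde\Phi(p)\, p_2 \rangle = \tilde R_{p_2}(p) - \tilde A_{p_2}(p) \qquad [26]$$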


If the particle $\Pi_1$ appears in the asymptotic states of
the field $\Phi$, the scattering kernel $T(p_1, p_2; p_1', p_2')$ is
then given in the forward configurations $p_1 = p_1' \in H^+_{m_1}$,
$p_2 = p_2' \in H^+_{m_2}$, by the following reduction formula
in which $s = (p_1 + p_2)^2$:

$$F_0(s) = T(p_1, p_2; p_1, p_2) = \left[(p_1^2 - m_1^2)\, \tilde R_{p_2}(p_1)\right]_{|p_1 \in H^+_{m_1}} \qquad [27]$$


Analyticity Domains in Energy-Momentum Space: 
From the “Primitive Off-Shell Domains" of QFT to 
the Crossing Manifolds on the Mass Shell 


For simplicity, we shall restrict ourselves to the
consideration of forward scattering amplitudes,
namely to the derivation of crossing analyticity
domains and (quasi-)dispersion relations at $t = 0$
for two-particle collision processes of the form
$\Pi_1 + \Pi_2 \to \Pi_1 + \Pi_2$, $\Pi_1$ and $\Pi_2$ being given massive
particles with arbitrary spins and charges.


The holomorphic function $H_{p_2}(k)$ and its primitive
domain D. Nontriviality of dispersion relations for
the scattering amplitudes As suggested by eqn [24],
we can exploit the analyticity properties of the
Fourier-Laplace transforms of the retarded and
advanced kernels $R_{p_2}$ and $A_{p_2}$: in fact, $\tilde R_{p_2}(p)$ and
$\tilde A_{p_2}(p)$ are, respectively, the boundary values of the
holomorphic functions

$$\tilde R^{(c)}_{p_2}(k) = \int R_{p_2}(x)\, e^{ik \cdot x}\,dx, \qquad \tilde A^{(c)}_{p_2}(k) = \int A_{p_2}(x)\, e^{ik \cdot x}\,dx \qquad [28]$$

from the corresponding domains $\mathcal{T}^+$ and $\mathcal{T}^-$. According
to the reduction formula [27], it is appropriate to
consider correspondingly the functions
$H^+_{p_2}(k) \equiv (k^2 - m_1^2)\tilde R^{(c)}_{p_2}(k)$ and $H^-_{p_2}(k) \equiv (k^2 - m_1^2)\tilde A^{(c)}_{p_2}(k)$, which are
also, respectively, holomorphic in $\mathcal{T}^+$ and $\mathcal{T}^-$. Then
the forward scattering amplitude $F_0(s) = F(s, 0) = T(p_1, p_2; p_1, p_2)$
appears as the restriction to the
hyperboloid sheet $p \in H^+_{m_1}$ of the boundary value
$H^+_{p_2}(p)$ of $H^+_{p_2}(k)$ on the reals.

Moreover, it can be seen that the two boundary
values $H^+_{p_2}(p) = (p^2 - m_1^2)\tilde R_{p_2}(p)$ and
$H^-_{p_2}(p) = (p^2 - m_1^2)\tilde A_{p_2}(p)$ coincide as distributions in the region

$$\mathcal{R} = \{p \in \mathbb{R}^4;\ (p + p_2)^2 < (m_1 + m_2)^2,\ (p - p_2)^2 < (m_1 + m_2)^2\} \qquad [29]$$

This follows from the intermediate expression in
eqn [26] and from the fact that a state of the form
$(p^2 - m_1^2)\tilde\Phi(\pm p)|p_2\rangle$ is a state of energy-momentum
$\pm p + p_2$ and therefore vanishes (in view of the
spectral condition) if $(\pm p + p_2)^2 < (m_1 + m_2)^2$ (here
we also use a simplifying assumption according to
which no one-particle bound state is present in this
channel).

The situation obtained concerning the holomorphic functions $H^+_{p_2}(k)$
and $H^-_{p_2}(k)$ parallels (in complex dimension four) the case of a pair of
holomorphic functions in the upper and lower half-planes whose boundary
values on the reals coincide on a certain interval playing the role of $R$.
As in this one-dimensional case there is a theorem, called the
“edge-of-the-wedge theorem” (see below), which implies that $H^+_{p_2}(k)$ and
$H^-_{p_2}(k)$ have a common analytic continuation $H_{p_2}(k)$: this function
is holomorphic in a domain $D$ which is the union of $T^+$, $T^-$ and of a
complex neighborhood of $R$; $D$ is called the primitive domain of $H_{p_2}(k)$.

Moreover, it follows from the postulate of invariance of the field $\Phi(x)$
under the action of the Poincaré group (see postulate (2)) that the
holomorphic function $H_{p_2}(k)$ depends only on the two complex variables
$\zeta = k^2$ $(= k_0^2 - \vec{k}^2)$ and $k \cdot p_2$, or equivalently
$s = (k + p_2)^2 = \zeta + m_2^2 + 2k \cdot p_2$; it thus defines a
corresponding holomorphic function $H_{p_2}(\zeta, s) = H_{p_2}(k)$ in the
image of $D$ in these variables.

In view of the reduction formula [27], the scattering function $F_0(s)$
should appear as the restriction of the holomorphic function
$H_{p_2}(\zeta, s)$ to the physical mass-shell value $\zeta = m_1^2$. However,
it turns out that the section of $D$ by the complex mass-shell manifold
$M^{(c)}$ with equation $k^2 = m_1^2$ is empty: this geometrical fact is
responsible for the nontriviality of the proof of dispersion relations for
the physical quantity $F_0(s)$ on the mass shell. In fact, the tube
$T^+ \cup T^-$, which constitutes the basic part of the domain $D$ and is
given by the field-theoretical microcausality postulate, is a “purely
off-shell” complex domain, as can be easily checked: if a complex point
$k = p + iq$ is such that $q^2 > 0$, the corresponding squared mass
$\zeta = k^2 = p^2 - q^2 + 2i\, p \cdot q$ is real if and only if
$p \cdot q = 0$, which implies $p^2 < 0$ (i.e., $p$ spacelike) and therefore
$\zeta = p^2 - q^2 < 0$.
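
The “purely off-shell” property of the tube can also be checked numerically. The following is only an illustrative sketch: the helper names, the sampling ranges, and the metric signature $(+,-,-,-)$ are assumptions of the example, not part of the original argument. It samples $q$ with $q^2 > 0$, builds a real $p$ with $p \cdot q = 0$, and verifies that $p^2 \le 0$ and $\zeta = p^2 - q^2 < 0$.

```python
import random

def mdot(a, b):
    """Minkowski product with signature (+, -, -, -) (assumed convention)."""
    return a[0] * b[0] - sum(x * y for x, y in zip(a[1:], b[1:]))

def sample_timelike(rng):
    """Random q with q^2 >= 0.25 > 0, with either sign of q_0."""
    qs = [rng.uniform(-1.0, 1.0) for _ in range(3)]
    norm = sum(x * x for x in qs) ** 0.5
    q0 = (norm + rng.uniform(0.5, 2.0)) * rng.choice([-1.0, 1.0])
    return [q0] + qs

def orthogonal_real_part(q, rng):
    """Random real p, made Minkowski-orthogonal to q by projection."""
    v = [rng.uniform(-3.0, 3.0) for _ in range(4)]
    c = mdot(v, q) / mdot(q, q)
    return [vi - c * qi for vi, qi in zip(v, q)]

def zeta_is_negative_on_tube(trials=1000, seed=0):
    """True if every sampled k = p + iq with real zeta has zeta < 0."""
    rng = random.Random(seed)
    for _ in range(trials):
        q = sample_timelike(rng)
        p = orthogonal_real_part(q, rng)
        if abs(mdot(p, q)) > 1e-8:      # zeta must be real: p.q = 0
            return False
        zeta = mdot(p, p) - mdot(q, q)  # zeta = p^2 - q^2
        if mdot(p, p) > 1e-8 or zeta >= 0.0:
            return False
    return True

print(zeta_is_negative_on_tube())
```

A finite sample of course proves nothing; the check merely illustrates why the mass-shell value $\zeta = m_1^2 > 0$ cannot be reached inside $T^+ \cup T^-$.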


“Off-shell dispersion relations” as a first step

The starting point, which is easy to obtain from the domain $D$, is the
analyticity of the holomorphic function $H_{p_2}(\zeta, s)$ in a cut-plane of
the variable $s$ for all negative values of the squared mass variable $\zeta$.
This cut-plane $\Delta_\zeta$ is always the complement in $\mathbb{C}$ (i.e.,
the complex $s$-plane) of the union of the $s$-cut
($s$ real $\geq (m_1 + m_2)^2$) and of the $u$-cut
($u = 2\zeta + 2m_2^2 - s$ real $\geq (m_1 + m_2)^2$). This analyticity
property thus justifies “off-shell dispersion relations” at fixed negative
values of $\zeta$ for the field-theoretical structure function
$H_{p_2}(\zeta, s)$.

The latter property and the subsequent analysis
concerning the process of analytic continuation of $H_{p_2}$
to positive values of $\zeta$ will be more easily understood
geometrically if one reduces the complex space of $k$ to
a two-dimensional complex space, which is legitimate
in view of the equality $H_{p_2}(k) = H_{p_2}(\zeta, s)$.

Having chosen the $k_0$-axis along $p_2$, we reduce the
orthogonal space coordinates $\vec{k}$ of $k$ to the radial
variable $k_r$. One thus gets the following expressions
of the variables $\zeta$ and $s$ (resp. $u$):

$$\zeta = k_0^2 - k_r^2, \qquad s = \zeta + m_2^2 + 2m_2 k_0 \qquad (\text{resp. } u = \zeta + m_2^2 - 2m_2 k_0)$$
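
As a consistency check on these expressions, here is a short worked computation. It assumes the metric signature $(+,-,-,-)$ and the rest frame $p_2 = (m_2, \vec{0})$ implied by the choice of the $k_0$-axis, and it takes $u = (k - p_2)^2$ as the crossed-channel variable:

```latex
\begin{aligned}
k \cdot p_2 &= m_2 k_0, \\
s &= (k + p_2)^2 = k^2 + p_2^2 + 2\,k \cdot p_2 = \zeta + m_2^2 + 2 m_2 k_0, \\
u &= (k - p_2)^2 = k^2 + p_2^2 - 2\,k \cdot p_2 = \zeta + m_2^2 - 2 m_2 k_0, \\
s + u &= 2\zeta + 2 m_2^2 .
\end{aligned}
```

The last line agrees with the relation $u = 2\zeta + 2m_2^2 - s$ used to locate the $u$-cut.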


Then we can write $H_{p_2}(\zeta, s) = H_{p_2}(k_0, k_r) = H_{p_2}(k_0, -k_r)$,
and describe the image $D_r$ of the domain $D$ in the variables
$k = (k_0, k_r) = p + iq$ as $T_r^+ \cup T_r^- \cup N(R_r)$, where:


1. $T_r^\pm$ is defined by the condition $q^2 = q_0^2 - q_r^2 > 0$,
$q_0 > 0$ or $q_0 < 0$,

2. $N$ is a complex neighborhood of the real region
$R_r$ defined as follows. Let $h_s^+$, $h_u^-$ be the two
branches of hyperbolae with respective equations:

$$h_s^+ : (p_0 + m_2)^2 - p_r^2 = (m_1 + m_2)^2, \quad p_0 + m_2 > 0$$

$$h_u^- : (p_0 - m_2)^2 - p_r^2 = (m_1 + m_2)^2, \quad p_0 - m_2 < 0$$

Then $R_r$ is the intersection of the region situated
below $h_s^+$ and of the region situated above $h_u^-$.


Let us now consider any complex hyperbola $h^{(c)}[\zeta]$ with equation
$k^2 = k_0^2 - k_r^2 = \zeta$. On such a complex curve either one of the
variables $k_0$ or $s$ or $u$ is a good parameter for holomorphic functions
which are even in $k_r$, like $H_{p_2}(k_0, k_r)$. If $\zeta$ is real, any
complex point $k = p + iq$ of $h^{(c)}[\zeta]$ is such that $p^2$ and $q^2$
have opposite signs (since $p \cdot q = 0$). Therefore, the sign of $q^2$ is
always opposite to the sign of $\zeta$ $(= p^2 - q^2)$: if $\zeta$ is
negative, all the complex points of $h^{(c)}[\zeta]$ thus belong to
$T_r^+ \cup T_r^-$; the union of all these points with the real points of
$h^{(c)}[\zeta]$ in $R_r$ is therefore a subset of $D_r$, which is
represented in the complex plane of $s$ by the cut-plane $\Delta_\zeta$. The
function $H_{p_2}(\zeta, s)$ is therefore analytic (and univalent) in
$\Delta_\zeta$ for each $\zeta < 0$. Moreover, the existence of moderate
bounds of type [12] on $H_{p_2}$ in $D$ (resulting from the temperateness
assumption) then implies the validity of dispersion relations (with
subtractions) for $H_{p_2}(\zeta, s)$ in $\Delta_\zeta$.


The problem of analytic completion to the complex
mass-shell hyperbola $h^{(c)}[m_1^2]$: what is provided by
the Jost-Lehmann-Dyson domain

A basic fact in complex geometry in $n$ variables, with $n \geq 2$, is the
existence of a distinguished class of domains, called holomorphy domains:
for each domain $U$ in this class, there exists at least one function which
is holomorphic in $U$ and cannot be analytically continued at any point of
the boundary of $U$. In one dimension, every domain is a holomorphy domain.
In dimension larger than one, a general domain $U$ is not a holomorphy
domain, but it admits a holomorphy envelope $\hat{U}$, which is a holomorphy
domain containing $U$, such that every function holomorphic in $U$ admits an
analytic continuation in $\hat{U}$.

It turns out that the domain $D_r$ (considered above
in the last subsection) is not a holomorphy domain;
its holomorphy envelope $\hat{D}_r$ (obtained geometrically
by Bros, Messiah, and Stora in 1961) coincides with
a domain introduced by Jost-Lehmann (1957) and
Dyson (1958) by methods of wave equations. This
domain can be characterized as the union of $D_r$ with
all the complex points of all the hyperbolae with
equations $(k_0 - a)^2 - (k_r - b)^2 = c^2$ (for all $a, b, c$
real, including the complex straight lines for which
$c = 0$) both of whose branches have a nonempty
intersection with the real region $R_r$.

In particular, one easily sees that all the hyperbolae $h^{(c)}[\zeta]$
with $0 \leq \zeta < m_1^2$ belong to the previous class. It follows that
for any $\zeta$ in this positive interval, the function $H_{p_2}(\zeta, s)$
can still be analytically continued as a holomorphic function of $s$ in the
cut-plane $\Delta_\zeta$ and thereby satisfies the corresponding dispersion
relations.

The physical mass shell hyperbola $h^{(c)}[m_1^2]$ thus appears as a
limiting case of the previous family (for $\zeta$ tending to $m_1^2$ from
below). The analyticity of $H_{p_2}(m_1^2, s)$ in $\Delta_{m_1^2}$ can then
be justified provided one knows that this function is analytic at at least
one point of $\Delta_{m_1^2}$: but this additional information results from
a more thorough exploitation of the analyticity properties resulting from
the QFT postulates. This will now be briefly outlined.


Further information coming from the four-point
function in complex momentum space

It is possible to obtain further analyticity properties of
$H_{p_2}(\zeta, s) = H_{p_2}(k)$ by considering the latter as the
restriction to the submanifold $k_1 = -k_3 = k$; $k_2 = -k_4 = p_2$ of a
master analytic function $H_4(k_1, k_2, k_3, k_4)$, called the four-point
function of the field $\Phi$ in complex energy-momentum space (see
Scattering in Relativistic Quantum Field Theory: The Analytic Program).
This function is holomorphic in a well-defined primitive domain $D_4$ of
the linear submanifold $k_1 + k_2 + k_3 + k_4 = 0$. It is then possible to
compute some local parts situated near the reals of the holomorphy envelope
of $D_4$, which implies, as a by-product, that the function
$H_{p_2}(\zeta, s)$ can be analytically continued in a set $\Sigma$ of the
form

$$\Sigma = \{(\zeta, s);\ \zeta \in \delta,\ s \in V_{s_1}(\zeta)\} \cup \{(\zeta, s);\ \zeta \in \delta,\ u = 2\zeta + 2m_2^2 - s \in V_{u_1}(\zeta)\}$$ [30]


with the following specifications: 


1. $\delta$ is a domain in the $\zeta$-plane, which is a complex
neighborhood of a real interval of the form
$-a < \zeta < M_1^2$; here $M_1$ denotes a spectral mass
threshold in the theory such that $M_1 > m_1$;

2. for each $\zeta$, $V_{s_1}(\zeta)$ (resp. $V_{u_1}(\zeta)$) is a cut-
neighborhood in the $s$-plane of the real half-line
$s \geq s_1$ (resp. of the half-line $u = 2\zeta + 2m_2^2 - s$
real $\geq u_1$); $s_1$ and $u_1$ denote appropriate real
numbers independent of $\zeta$.


The final analytic completion: crossing domains on
$h^{(c)}[m_1^2]$. Dispersion relations for $\pi^0$-$\pi^0$ meson
scattering and “quasi-dispersion-relations” for
proton-proton scattering

We now wish to describe briefly the final step of analytic completion,
which displays the existence of a “quasi-cut-plane domain” in $s$ for the
function $H_{p_2}(m_1^2, s)$, even in the more general case when the $s$-cut
and $u$-cut are associated with different scattering channels, whose
respective mass thresholds $s = M_{12}^2$ and $u = M_{1\bar{2}}^2$ are
unequal. This general situation may occur as soon as one charged particle
$\Pi_2$ of the $s$-channel is replaced by the corresponding antiparticle
$\bar{\Pi}_2$ in the $u$-channel, in contrast with the case of neutral
particles (like the $\pi^0$ meson) which coincide with their own
antiparticles. Here it is important to note that the two real branches
$h^+[m_1^2]$ and $h^-[m_1^2]$ of the mass shell hyperbola $h^{(c)}[m_1^2]$
correspond, respectively, to the physical region of the “direct scattering
channel” of the reaction $\Pi_1 + \Pi_2 \to \Pi_1 + \Pi_2$ with squared
total energy $s$, and to the physical region of the “crossed scattering
channel” of the reaction $\Pi_1 + \bar{\Pi}_2 \to \Pi_1 + \bar{\Pi}_2$ with
squared total energy $u$. A typical and important example is the case of
proton-proton scattering in the $s$-channel, where $M_{12}$ equals twice
the mass $m$ $(= m_1 = m_2)$ of the proton, while the corresponding
$u$-channel refers to the proton-antiproton scattering, whose threshold
$M_{1\bar{2}}$ equals twice the mass $\mu$ of the $\pi$ meson.

In that general case, the analysis of the subsection
“‘Off-shell dispersion relations’ as a first step” still
applies, so that the function $H_{p_2}(\zeta, s)$ is always
analytic in a set of the form

$$S_a = \{(\zeta, s);\ -a < \zeta < 0,\ s \in \Delta_\zeta\}$$ [31]

Then, the additional information described above in
the last subsection allows one to use the following
crucial property of analytic completion, which we
call the crossing lemma.

Crossing lemma. If a function $G(\zeta, s)$ is holomorphic
in a domain which contains the union of the sets $\Sigma$
and $S_a$ (see eqns [30] and [31]), then it admits an
analytic continuation in a set of the following form:

$$\{(\zeta, s);\ \zeta \in \delta,\ s \in \Delta_\zeta;\ |s - \zeta - m_2^2| = |u - \zeta - m_2^2| > R(\zeta)\}$$


By applying this property to the function $H_{p_2}(\zeta, s)$ and
restricting $\zeta$ to the mass-shell value $m_1^2$, which belongs to
$\delta$, one obtains the analyticity of the scattering function
$F_0(s) = H_{p_2}(m_1^2, s)$ in a crossing domain of the complex mass shell
hyperbola $h^{(c)}[m_1^2]$: the crossing between the two physical regions
$h^+[m_1^2]$ ($s \geq M_{12}^2$) and $h^-[m_1^2]$ ($u \geq M_{1\bar{2}}^2$)
is ensured by a complex domain of $h^{(c)}[m_1^2]$ whose image in the
$s$-plane is the “cut-neighborhood of infinity”
$\{s;\ s \in \Delta_{m_1^2},\ |s - m_1^2 - m_2^2| = |u - m_1^2 - m_2^2| > R(m_1^2)\}$.
Note that the relevant boundary values of $F_0$ for obtaining the
scattering amplitudes of the two collision processes with respective
physical regions $h^+[m_1^2]$ and $h^-[m_1^2]$ have to be taken from the
respective sides $\mathrm{Im}\, s > 0$ and
$\mathrm{Im}\, u = -\mathrm{Im}\, s > 0$ of the corresponding $s$- and
$u$-cuts.


It is only for the neutral case, where
$M_{12} = M_{1\bar{2}} = m_1 + m_2$, that a more favorable scenario occurs,
as explained earlier: in this case, the interval $\{\zeta \in\, ]-a, 0[\}$
of the set $S_a$ is replaced by $\{\zeta \in\, ]-a, m_1^2[\}$, so that the
whole cut-plane domain $\Delta_{m_1^2}$ is obtained in the result of the
previous crossing lemma. The scattering amplitudes of $\pi^0$-$\pi^0$ meson
scattering and of $\pi^0$ meson-proton scattering enjoy this property and,
therefore, satisfy genuine dispersion relations in which the scattering
function is even (see the second basic example described at the beginning
of this article). In the general case of crossing domains obtained above,
corresponding Cauchy integral relations have been written and used under
the name of “quasi-dispersion-relations.”


Complementary results

Some comments can now be added concerning the passage from the purely
geometrical results (i.e., analyticity domains) described above to the
writing of precise (quasi-)dispersion relations with two subtractions:

Polynomial bounds and dispersion relations with $N$ subtractions

The previous methods of analytic completion also allow one to control the
bounds at infinity in the relevant complex domains. As has been noticed
after eqn [24], the Fourier-Laplace transforms of the retarded and advanced
kernels, and thereby the holomorphic functions $H^\pm_{p_2}(k)$ discussed at
the start of this section, are bounded at most by a power of a suitable
norm of $k$ in their respective tubes $T^\pm$. Correspondingly, the
holomorphic function $H_{p_2}(k)$ (resp. $H_{p_2}(k_0, k_r)$) admits the
same type of bound in its primitive analyticity domain $D$ (resp. $D_r$).
These bounds are a consequence of the tempered distribution character of
the structure functions of the fields which is built into the Wightman
field-theoretical framework. Then it can be checked that in the holomorphy
envelope $\hat{D}_r$ of $D_r$, and thereby in the cut-plane (or crossing)
domains obtained in the intersection of $\hat{D}_r$ and of the complex mass
shell $h^{(c)}[m_1^2]$, the same type of power bound is still valid:
$F_0(s)$ is therefore bounded by some power $|s|^{N-1}$ of $|s|$ and thus
satisfies a (quasi-)dispersion relation with $N$ subtractions. The same
type of argument holds for all the similar cut-domains (or crossing
domains) in $s$ obtained for $F_t(s)$ for all negative values of $t$.

It is also worthwhile to mention that a similar 
remarkable (since not at all predictable) result was 
also obtained in the Haag, Kastler, and Araki frame- 
work of algebraic QFT (Epstein, Glaser, Martin, 
1969; see Scattering in Relativistic Quantum Field 
Theory: The Analytic Program for further comments). 

In this connection, one can also mention a more
recent result. In the Buchholz-Fredenhagen axiomatic
approach to charged fields (1982), in which
locality is replaced by the more general notion of
“stringlike locality” (see Algebraic Approach to
Quantum Field Theory, Axiomatic Quantum Field
Theory, and Scattering in Relativistic Quantum
Field Theory: Fundamental Concepts and Tools), a
proof of forward dispersion relations has again been
obtained (Bros, Epstein, 1994).

The extension of the analyticity domains by
positivity and the derivation of bounds by unitarity
(Martin 1966; see the book by Martin (1969)). The
following ingredients have been used:


1. Positivity conditions on the absorptive part of
$F(s,t)$, which are expressed by the infinite set of
inequalities $(d/dt)^n\, \mathrm{Im}\, F(s,t)|_{t=0} \geq 0$ (for all
integers $n$),

2. The existence of a two-dimensional complex
neighborhood of some point $(s = s_0, t = 0)$ in the
analyticity domain resulting from QFT.


The following results have then been obtained:


(a) It is justified to differentiate the forward (subtracted)
dispersion relations with respect to $t$ at
any order.

(b) $F(s,t)$ can be analytically continued in a fixed
circle $|t| < t_{\max}$ for all values of $s$. The latter
implies the extension of dispersion relations in $s$
to positive (and complex) values of $t$.

(c) In a last step, the use of unitarity conditions for
the “partial waves” $f_\ell(s)$ of $F(s,t)$ (see Scattering
in Relativistic Quantum Field Theory: The
Analytic Program) allows one to obtain
Froissart-type bounds on the scattering amplitudes
and thereby to justify the writing of
(quasi-)dispersion relations with at most two
subtractions for all the admissible values of $t$.


See also: Algebraic Approach to Quantum Field Theory;
Axiomatic Quantum Field Theory; Perturbation Theory
and its Techniques; Scattering in Relativistic Quantum
Field Theory: The Analytic Program; Scattering,
Asymptotic Completeness and Bound States; Scattering
in Relativistic Quantum Field Theory: Fundamental
Concepts and Tools.


Further Reading


Haag R (1992) Local Quantum Physics. Berlin: Springer.

Iagolnitzer D (1992) Scattering in Quantum Field Theories: The
Axiomatic and Constructive Approaches. Princeton Series in
Physics. Princeton: Princeton University Press.

Jost R (1965) The General Theory of Quantized Fields.
Providence, RI: American Mathematical Society.

Klein L (ed.) (1961) Dispersion Relations and the Abstract
Approach to Field Theory. New York: Gordon and Breach.

Martin A (1969) Scattering Theory: Unitarity, Analyticity and
Crossing, Lecture Notes in Physics. Berlin: Springer.

Nussenzveig HM (1972) Causality and Dispersion Relations.
New York: Academic Press.

Streater RF and Wightman AS (1964, 1980) PCT, Spin and
Statistics, and all that. Princeton: Princeton University Press.

Vernov YuS (1996) Dispersion Relations in the Historical Aspect,
IHEP Publications, Protvino Conf.: Fundamental Problems of
High Energy Physics and Field Theory.


Dissipative Dynamical Systems of Infinite Dimension


Introduction


A dynamical system (DS) is a system which evolves
with respect to time. To be more precise, a DS
$(S(t), \Phi)$ is determined by a phase space $\Phi$ which
consists of all possible values of the parameters
describing the state of the system and an evolution
map $S(t): \Phi \to \Phi$ that allows one to find the state of
the system at time $t > 0$ if the initial state at $t = 0$ is
known. Very often, in mechanics and physics, the
evolution of the system is governed by systems of
differential equations. If the system is described by
ordinary differential equations (ODEs),

$$\frac{d}{dt} y(t) = F(t, y(t)), \quad y(0) = y_0, \quad y(t) := (y_1(t), \ldots, y_N(t))$$ [1]


for some nonlinear function $F: \mathbb{R}_+ \times \mathbb{R}^N \to \mathbb{R}^N$, we
have a so-called finite-dimensional DS. In that case,
the phase space $\Phi$ is some (invariant) subset of $\mathbb{R}^N$
and the evolution operator $S(t)$ is defined by

$$S(t) y_0 := y(t), \quad y(t) \text{ solves [1]}$$ [2]

We also recall that, in the case where eqn [1] is
autonomous (i.e., does not depend explicitly on the
time), the evolution operators $S(t)$ generate a
semigroup on the phase space $\Phi$, that is,

$$S(t_1 + t_2) = S(t_1) \circ S(t_2), \quad t_1, t_2 \in \mathbb{R}_+$$ [3]
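
The semigroup identity [3] can be verified on a concrete autonomous equation. The sketch below uses the scalar logistic ODE $y' = y(1 - y)$, chosen here only because its solution is explicit; neither the equation nor the function names come from the text.

```python
import math

def S(t, y0):
    """Evolution operator of the autonomous logistic ODE y' = y (1 - y):
    value at time t >= 0 of the solution started from y0 in [0, 1]."""
    e = math.exp(t)
    return y0 * e / (1.0 - y0 + y0 * e)

def semigroup_defect(t1, t2, y0):
    """|S(t1 + t2) y0 - S(t1) S(t2) y0|; zero up to rounding for a semigroup."""
    return abs(S(t1 + t2, y0) - S(t1, S(t2, y0)))

for y0 in (0.1, 0.5, 0.9):
    print(y0, semigroup_defect(0.7, 1.3, y0))
```

For a nonautonomous right-hand side $F(t, y)$ the state at time $t_1 + t_2$ depends on the starting time as well as on the elapsed time, so a one-parameter family $S(t)$ no longer satisfies [3] in general.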
Now, in the case of a distributed system whose
initial state is described by functions $u_0 = u_0(x)$
depending on the spatial variable $x$, the evolution
is usually governed by partial differential equations
(PDEs) and the corresponding phase space $\Phi$ is some
infinite-dimensional function space (e.g., $\Phi := L^2(\Omega)$
or $\Phi := L^\infty(\Omega)$ for some domain $\Omega \subset \mathbb{R}^n$). Such DSs
are usually called infinite dimensional.

The qualitative study of DSs of finite dimensions 
goes back to the beginning of the twentieth century, 
with the pioneering works of Poincaré on the N- 
body problem (one should also acknowledge the 
contributions of Lyapunov on the stability and of 
Birkhoff on the minimal sets and the ergodic 
theorem). One of the most surprising and significant 
facts discovered at the very beginning of the theory 
is that even relatively simple equations can generate 
very complicated chaotic behaviors. Moreover, these 
types of systems are extremely sensitive to initial 
conditions (the trajectories with close but different 
initial data diverge exponentially). Thus, in spite of 
the deterministic nature of the system (we recall that 
it is generated by a system of ODEs, for which we 
usually have the unique solvability theorem), its 
temporal evolution is unpredictable on timescales 
larger than some critical time $T_0$ (which depends
obviously on the error of approximation and on the
rate of divergence of close trajectories) and can
show typical stochastic behaviors. To the best of our
knowledge, one of the first ODEs for which such
types of behaviors were established is the physical
pendulum parametrically perturbed by time-periodic
external forces,

$$y''(t) + \sin(y(t))\,(1 + \varepsilon \sin(\omega t)) = 0$$ [4]

where $\omega$ and $\varepsilon > 0$ are physical parameters. We also
mention the more recent (and more relevant for our
topic) famous example of the Lorenz system which is
defined by the following system of ODEs in $\mathbb{R}^3$:

$$x' = \sigma(y - x), \qquad y' = -xz + rx - y, \qquad z' = xy - bz$$ [5]

where $\sigma$, $r$, and $b$ are some parameters. These
equations are obtained by truncation of the 
Navier-Stokes equations and give an approximate 
description of a horizontal fluid layer heated from 
below. The warmer fluid formed at the bottom 
tends to rise, creating convection currents. This is 
similar to what happens in the Earth’s atmosphere. 
For a sufficiently intense heating, the time evolution 
has a sensitive dependence on the initial conditions, 
thus representing a very irregular and chaotic 


convection. This fact was used by Lorenz to justify 
the so-called “butterfly effect,” a metaphor for the 
imprecision of weather forecast. 
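
The sensitive dependence on initial conditions in the Lorenz system [5] is easy to observe numerically. The sketch below is illustrative only: the parameter values $\sigma = 10$, $r = 28$, $b = 8/3$ are the classical choice and are an assumption here, as are the fourth-order Runge-Kutta scheme, the step size, and the initial data.

```python
def lorenz(state, sigma=10.0, r=28.0, b=8.0 / 3.0):
    """Right-hand side of the Lorenz system [5] (assumed classical parameters)."""
    x, y, z = state
    return (sigma * (y - x), -x * z + r * x - y, x * y - b * z)

def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt / 6.0 * (a + 2.0 * b_ + 2.0 * c + d)
                 for s, a, b_, c, d in zip(state, k1, k2, k3, k4))

def separation(t_end, delta=1e-8, dt=0.01):
    """Distance at time t_end between two trajectories started delta apart."""
    u = (1.0, 1.0, 1.0)
    v = (1.0 + delta, 1.0, 1.0)
    for _ in range(int(round(t_end / dt))):
        u = rk4_step(lorenz, u, dt)
        v = rk4_step(lorenz, v, dt)
    return sum((a - b_) ** 2 for a, b_ in zip(u, v)) ** 0.5

for t in (0.0, 10.0, 20.0, 30.0):
    print(t, separation(t))
```

The printed separations grow by many orders of magnitude before saturating at the size of the attractor, which is the numerical counterpart of the unpredictability beyond a critical time discussed above.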

The theory of DSs in finite dimensions has been
extensively developed during the twentieth century,
due to the efforts of many famous mathematicians
(such as Anosov, Arnold, LaSalle, Sinai, Smale, etc.)
and, nowadays, much is known about the chaotic
behaviors in such systems, at least in low dimensions.
In particular, it is known that, very often, the
trajectories of a chaotic system are localized, up to a
transient process, in some subset of the phase space
having a very complicated fractal geometric structure
(e.g., locally homeomorphic to the Cartesian
product of $\mathbb{R}^m$ and some Cantor set) which, thus,
accumulates the nontrivial dynamics of the system
(the so-called strange attractor). The chaotic
dynamics on such sets are usually described by
symbolic dynamics generated by Bernoulli shifts on
the space of sequences. We also note that, nowadays,
a mathematician has a large number of
different concepts and methods for the extensive
study of concrete chaotic DSs in finite dimensions.
In particular, we mention here different types of
bifurcation theories (including the KAM theory and
the homoclinic bifurcation theory with related
Shilnikov chaos), the theory of hyperbolic sets,
stochastic description of deterministic processes,
Lyapunov exponents and entropy theory, dynamical
analysis of time series, etc.

We now turn to infinite-dimensional DSs gener- 
ated by PDEs. A first important difficulty which 
arises here is related to the fact that the analytic 
structure of a PDE is essentially more complicated 
than that of an ODE and, in particular, we do not 
have in general the unique solvability theorem as for 
ODEs, so that even finding the proper phase space 
and the rigorous construction of the associated DS 
can be a highly nontrivial problem. In order to 
indicate the level of difficulties arising here, it 
suffices to recall that, for the three-dimensional 
Navier-Stokes system (which is one of the most 
important equations of mathematical physics), the 
required associated DS has not been constructed yet. 
Nevertheless, there exists a large number of equa- 
tions for which the problem of the global existence 
and uniqueness of a solution has been solved. Thus, 
the question of extending the highly developed 
finite-dimensional DS theory to infinite dimensions 
arises naturally. 

One of the first and most significant results in that
direction was the development of the theory of
integrable Hamiltonian systems in infinite dimensions
and the explicit resolution (by inverse-scattering
methods) of several important conservative equations
of mathematical physics (such as the Korteweg-de
Vries (and the generalized Kadomtsev-Petviashvili
hierarchy), the sine-Gordon, and the nonlinear
Schrödinger equations). Nevertheless, it is worth
noting that integrability is a very rare phenomenon,
even among ODEs, and this theory is clearly
insufficient to understand the dynamics arising in
PDEs. In particular, there exist many important
equations which are essentially out of reach of this
theory.

One of the most important classes of 
such equations consists of the so-called dissipative 
PDEs which are the main subject of our study. As 
hinted by this denomination, these systems exhibit 
some energy dissipation process (in contrast to 
conservative systems for which the energy is 
preserved) and, of course, in order to have nontrivial 
dynamics, these models should also account for the 
energy income. Roughly speaking, the complicated 
chaotic behaviors in such systems usually arise from 
the interaction of the following mechanisms: 


1. energy dissipation in the higher part of the 
Fourier spectrum; 

2. external energy income in its lower part; 

3. energy flux from lower to higher Fourier modes 
provided by the nonlinear terms of the equation. 


We chose not to give a rigorous definition of a 
dissipative system here (although the concepts of 
energy dissipation and related dissipative systems 
are more or less obvious from the physical point of 
view, they seem too general to have an adequate 
mathematical definition). Instead, we only indicate 
several basic classes of equations of mathematical 
physics which usually exhibit the above behaviors. 

The first example is, of course, the Navier-Stokes
system, which describes the motion of a viscous
incompressible fluid in a bounded domain $\Omega$ (we
will only consider here the two-dimensional case
$\Omega \subset \mathbb{R}^2$, since the adequate formulation in three
dimensions is still an open problem):

$$\partial_t u + (u, \nabla_x) u = \nu \Delta_x u - \nabla_x p + g(x), \qquad \mathrm{div}\, u = 0, \quad u|_{t=0} = u_0, \quad u|_{\partial\Omega} = 0$$ [6]

Here, $u(t,x) = (u_1(t,x), u_2(t,x))$ is the unknown
velocity vector, $p = p(t,x)$ is the unknown pressure,
$\Delta_x$ is the Laplacian with respect to $x$, $\nu > 0$ and $g$ are
given kinematic viscosity and external forces,
respectively, and $(u, \nabla_x) u$ is the inertial term
($[(u, \nabla_x) u]_i = \sum_{j=1}^{2} u_j \partial_{x_j} u_i$, $i = 1, 2$). The unique global
solvability of [6] has been proved by Ladyzhenskaya.
Thus, this equation generates an infinite-dimensional
DS in the phase space $\Phi$ of divergence-free square-
integrable vector fields.


The second example is the damped nonlinear
wave equation in $\Omega \subset \mathbb{R}^n$:

$$\partial_t^2 u + \gamma \partial_t u - \Delta_x u + f(u) = 0, \qquad u|_{\partial\Omega} = 0, \quad u|_{t=0} = u_0, \quad \partial_t u|_{t=0} = u_0'$$ [7]

which models, for example, the dynamics of a
Josephson junction driven by a current source
(sine-Gordon equation). It is known that, under
natural sign and growth assumptions on the nonlinear
interaction function $f$, this equation generates
a DS in the energy phase space $E$ of pairs of
functions $(u, \partial_t u)$ such that $\partial_t u$ and $\nabla_x u$ are square
integrable.

The last class of equations that we will consider
here consists of reaction-diffusion systems in a
domain $\Omega \subset \mathbb{R}^n$:

$$\partial_t u = a \Delta_x u - f(u), \qquad u|_{t=0} = u_0$$ [8]

(endowed with Dirichlet ($u|_{\partial\Omega} = 0$) or Neumann
($\partial_n u|_{\partial\Omega} = 0$) boundary conditions), which describe
some chemical reaction in $\Omega$. Here, $u = (u^1, \ldots, u^N)$
is an unknown vector-valued function which
describes the concentrations of the reactants, $f(u)$ is
a given interaction function, and $a$ is a diffusion
matrix. It is known that, under natural assumptions
on f and a, these equations also generate an infinite- 
dimensional DS, for example, in the phase space 
$\Phi := [L^\infty(\Omega)]^N$.
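
To illustrate how an equation of type [8] generates a dissipative evolution, the following sketch discretizes a scalar one-dimensional instance with Dirichlet boundary conditions. Everything specific in it is an assumption made for the illustration: the nonlinearity $f(u) = u^3 - u$, the interval $(0,1)$, the diffusion coefficient, and the explicit finite-difference scheme.

```python
import math

def f(u):
    """Hypothetical nonlinearity f(u) = u^3 - u (dissipative for large |u|)."""
    return u ** 3 - u

def step(u, a, dx, dt):
    """One explicit Euler step of u_t = a u_xx - f(u) with u = 0 on the boundary."""
    n = len(u)
    new = [0.0] * n
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        lap = (left - 2.0 * u[i] + right) / dx ** 2
        new[i] = u[i] + dt * (a * lap - f(u[i]))
    return new

def evolve(u0, t_end, a=0.01, dt=0.005):
    """Discrete analogue of S(t_end) u0 on (0, 1); u0 holds the interior nodes."""
    dx = 1.0 / (len(u0) + 1)
    assert a * dt / dx ** 2 <= 0.5, "explicit scheme stability condition"
    u = list(u0)
    for _ in range(int(round(t_end / dt))):
        u = step(u, a, dx, dt)
    return u

def sup_norm(u):
    return max(abs(v) for v in u)

n = 50
u0 = [5.0 * math.sin(3.0 * math.pi * (i + 1) / (n + 1)) for i in range(n)]
for t in (0.0, 0.5, 2.0, 10.0):
    print(t, sup_norm(evolve(u0, t)))
```

In this toy setting the sup-norm of the solution, initially close to 5, is driven into the ball $|u| \leq 1$ and stays there, a simple instance of the absorbing behavior that the notion of a dissipative system is meant to capture.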

We emphasize once more that the phase spaces $\Phi$ in
all these examples are appropriate infinite-dimensional 
function spaces. Nevertheless, it was observed in 
experiments that, up to a transient process, the 
trajectories of the DS considered are localized inside 
a "very thin" invariant subset of the phase space 
having a complicated geometric structure which, thus, 
accumulates all the nontrivial dynamics of the system. 
It was conjectured a little later that these invariant sets 
are, in some proper sense, finite dimensional and that 
the dynamics restricted to these sets can be effectively 
described by a finite number of parameters. Thus 
(when this conjecture is true), in spite of the infinite- 
dimensional initial phase space, the effective dynamics 
(reduced to this invariant set) is finite dimensional and 
can be studied by using the algorithms and concepts of 
the classical finite-dimensional DS theory. In particu- 
lar, this means that the infinite dimensionality plays 
here only the role of (possibly essential) technical 
difficulties, which cannot, however, produce any new 
dynamical phenomena which are not observed in the 
finite-dimensional theory. 

The above finite-dimensional reduction principle
of dissipative PDEs in bounded domains has been
given solid mathematical grounds (based on the
concept of the so-called global attractor) over the
last three decades, starting from the pioneering
papers of Ladyzhenskaya. This theory is considered
in more detail here.

The finite-dimensional reduction theory has some
limitations. Of course, the first and most obvious
restriction of this principle is the effective dimension
of the reduced finite-dimensional DS. Indeed, it is
known that, typically, this dimension grows at least
linearly with respect to the volume $\mathrm{vol}(\Omega)$ of the
spatial domain $\Omega$ of the DS considered (and the
growth of the size of $\Omega$ is the same (up to a
rescaling) as the decay of the viscosity coefficient $\nu$
or the diffusion matrix $a$, see eqns [6]-[8]). So, for
sufficiently large domains $\Omega$, the reduced DS can be
too large for reasonable investigations.

The next, less obvious, but much more essential,
restriction is the growing spatial complexity of the
DS. Indeed, as shown by Babin-Bunimovich, the
spatial complexity of the system (e.g., the number of
topologically different equilibria) grows exponentially
with respect to $\mathrm{vol}(\Omega)$. Thus, even in the case
of relatively small dimensions, the reduced system
can be beyond reasonable investigation, due to its
extremely complicated structure.

Therefore, the approach based on the finite-dimensional
reduction does not look so attractive
for large domains. It seems, instead, more natural, at
least from the physical point of view, to replace large
bounded domains by their limit unbounded ones
(e.g., $\Omega = \mathbb{R}^n$ or cylindrical domains). Of course, this
approach requires a systematic study of dissipative
DSs associated with PDEs in unbounded domains.

The dynamical study of PDEs in unbounded
domains started from the pioneering paper of
Kolmogorov-Petrovskij-Piskunov, in which the traveling
wave solutions of reaction-diffusion equations
in a strip were constructed and the
convergence of the trajectories (for specific initial
data) to these traveling wave solutions was established.
Starting from this, many results on the
dynamics of PDEs in unbounded domains have been
obtained. However, for a long period, the general
features of such dynamics remained completely
unclear. The main problems arising here are:


1. the essential infinite dimensionality of the DS 
considered (absence of any finite-dimensional 
reduction), which leads to essentially new 
dynamical effects that are not observed in finite- 
dimensional theories; 

2. the additional spatial “unbounded” directions 
lead to the so-called spatial chaos and the 
interaction between spatial and temporal chaotic 
modes generates the spatio-temporal chaos, 
which also has no analog in finite dimensions. 


Nevertheless, several ideas are mentioned in
the following which (from the authors' point of view)
were the most important for the development of
these topics. The first one is the pioneering paper of
Kirchgässner, in which dynamical methods were
applied to the study of the spatial structure of
solutions of elliptic equations in cylinders (which
can be considered as equilibria equations for
evolution PDEs in unbounded cylindrical domains).
The second is the Sinai-Bunimovich model of
spacetime chaos in discrete lattice DSs. Finally, the
third is the adaptation of the concept of a global
attractor to unbounded domains by Abergel and
Babin-Vishik.

We note, however, that the understanding
of the general features of the dynamics in
unbounded domains seems to have changed
in the last several years, due to the works of
Collet-Eckmann and Zelik. This is the reason why a
section of this review is devoted to a more detailed
discussion of this topic.

Other important questions are the object of 
current studies and we only briefly mention some 
of them. We mention, for instance, the study of
attractors for nonautonomous systems (i.e., sys- 
tems in which the time appears explicitly). This 
situation is much more delicate and is not 
completely understood; notions of attractors for 
such systems have been proposed by Chepyzhov- 
Vishik, Haraux and Kloeden-Schmalfuss. We also 
mention that theories of (global) attractors for 
non-well-posed problems have been proposed by 
Babin-Vishik, Ball, Chepyzhov-Vishik, Melnik- 
Valero, and Sell. 


Global Attractors and Finite-Dimensional 
Reduction 


Global Attractors: The Abstract Setting 


As already mentioned, one of the main concepts of
the modern theory of DSs in infinite dimensions is
that of the global attractor. We give below its
definition for an abstract semigroup $S(t)$ acting on a
metric space $\Phi$, although, without loss of generality,
the reader may think that $(S(t),\Phi)$ is just a DS
associated with one of the PDEs ([6]-[8]) described
in the introduction.

To this end, we first recall that a subset $K$ of the
phase space $\Phi$ is an attracting set of the semigroup
$S(t)$ if it attracts the images of all the bounded subsets
of $\Phi$, that is, for every bounded set $B$ and every
$\varepsilon > 0$, there exists a time $T$ (depending in
general on $B$ and $\varepsilon$) such that the image
$S(t)B$ belongs to the $\varepsilon$-neighborhood of $K$
if $t \ge T$. This property can be rewritten in the
equivalent form


$$\lim_{t\to\infty} \mathrm{dist}_H(S(t)B, K) = 0$$ [9]


where $\mathrm{dist}_H(X,Y) := \sup_{x\in X}\inf_{y\in Y} d(x,y)$
is the nonsymmetric Hausdorff distance between subsets
of $\Phi$.
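
The nonsymmetric Hausdorff distance is easy to misread, so a small numerical sketch may help. The snippet below evaluates sup-inf for finite point sets in the plane; the function name `dist_h` and the toy sets `K` and `B` are illustrative assumptions and do not come from the text. It shows that the quantity vanishes when the first set lies inside the second, but not conversely.

```python
import math

def dist_h(X, Y, d=None):
    """Nonsymmetric Hausdorff distance sup_{x in X} inf_{y in Y} d(x, y)
    for finite, non-empty point sets X and Y."""
    if d is None:
        d = math.dist  # Euclidean metric on coordinate tuples
    return max(min(d(x, y) for y in Y) for x in X)

# Toy sets in the plane (illustrative data only).
K = [(0.0, 0.0), (1.0, 0.0)]
B = [(0.0, 0.0), (1.0, 0.0), (1.0, 3.0)]

print(dist_h(K, B))  # K is contained in B, so every point of K is at distance 0 from B
print(dist_h(B, K))  # the point (1, 3) of B is at distance 3 from K
```

In particular, the attraction property only asks that the images of bounded sets get close to the attracting set, not that the attracting set be close to those images.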

The following definition of a global attractor is 
due to Babin-Vishik. 


Definition 1 A set $\mathcal{A} \subset \Phi$ is a global attractor for
the semigroup $S(t)$ if


(i) $\mathcal{A}$ is compact in $\Phi$;
(ii) $\mathcal{A}$ is strictly invariant: $S(t)\mathcal{A} = \mathcal{A}$, for all $t \ge 0$;
(iii) $\mathcal{A}$ is an attracting set for the semigroup $S(t)$.


Thus, the second and third properties guarantee
that a global attractor, if it exists, is unique and that
the DS reduced to the attractor contains all the
nontrivial dynamics of the initial system. Furthermore,
the first property indicates that the reduced
phase space $\mathcal{A}$ is indeed "thinner" than the initial
phase space $\Phi$ (we recall that, in infinite dimensions,
a compact set cannot contain, e.g., balls and should
thus be nowhere dense).

In most applications, one can use the following 
attractor’s existence theorem. 


Theorem 1 Let a DS $(S(t),\Phi)$ possess a compact
attracting set and the operators $S(t):\Phi \to \Phi$ be
continuous for every fixed $t$. Then, this system
possesses the global attractor $\mathcal{A}$ which is generated
by all the trajectories of $S(t)$ which are defined for
all $t \in \mathbb{R}$ and are globally bounded.


The strategy for applying this theorem to concrete
equations of mathematical physics is the following.
In a first step, one verifies a so-called dissipative
estimate, which usually has the form


$$\|S(t)u_0\|_{\Phi} \le Q(\|u_0\|_{\Phi})\,e^{-\alpha t} + C_*, \quad u_0 \in \Phi$$ [10]


where $\|\cdot\|_{\Phi}$ is a norm in the function space $\Phi$ and the
positive constants $\alpha$ and $C_*$ and the monotonic
function $Q$ are independent of $t$ and $u_0 \in \Phi$ (usually,
this estimate follows from energy estimates and is
sometimes even used in order to "define" a dissipative
system). This estimate obviously gives the
existence of an attracting set for $S(t)$ (e.g., the ball
of radius $2C_*$ in $\Phi$), which is, however, noncompact
in $\Phi$. In order to overcome this problem, one usually
derives, in a second step, a smoothing property for
the solutions, which can be formulated as follows:


$$\|S(1)u_0\|_{\Phi_1} \le Q_1(\|u_0\|_{\Phi}), \quad u_0 \in \Phi$$ [11]


where $\Phi_1$ is another function space which is
compactly embedded into $\Phi$. In applications, $\Phi$ is
usually the space $L^2(\Omega)$ of square integrable functions,
$\Phi_1$ is the Sobolev space $H^1(\Omega)$ of the functions
$u$ such that $u$ and $\nabla u$ belong to $L^2(\Omega)$, and estimate
[11] is a classical smoothing property for solutions
of parabolic equations (for hyperbolic equations, a
slightly more complicated asymptotic smoothing
property should be used instead of [11]).

Since the continuity of the operators $S(t)$ usually
poses no difficulty (if uniqueness is proven),
the above scheme indeed gives the existence of the
global attractor for most of the PDEs of mathematical
physics in bounded domains.


Dimension of the Global Attractor 


In this subsection, we start by discussing one of the 
basic questions of the theory: in which sense is 
the dynamics on the global attractor finite dimen- 
sional? As already mentioned, the global attractor 
is usually not a manifold, but has a rather 
complicated geometric structure. So, it is natural to 
use the definitions of dimensions adopted for the 
study of fractal sets here. We restrict ourselves to the
so-called fractal (or box-counting, entropy) dimension,
although other dimensions (e.g., Hausdorff,
Lyapunov, etc.) are also used in the theory of
attractors.

In order to define the fractal dimension, we first
recall the concept of Kolmogorov's $\varepsilon$-entropy, which
comes from information theory and plays a
fundamental role in the theory of DSs in unbounded
domains considered in the next section.


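Definition 2 Let $\mathcal{A}$ be a compact subset of a
metric space $\Phi$. For every $\varepsilon > 0$, we define
$N_\varepsilon(\mathcal{A})$ as the minimal number of
$\varepsilon$-balls which are necessary to cover $\mathcal{A}$.
Then, Kolmogorov's $\varepsilon$-entropy
$H_\varepsilon(\mathcal{A}) = H_\varepsilon(\mathcal{A},\Phi)$ of $\mathcal{A}$
is the base-2 logarithm of this number,
$H_\varepsilon(\mathcal{A}) := \log_2 N_\varepsilon(\mathcal{A})$. We recall that
$H_\varepsilon(\mathcal{A})$ is finite for every $\varepsilon > 0$, due to the
Hausdorff criterion. The fractal dimension
$d_f(\mathcal{A}) \in [0,\infty]$ of $\mathcal{A}$ is then defined by


$$d_f(\mathcal{A}) := \limsup_{\varepsilon \to 0} \frac{H_\varepsilon(\mathcal{A})}{\log_2 (1/\varepsilon)}$$ [12]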


We also recall that, although this dimension 
coincides with the usual dimension of the manifold 
for Lipschitz manifolds, it can be noninteger for 
more complicated sets. For instance, the fractal 
dimension of the standard ternary Cantor set in 
$[0,1]$ is $\ln 2/\ln 3$.
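
The Cantor set value can be checked numerically. The sketch below is an illustration only: it counts grid boxes of side $3^{-j}$ rather than minimal coverings by balls (an assumption that is harmless for the dimension but not identical to Definition 2), and the helper names are hypothetical. Points are stored as integers $k$ standing for $k/3^m$, so the arithmetic is exact.

```python
import math

def cantor_left_endpoints(m):
    """Left endpoints of the 2**m intervals of the level-m Cantor construction,
    stored as integers k representing the points k / 3**m."""
    points = [0]
    for level in range(m):
        step = 2 * 3 ** (m - level - 1)
        points = [p + s for p in points for s in (0, step)]
    return points

def box_count(points, m, j):
    """Number of grid boxes of side 3**(-j) that contain at least one point."""
    scale = 3 ** (m - j)
    return len({p // scale for p in points})

m = 10
pts = cantor_left_endpoints(m)
for j in (2, 4, 6, 8):
    n_boxes = box_count(pts, m, j)
    # ratio log N / log(1/eps) at scale eps = 3**(-j)
    estimate = math.log(n_boxes) / math.log(3 ** j)
    print(j, n_boxes, round(estimate, 4))

print(round(math.log(2) / math.log(3), 4))  # reference value ln 2 / ln 3
```

At each scale $3^{-j}$ with $j \le m$ the count is $2^j$, so the ratio already equals $\ln 2/\ln 3$ without passing to the limit; for less regular sets one would only see convergence as the scale shrinks.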

The so-called Mañé theorem (which can be
considered as a generalization of the classical Whitney
embedding theorem for fractal sets) plays an
important role in the finite-dimensional reduction
theory.

Theorem 2 Let $\Phi$ be a Banach space and $\mathcal{A}$ be a
compact set such that $d_f(\mathcal{A}) < N$ for some $N \in \mathbb{N}$.
Then, for "almost all" $(2N+1)$-dimensional planes
$L$ in $\Phi$, the corresponding projector $\Pi_L : \Phi \to L$
restricted to the set $\mathcal{A}$ is a Hölder continuous
homeomorphism.


Thus, if the finite fractal dimensionality of the
attractor is established, then, fixing a hyperplane $L$
satisfying the assumptions of the Mañé theorem
and projecting the attractor $\mathcal{A}$ and the DS $S(t)$
restricted to $\mathcal{A}$ onto this hyperplane
($\bar{\mathcal{A}} := \Pi_L \mathcal{A}$ and
$\bar{S}(t) := \Pi_L \circ S(t) \circ \Pi_L^{-1}$), we obtain, indeed, a
reduced DS $(\bar{S}(t), \bar{\mathcal{A}})$ which is defined on a
finite-dimensional set $\bar{\mathcal{A}} \subset L \sim \mathbb{R}^{2N+1}$.
Moreover, this DS will be Hölder continuous with respect
to the initial data.


Estimates on the Fractal Dimension 


Obviously, good estimates on the dimension of the 
attractors in terms of the physical parameters are 
crucial for the finite-dimensional reduction
described above, and (consequently) there exists a 
highly developed machinery for obtaining such 
estimates. The best-known upper estimates are 
usually obtained by the so-called volume contraction 
method, which is based on the study of the evolution 
of infinitesimal k-dimensional volumes in the neigh- 
borhood of the attractor (and, if the DS considered 
contracts the k-dimensional volumes, then the 
fractal dimension of the attractor is less than k). 
Lower bounds on the dimension are usually based 
on the observation that the global attractor always 
contains the unstable manifolds of the (hyperbolic) 
equilibria. Thus, the instability index of a properly 
constructed equilibrium gives a lower bound on the 
dimension of the attractor. 

In the following, several estimates for the classes
of equations given in the introduction are formulated,
beginning with the most-studied case of the
reaction-diffusion system [8]. For this system, sharp
upper and lower bounds are known, namely


$$C_1\,\mathrm{vol}(\Omega) \le d_f(\mathcal{A}) \le C_2\,\mathrm{vol}(\Omega)$$ [13]


where the constants $C_1$ and $C_2$ depend on $a$ and $f$
(and, possibly, on the shape of $\Omega$), but are independent
of its size. The same types of estimates also hold
for the hyperbolic equation [7]. Concerning the
Navier-Stokes system [6] in general two-dimensional
domains $\Omega$, the asymptotics of the fractal dimension
as $\nu \to 0$ is not known. The best-known upper bound
has the form $d_f(\mathcal{A}) \le C\nu^{-2}$ and was obtained by
Foias-Temam by using the so-called Lieb-Thirring
inequalities. Nevertheless, for periodic boundary
conditions, Constantin-Foias-Temam and Liu
obtained upper and lower bounds of the same order
(up to a logarithmic correction):


$$C_1\nu^{-4/3} \le d_f(\mathcal{A}) \le C_2\nu^{-4/3}\left(1 + \ln(\nu^{-1})\right)^{1/3}$$ [14]


Global Lyapunov Functions and the Structure 
of Global Attractors 


Although the global attractor usually has a very
complicated geometric structure, there exists one
exceptional class of DS for which the global attractor
has a relatively simple structure which is completely
understood, namely the DS having a global Lyapunov
function. We recall that a continuous function
$\mathcal{L}:\Phi \to \mathbb{R}$ is a global Lyapunov function if


1. $\mathcal{L}$ is nonincreasing along the trajectories, that is,
$\mathcal{L}(S(t)u_0) \le \mathcal{L}(u_0)$, for all $t \ge 0$;

2. $\mathcal{L}$ is strictly decreasing along all nonequilibrium
solutions, that is, $\mathcal{L}(S(t)u_0) = \mathcal{L}(u_0)$ for some $t > 0$
and $u_0$ implies that $u_0$ is an equilibrium of $S(t)$.


For instance, in the scalar case $N=1$, the
reaction-diffusion system [8] possesses the global
Lyapunov function
$\mathcal{L}(u_0) := \int_\Omega [a|\nabla_x u_0(x)|^2 + 2F(u_0(x))]\,dx$,
where $F(v) := \int_0^v f(u)\,du$. Indeed, multiplying
eqn [8] by $\partial_t u$ and integrating over $\Omega$, we have


$$\frac{d}{dt}\mathcal{L}(u(t)) = -2\|\partial_t u(t)\|^2_{L^2(\Omega)}$$ [15]


Analogously, in the scalar case $N=1$, multiplying
the hyperbolic equation [7] by $\partial_t u(t)$ and integrating
over $\Omega$, we obtain the standard global Lyapunov
function for this equation.
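
The energy decay along trajectories can be observed on a crude discretization. The following sketch is only an illustration under stated assumptions: it takes the scalar equation in the form $\partial_t u = a\,\partial_x^2 u - f(u)$ on $(0,1)$ with Dirichlet boundary conditions, picks the nonlinearity $f(u) = u^3 - u$, and uses the normalization $\int_\Omega [\tfrac{a}{2}|\partial_x u|^2 + F(u)]\,dx$ for the energy. The grid size, time step, initial datum and all function names are hypothetical choices, not taken from the text.

```python
import math

def f(u):
    return u ** 3 - u                 # illustrative nonlinearity (assumption)

def F(v):
    return v ** 4 / 4 - v ** 2 / 2    # primitive of f with F(0) = 0

def energy(u, a, h):
    """Discrete analogue of the integral of (a/2)|u_x|^2 + F(u) over (0, 1)."""
    padded = [0.0] + u + [0.0]        # Dirichlet boundary values
    grad = sum((padded[i + 1] - padded[i]) ** 2 for i in range(len(padded) - 1)) / h
    return 0.5 * a * grad + h * sum(F(v) for v in u)

def euler_step(u, a, h, dt):
    """One explicit Euler step for u_t = a u_xx - f(u) on the interior nodes."""
    padded = [0.0] + u + [0.0]
    return [
        u[i] + dt * (a * (padded[i] - 2 * padded[i + 1] + padded[i + 2]) / h ** 2 - f(u[i]))
        for i in range(len(u))
    ]

a, n, dt = 0.01, 49, 0.005            # diffusion, interior nodes, time step
h = 1.0 / (n + 1)
u = [0.8 * math.sin(3 * math.pi * (i + 1) * h) for i in range(n)]

energies = [energy(u, a, h)]
for _ in range(400):
    u = euler_step(u, a, h, dt)
    energies.append(energy(u, a, h))

print(energies[0], energies[-1])
# Monotonicity check, with a small tolerance for floating-point round-off.
print(all(e1 <= e0 + 1e-12 for e0, e1 in zip(energies, energies[1:])))
```

The time step is chosen small relative to $h^2/a$, which is what keeps the explicit scheme a descent method for the discrete energy; with a larger step the monotonicity seen here can fail even though it holds for the PDE.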

It is well known that, if a DS possesses a global
Lyapunov function, then, at least under the generic
assumption that the set $\mathcal{R}$ of equilibria is finite, every
trajectory $u(t)$ stabilizes to one of these equilibria as
$t \to +\infty$. Moreover, every complete bounded trajectory
$u(t)$, $t \in \mathbb{R}$, belonging to the attractor is a
heteroclinic orbit joining two equilibria. Thus, the
global attractor $\mathcal{A}$ can be described as follows:


$$\mathcal{A} = \bigcup_{u_0 \in \mathcal{R}} \mathcal{M}^+(u_0)$$ [16]


where $\mathcal{M}^+(u_0)$ is the so-called unstable set of the
equilibrium $u_0$ (which is generated by all heteroclinic
orbits of the DS which start from the given equilibrium
$u_0 \in \mathcal{A}$). It is also known that, if the equilibrium $u_0$ is
hyperbolic (generic assumption), then the set $\mathcal{M}^+(u_0)$
is a $\kappa$-dimensional submanifold of $\Phi$, where $\kappa$ is the
instability index of $u_0$. Thus, under the generic
hyperbolicity assumption on the equilibria, the
attractor $\mathcal{A}$ of a DS having a global Lyapunov function
is a finite union of smooth finite-dimensional
submanifolds of the phase space $\Phi$. These attractors are
called regular (following Babin-Vishik).

It is also worth emphasizing that, in contrast to
general global attractors, regular attractors are
robust under perturbations. Moreover, in some
cases, it is also possible to verify the so-called
transversality conditions (for the intersection of
stable and unstable manifolds of the equilibria)
and, thus, verify that the DS considered is a
Morse-Smale system. In particular, this means that
the dynamics restricted to the regular attractor $\mathcal{A}$ is
also preserved (up to homeomorphisms) under
perturbations.

A disadvantage of the approach of using a regular
attractor is the fact that, except for scalar parabolic
equations in one space dimension, it is usually
extremely difficult to verify the "generic" hyperbolicity
and transversality assumptions for concrete
values of the physical parameters, and the associated
hyperbolicity constants, as a rule, cannot be
expressed in terms of these parameters.


Inertial Manifolds 


It should be noted that the scheme for the
finite-dimensional reduction described above has essential
drawbacks. Indeed, the reduced system
$(\bar{S}(t),\bar{\mathcal{A}})$ is only Hölder continuous and,
consequently, cannot be realized as a DS generated by a
system of ODEs (and reasonable conditions on the
attractor $\mathcal{A}$ which guarantee the Lipschitz continuity
of the Mañé projections are not known). On the other
hand, the complicated geometric structure of the attractor
$\mathcal{A}$ (or $\bar{\mathcal{A}}$) makes the use of this finite-dimensional
reduction in computations hazardous (in fact, only
the heuristic information on the number of
unknowns which are necessary to capture all the
dynamical effects in approximations can be
extracted).

In order to overcome these problems, the concept
of an inertial manifold (which allows one to embed
the global attractor into a smooth manifold) has
been suggested by Foias-Sell-Temam. To be more
precise, a Lipschitz finite-dimensional manifold
$\mathcal{M} \subset \Phi$ is an inertial manifold for the DS
$(S(t),\Phi)$ if


1. $\mathcal{M}$ is semi-invariant, that is, $S(t)\mathcal{M} \subset \mathcal{M}$, for all
$t \ge 0$;

2. $\mathcal{M}$ satisfies the following asymptotic completeness
property: for every $u_0 \in \Phi$, there exists $v_0 \in \mathcal{M}$
such that


$$\|S(t)u_0 - S(t)v_0\|_{\Phi} \le Q(\|u_0\|_{\Phi})\,e^{-\alpha t}$$ [17]


where the positive constant $\alpha$ and the monotonic
function $Q$ are independent of $u_0$.

We can see that an inertial manifold, if it 
exists, confirms in a perfect way the heuristic 
conjecture on the finite dimensionality formulated 
in the introduction. Indeed, the dynamics of S(t) 
restricted to an inertial manifold can be, obviously, 
described by a system of ODEs (which is called the 
inertial form of the initial PDE). On the other hand,
the asymptotic completeness gives (in a very strong
form) the equivalence of the initial DS $(S(t),\Phi)$ with
its inertial form $(S(t),\mathcal{M})$. Moreover, in turbulence,
the existence of an inertial manifold would yield an 
exact interaction law between the small and large 
structures of the flow. 

Unfortunately, all the known constructions of 
inertial manifolds are based on a very restrictive 
condition, the so-called spectral gap condition, 
which requires arbitrarily large gaps in the spectrum 
of the linearization of the initial PDE and which can 
usually be verified only in one space dimension. So, 
the existence of an inertial manifold is still an 
open problem for many important equations of 
mathematical physics (including in particular the 
two-dimensional Navier-Stokes equations; some 
nonexistence results have also been proven by 
Mallet-Paret). 


Exponential Attractors 


We first recall that Definition 1 of a global
attractor only guarantees that the images $S(t)B$ of
all the bounded subsets converge to the attractor,
without saying anything on the rate of convergence
(in contrast to inertial manifolds, for which this
rate of convergence can be controlled). Furthermore,
as elementary examples show, this convergence
can be arbitrarily slow, so that, until now,
we have no effective way of estimating this rate of
convergence in terms of the physical parameters of
the system (an exception is given by the regular
attractors described earlier, for which the rate of
convergence can be estimated in terms of the
hyperbolicity constants of the equilibria; however,
even in this situation, it is usually very difficult to
estimate these constants for concrete equations).
Furthermore, there exist many physically relevant
systems (e.g., the so-called slightly dissipative
gradient systems) which have trivial global attractors,
but very rich and physically relevant transient
dynamics which are automatically forgotten under
the global-attractor approach. Another important
problem is the robustness of the global attractor
under perturbations. In fact, global attractors are
usually only upper semicontinuous under
perturbations (which means that they cannot
explode) and the lower semicontinuity (which
means that they also cannot implode) is much
more delicate to prove and requires some hyperbolicity
assumptions (which are usually impossible to
verify for concrete equations).

In order to overcome these difficulties, Eden- 
Foias-Nicolaenko-Temam have introduced an inter- 
mediate object (between inertial manifolds and 
global attractors), namely an exponential attractor 
(also called an inertial set). 


Definition 3 A compact set $\mathcal{M} \subset \Phi$ is an exponential
attractor for the DS $(S(t),\Phi)$ if


(i) $\mathcal{M}$ has finite fractal dimension: $d_f(\mathcal{M}) < \infty$;
(ii) $\mathcal{M}$ is semi-invariant: $S(t)\mathcal{M} \subset \mathcal{M}$, for all $t \ge 0$;
(iii) $\mathcal{M}$ attracts exponentially the images of all the
bounded sets $B \subset \Phi$:


$$\mathrm{dist}_H(S(t)B,\mathcal{M}) \le Q(\|B\|_{\Phi})\,e^{-\alpha t}$$ [18]


where the positive constant $\alpha$ and the monotonic
function $Q$ are independent of $B$.


Thus, on the one hand, an exponential attractor 
remains finite dimensional (like the global attractor) 
and, on the other hand, estimate [18] allows one to 
control the rate of attraction (like an inertial 
manifold). We note, however, that the relaxation 
of strict invariance to semi-invariance allows this 
object to be nonunique. So, we have here the 
problem of the “best choice” of the exponential
attractor. We also mention that an exponential 
attractor, if it exists, always contains the global 
attractor. 

Although the initial construction of exponential
attractors is based on the so-called squeezing
property (and requires Zorn’s lemma), we formulate
below a simpler construction, due to Efendiev-
Miranville-Zelik, which is similar to the method
proposed by Ladyzhenskaya to verify the finite
dimensionality of global attractors. This is done for
discrete times and for a DS generated by iterations
of some map $S:\Phi \to \Phi$, since the passage from
discrete to continuous times usually poses no
difficulty (without loss of generality, the reader
may think that $S = S(1)$ and $(S(t),\Phi)$ is one of the DS
mentioned in the introduction).


Theorem 3 Let the phase space $\Phi_0$ be a closed
bounded subset of some Banach space $H$ and let $H_1$
be another Banach space compactly embedded into
$H$. Assume also that the map $S:\Phi_0 \to \Phi_0$ satisfies
the following "smoothing" property:


$$\|Su_1 - Su_2\|_{H_1} \le K\|u_1 - u_2\|_{H}, \quad u_1,u_2 \in \Phi_0$$ [19]


for some constant $K$ independent of $u_i$. Then, the DS
$(S,\Phi_0)$ possesses an exponential attractor.


In applications, $\Phi_0$ is usually a bounded
absorbing/attracting set whose existence is guaranteed by
the dissipative estimate [10], $H := L^2(\Omega)$ and
$H_1 := H^1(\Omega)$. Furthermore, estimate [19] simply
follows from the classical parabolic smoothing
property, but now applied to the equation of
variations (as in [11], hyperbolic equations require
a slightly more complicated analogue of [19]). These
simple arguments show that exponential attractors
are as general as global attractors and, to the
best of our knowledge, exponential attractors exist
indeed for all the equations of mathematical physics
for which the finite dimensionality of the global
attractor can be established. Moreover, since
$\mathcal{A} \subset \mathcal{M}$, this scheme can also be used to prove
the finite dimensionality of global attractors.

It is finally worth emphasizing that the control on 
the rate of convergence provided by [18] makes 
exponential attractors much more robust than global 
attractors. In particular, they are upper and lower 
semicontinuous under perturbations (of course, up to 
the “best choice,” since they are not unique), as 
shown by Efendiev-Miranville-Zelik. 


Essentially Infinite-Dimensional 
Dynamical Systems - The Case of 
Unbounded Domains 


As already mentioned in the introduction, the theory
of dissipative DS in unbounded domains is developing
only now and the results given here are not as
complete as for bounded domains. Nevertheless, we
indicate below several of the most interesting (from
our point of view) results concerning the general
description of the dynamics generated by such
problems by considering a system of reaction-
diffusion equations [8] in $\mathbb{R}^n$ with phase space
$\Phi = L^\infty(\mathbb{R}^n)$ as a model example (although all the
results formulated below are general and depend
weakly on the choice of equation).


Generalization of the Global Attractor and 
Kolmogorov's ε-Entropy


We first note that Definition 1 of a global attractor
is too strong for equations in unbounded domains.
Indeed, as seen earlier, the compactness of the
attractor is usually based on the compactness of
the embedding $H^1(\Omega) \subset L^2(\Omega)$, which does not hold
in unbounded domains. Furthermore, an attractor,
in the sense of Definition 1, does not exist for most
of the interesting examples of eqns [8] in $\mathbb{R}^n$.

It is natural to use instead the concept of the
so-called locally compact global attractor which is
well adapted to unbounded domains. This attractor
$\mathcal{A}$ is only bounded in the phase space
$\Phi = L^\infty(\mathbb{R}^n)$, but its restrictions $\mathcal{A}|_\Omega$ to all bounded
domains $\Omega$ are compact in $L^\infty(\Omega)$. Moreover, the
attraction property should also be understood in
the sense of a local topology in $L^\infty(\mathbb{R}^n)$. It is
known that this generalized global attractor $\mathcal{A}$
exists indeed for problem [8] in $\mathbb{R}^n$ (of course,
under some "natural" assumptions on the nonlinearity
$f$ and the diffusion matrix $a$). As for
bounded domains, its existence is based on the
dissipative estimate [10], the smoothing property
[11], and the compactness of the embedding
$H^1_{\mathrm{loc}}(\mathbb{R}^n) \subset L^2_{\mathrm{loc}}(\mathbb{R}^n)$ (we need to use the local
topology only to have this compactness).

The next natural question that arises here is how
to control the "size" of the attractor $\mathcal{A}$ if its fractal
dimension is infinite (which is usually the case in
unbounded domains). One of the most natural
ways to handle this problem (which was first
suggested by Chepyzhov-Vishik in the different
context of uniform attractors associated with
nonautonomous equations in bounded domains
and appears to be extremely fruitful for the theory of
dissipative PDEs in unbounded domains) is to study
the asymptotics of Kolmogorov's $\varepsilon$-entropy of the
attractor. Actually, since the attractor $\mathcal{A}$ is compact
only in a local topology, it is natural to study the
entropy of its restrictions, say, to balls $B^R_{x_0}$ of $\mathbb{R}^n$ of
radius $R$ centered at $x_0$ with respect to the three
parameters $R$, $x_0$, and $\varepsilon$. A more or less complete
answer to this question is given by the following
estimate:


$$H_\varepsilon(\mathcal{A}|_{B^R_{x_0}}) \le C\,(R + \log_2 1/\varepsilon)^n \log_2 1/\varepsilon$$ [20]


where the constant $C$ is independent of $\varepsilon \le 1$, $R$,
and $x_0$. Moreover, it can be shown that this estimate
is sharp for all $R$ and $\varepsilon$ under the very weak
additional assumption that eqn [8] possesses at least
one exponentially unstable spatially homogeneous
equilibrium.

Thus, formula [20] (whose proof is also based on
a smoothing property for the equation of variations)
can be interpreted as a natural generalization of the
heuristic principle of finite dimensionality of global
attractors to unbounded domains. It is also worth
recalling that the entropy of the embedding of a ball
$B_k$ of the space $C^k(B^R_{x_0})$ into $C(B^R_{x_0})$ has the
asymptotics $H_\varepsilon(B_k) \sim C_R\,(1/\varepsilon)^{n/k}$, which is essentially
worse than [20]. So, [20] is not based on the
smoothness of the attractor $\mathcal{A}$ and, therefore,
reflects deeper properties of the equation.
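
To get a feeling for how different the two growth rates are, one can simply tabulate them. The sketch below is a rough illustration: it compares a bound of the form $(R + \log_2 1/\varepsilon)^n \log_2 1/\varepsilon$ with the power law $(1/\varepsilon)^{n/k}$, with both multiplicative constants set to 1 (an assumption, since the true constants are not specified here); the function names and the sample values of $n$, $k$ and $R$ are hypothetical.

```python
import math

def attractor_entropy_bound(eps, R, n, C=1.0):
    """Right-hand side of a bound of the form C (R + log2(1/eps))^n log2(1/eps)."""
    log_inv = math.log2(1.0 / eps)
    return C * (R + log_inv) ** n * log_inv

def smooth_ball_entropy(eps, n, k, C_R=1.0):
    """Power-law growth C_R (1/eps)^(n/k) for a ball of C^k functions."""
    return C_R * (1.0 / eps) ** (n / k)

n, k, R = 1, 2, 1.0
for p in (4, 10, 20, 40):
    eps = 2.0 ** -p
    print(p, attractor_entropy_bound(eps, R, n), smooth_ball_entropy(eps, n, k))
```

For moderate $\varepsilon$ the power law may be the smaller of the two, but as $\varepsilon \to 0$ it overtakes any polylogarithmic bound, whatever the (fixed) constants and the smoothness index $k$ are.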


Spatial Dynamics and Spatial Chaos 


The next main difference with bounded domains is
the existence of unbounded spatial directions which
can generate the so-called spatial chaos (in addition
to the "usual" temporal chaos arising under the
evolution). In order to describe this phenomenon, it
is natural to consider the group $\{T_h, h \in \mathbb{R}^n\}$ of
spatial translations acting on the attractor $\mathcal{A}$:


$$(T_h u_0)(x) := u_0(x+h), \quad T_h : \mathcal{A} \to \mathcal{A}$$ [21]


as a DS (with multidimensional "times" if $n > 1$)
acting on the phase space $\mathcal{A}$ and to study its
dynamical properties.

In particular, it is worth noting that the lower
bounds on the $\varepsilon$-entropy that one can derive imply
that the topological entropy of this spatial DS is
infinite and, consequently, the classical symbolic
dynamics with a finite number of symbols is not
adequate to clarify the nature of chaos in [21].
In order to overcome this difficulty, it was suggested
by Zelik to use Bernoulli shifts with an infinite
number of symbols, belonging to the whole interval
$\omega \in [0,1]$. To be more precise, let us consider the
Cartesian product $\mathcal{M}_n := [0,1]^{\mathbb{Z}^n}$ endowed with the
Tikhonov topology. Then, this set can be interpreted
as the space of all the functions $v : \mathbb{Z}^n \to [0,1]$,
endowed with the standard local topology. We define
a DS $\{\mathcal{T}_l, l \in \mathbb{Z}^n\}$ on $\mathcal{M}_n$ by


$$(\mathcal{T}_l v)(m) := v(m+l), \quad v \in \mathcal{M}_n, \ l,m \in \mathbb{Z}^n$$ [22]
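
The shift [22] is simple enough to write down directly. In the sketch below, an element of $[0,1]^{\mathbb{Z}^n}$ is modeled as a Python function on integer tuples; the helper `shift` and the particular sample function `v` are illustrative assumptions. The last lines check the group property (shifting by $k$ and then by $l$ is the same as shifting by $k + l$) at a few lattice points.

```python
def shift(l):
    """Return the operator T_l acting on functions v: Z^n -> [0, 1],
    defined by (T_l v)(m) = v(m + l)."""
    def T_l(v):
        return lambda m: v(tuple(mi + li for mi, li in zip(m, l)))
    return T_l

def v(m):
    # An arbitrary element of [0,1]^(Z^2), chosen only for illustration.
    return ((3 * m[0] + 5 * m[1]) % 7) / 6.0

l, k = (1, 2), (3, -1)
composed = shift(l)(shift(k)(v))   # T_l applied to T_k v
direct = shift((4, 1))(v)          # T_{l+k} v, since l + k = (4, 1)

for m in [(0, 0), (2, -3), (-5, 7)]:
    print(m, composed(m), direct(m))
```

Only finitely many values can ever be inspected this way, which matches the local (Tikhonov) topology: two configurations are close when they nearly agree on a large finite box of lattice sites.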


Based on this model, the following description of 
spatial chaos was obtained. 


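Theorem 4 Let eqn [8] in $\Omega = \mathbb{R}^n$ possess at least
one exponentially unstable spatially homogeneous
equilibrium. Then, there exist $\alpha > 0$ and a
homeomorphic embedding $\tau : \mathcal{M}_n \to \mathcal{A}$ such that


$$T_{\alpha l} \circ \tau(v) = \tau \circ \mathcal{T}_l(v), \quad \forall l \in \mathbb{Z}^n, \ v \in \mathcal{M}_n$$ [23]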


Thus, the spatial dynamics, restricted to the set
$\tau(\mathcal{M}_n)$, is conjugated to the symbolic dynamics on
$\mathcal{M}_n$. Moreover, there exists a dynamical invariant
(the so-called mean topological dimension) which is
always finite for the spatial DS [21] and strictly
positive for the Bernoulli scheme $\mathcal{M}_n$. So, the
embedding [23] clarifies, indeed, the nature of
chaos arising in the spatial DS [21].


Spatio-Temporal Chaos 


To conclude, we briefly discuss an extension of
Theorem 4, which takes into account the temporal
modes and, thus, gives a description of the
spatio-temporal chaos. In order to do so, we first note
that the spatial DS [21] obviously commutes
with the temporal evolution operators $S(t)$ and,
consequently, an extended $(n+1)$-parametric semigroup
$\{S(t,h), (t,h) \in \mathbb{R}_+ \times \mathbb{R}^n\}$ acts on the attractor:


$$S(t,h) := S(t) \circ T_h, \quad S(t,h) : \mathcal{A} \to \mathcal{A}, \quad t \in \mathbb{R}_+, \ h \in \mathbb{R}^n$$ [24]


Then, this semigroup (interpreted as a DS with
multidimensional times) is responsible for all the
spatio-temporal dynamical phenomena in the initial
PDE [8] and, consequently, the question of finding
adequate dynamical characteristics is of great
interest. Moreover, it is also natural to consider the
subsemigroups $S_{V_k}(t,h)$ associated with the $k$-dimensional
planes $V_k$ of the spacetime $\mathbb{R}_+ \times \mathbb{R}^n$, $k < n+1$.

Although finding an adequate description of the
dynamics of [24] seems to be an extremely difficult
task, some particular results in this direction have
already been obtained. Thus, it has been proved by
Zelik that the semigroup [24] has finite topological
entropy and the entropy of its subsemigroups
$S_{V_k}(t,h)$ is usually infinite if $k < n+1$. Moreover
(adding a natural transport term of the form
$(L,\nabla_x)u$ to eqn [8]), it was proved that the analog
of Theorem 4 holds for the subsemigroups $S_{V_n}(t,h)$
associated with the $n$-dimensional hyperplanes $V_n$ of
the spacetime. Thus, the infinite-dimensional Bernoulli
shifts introduced in the previous subsection
can be used to describe the temporal evolution in
unbounded domains as well.

In particular, as a consequence of this embedding,
the topological entropy of the initial purely temporal
evolution semigroup $S(t)$ is also infinite, which
indicates that (even without considering the spatial
directions) we have indeed here essentially new levels
of dynamical complexity which are not observed in
the classical DS theory of ODEs.


See also: Dynamical Systems in Mathematical Physics: 
An Illustration from Water Waves; Ergodic Theory; 
Evolution Equations: Linear and Nonlinear; Fractal 
Dimensions in Dynamics; Inviscid flows; Lyapunov 
Exponents and Strange Attractors.


Introduction


Since they were introduced by Witten in 1988,
topological quantum field theories (TQFTs) have
had a tremendous impact in mathematical physics
(see Birmingham et al. (1991) and Cordes et al. for a
review). These quantum field theories are
constructed in such a way that the correlation
functions of certain operators provide topological
invariants of the spacetime manifold where the
theory is defined. This means that one can use the
methods and insights of quantum field theory in
order to obtain information about topological
invariants of low-dimensional manifolds.

Historically, the first TQFT was Donaldson-Witten
theory, also called topological Yang-Mills theory.
This theory was constructed by Witten (1988) starting
from $\mathcal{N} = 2$ super Yang-Mills by a procedure called
"topological twisting." The resulting model is topological
and the famous Donaldson invariants of
4-manifolds are then recovered as certain correlation
functions in the topological theory. The analysis of
Witten (1988) did not indicate any new method to
compute the invariants, but in 1994 the progress in
understanding the nonperturbative dynamics of $\mathcal{N} = 2$
theories (Seiberg and Witten 1994a, b) led to an
alternative way of computing correlation functions in
Donaldson-Witten theory. As Witten (1994) showed,
Donaldson-Witten theory can be reduced to
another, simpler topological theory consisting of
a twisted abelian gauge theory coupled to spinor
fields. This theory leads to a different set of
4-manifold invariants, the so-called "Seiberg-Witten
invariants," and Donaldson invariants can
be expressed in terms of these invariants through
Witten's "magic formula." The connection
between Seiberg-Witten and Donaldson invariants
was streamlined and extended by Moore
and Witten by using the so-called $u$-plane integral
(Moore and Witten 1998). This has led to a rather
complete understanding of Donaldson-Witten theory
from a physical point of view.

In this article we provide a brief review of
Donaldson-Witten theory. First, we describe the
construction of the model, from both a mathematical
and a physical point of view, and state the main
results for the Donaldson-Witten generating functional.
In the next section, we present the basic results
of the $u$-plane integral of Moore and Witten and
sketch how it can be used to solve Donaldson-Witten
theory. In the final section, we mention some
generalizations of the basic framework. For a
complete exposition of Donaldson-Witten theory,
the reader is referred to the book by Labastida
and Mariño (2005). A short review of the $u$-plane
integral can be found in Mariño and Moore (1998a).


Donaldson-Witten Theory: Basic 
Construction and Results 


Donaldson-Witten Theory According to Donaldson 


Donaldson theory as formulated in Donaldson (1990),
Donaldson and Kronheimer (1990), and Friedman and
Morgan (1991) starts with a principal $G = SO(3)$
bundle $V \to X$ over a compact, oriented, Riemannian
4-manifold $X$, with fixed instanton number $k$
and Stiefel-Whitney class $w_2(V)$ ($SO(3)$ bundles on a
4-manifold are classified up to isomorphism by these
topological data). The moduli space of anti-self-dual
(ASD) connections is then defined as


$$\mathcal{M}_{\mathrm{ASD}} = \{A : F^+(A) = 0\}/\mathcal{G}$$ [1]


where $F^+(A)$ is the self-dual part of the curvature,
and $\mathcal{G}$ is the group of gauge transformations. To
construct the Donaldson polynomials, one considers
the universal bundle


$$\mathbf{P} = (V \times \mathcal{A}^*)/(\mathcal{G} \times G)$$ [2]


where $\mathcal{A}^*$ is the space of irreducible $G$-connections
on $V$. This is a $G$-bundle over $\mathcal{B}^* \times X$, where
$\mathcal{B}^* = \mathcal{A}^*/\mathcal{G}$ is the space of irreducible connections
modulo gauge transformations, and as such has a
Pontrjagin class


$$p_1(\mathbf{P}) \in H^*(\mathcal{B}^*) \otimes H^*(X)$$ [3]


One can then obtain differential forms on $\mathcal{B}^*$ by
taking the slant product of $p_1(\mathbf{P})$ with homology
classes in $X$. In this way we obtain the Donaldson
map:


$$\mu : H_i(X) \to H^{4-i}(\mathcal{B}^*)$$ [4]


After restriction to $\mathcal{M}_{\mathrm{ASD}}$, we obtain the following
differential forms on the moduli space of ASD
connections:


$$x \in H_0(X) \to \mathcal{O}(x) \in H^4(\mathcal{M}_{\mathrm{ASD}}), \qquad S \in H_2(X) \to I_2(S) \in H^2(\mathcal{M}_{\mathrm{ASD}})$$ [5]


If the manifold $X$ has $b_1(X) \ne 0$, there are also
cohomology classes associated to 1-cycles and
3-cycles, but we will not consider them here.
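We can now formally define the Donaldson
invariants as follows. Consider the space


$$\mathbf{A}(X) = \mathrm{Sym}(H_0(X) \oplus H_2(X))$$   [6]


with a typical element written as $x^\ell S_{i_1} \cdots S_{i_p}$. The
Donaldson invariant corresponding to this element
of $\mathbf{A}(X)$ is the following intersection number:


$$\mathcal{D}_X^{w_2(V),k}(x^\ell S_{i_1} \cdots S_{i_p}) = \int_{\mathcal{M}_{\mathrm{ASD}}(w_2(V),k)} O^\ell \wedge I_2(S_{i_1}) \wedge \cdots \wedge I_2(S_{i_p})$$   [7]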


where $\mathcal{M}_{\mathrm{ASD}}(w_2(V),k)$ is the moduli space of ASD
connections with second Stiefel-Whitney class $w_2(V)$ and
instanton number $k$. The integral in [7] will be
different from zero only if the degrees of the forms
add up to $\dim(\mathcal{M}_{\mathrm{ASD}}(w_2(V),k))$.
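
The degree condition on [7] can be checked mechanically. The sketch below is only illustrative: it assumes the standard index formula $\dim \mathcal{M}_{\mathrm{ASD}} = 8k - \tfrac{3}{2}(\chi + \sigma)$ for SO(3) bundles, which is not stated in the text above, and the function names are hypothetical. It uses the fact that $O$ is a 4-form and each $I_2(S)$ is a 2-form on the moduli space.

```python
from fractions import Fraction


def asd_dimension(k, chi, sigma):
    """Virtual dimension 8k - (3/2)(chi + sigma) of the SO(3) ASD moduli space.

    Assumption: standard index formula, with chi the Euler characteristic and
    sigma the signature of X. k may be fractional when w_2(V) is nonzero,
    so it is converted to an exact Fraction.
    """
    return 8 * Fraction(k) - Fraction(3, 2) * (chi + sigma)


def can_be_nonzero(k, chi, sigma, ell, p):
    """Degree check for [7]: O has degree 4 and each I_2(S) has degree 2."""
    return 4 * ell + 2 * p == asd_dimension(k, chi, sigma)


# K3 surface: chi = 24, sigma = -16, so the dimension is 8k - 12.
print(asd_dimension(2, 24, -16))         # dimension for k = 2
print(can_be_nonzero(2, 24, -16, 1, 0))  # one point class, no surfaces
print(can_be_nonzero(2, 24, -16, 0, 1))  # a single surface class
```

For the K3 example with $k = 2$ the dimension is 4, so only the elements $x$ and $S_{i_1} S_{i_2}$ of $\mathbf{A}(X)$ can have a nonzero invariant at this instanton number.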

It is very convenient to pack all Donaldson
invariants in a generating functional. Let
$\{S_i\}_{i=1,\ldots,b_2}$ be a basis of 2-cycles. We introduce the
formal sum


$$S = \sum_{i=1}^{b_2} v_i S_i$$   [8]

where $v_i$ are complex numbers. We then define the
Donaldson-Witten generating functional as


$$Z_{DW}^{w_2(V)}(p, v_i) = \sum_{k=0}^{\infty} \mathcal{D}_X^{w_2(V),k}(e^{px + S})$$   [9]


where on the right-hand side we are summing over
all instanton numbers, that is, we are summing over
all topological configurations of the SO(3) gauge
field with a fixed $w_2(V)$. This gives a formal power
series in $p$ and $v_i$.

For $b_2^+(X) > 1$, the generating functional [9]
is a diffeomorphism invariant of $X$; therefore, it
is potentially a powerful tool in four-dimensional
topology. When $b_2^+(X) = 1$, Donaldson invariants
are metric dependent. The metric dependence
can be described in more detail as follows. Define
the period point as the harmonic 2-form $\omega$
satisfying


$$*\omega = \omega, \qquad \omega^2 = 1$$   [10]


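which depends on the conformal class of the metric.
As the conformal class of the metric varies, $\omega$
describes a curve in the cone


$$V_+ = \{\omega \in H^2(X, \mathbb{R}) : \omega^2 > 0\}$$   [11]

Let $\zeta \in H^2(X)$ satisfy


$$\zeta \equiv w_2(V) \bmod 2, \qquad \zeta^2 < 0, \qquad (\zeta, \omega) = 0$$   [12]


Such an element $\zeta$ defines a “wall” in $V_+$:


$$W_\zeta = \{\omega : (\zeta, \omega) = 0\}$$   [13]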


The complements of these walls are called “chambers,”
and the cone $V_+$ is then divided in chambers
separated by walls. A class $\zeta$ satisfying [12] is the
first Chern class associated to a reducible solution of
the ASD equations, and it causes a singularity in
moduli space: the Donaldson invariants jump when
we pass through such a wall. Therefore, when
$b_2^+(X) = 1$, Donaldson invariants are metric independent
in each chamber. A basic problem in
Donaldson-Witten theory is to determine the jump
in the generating function as we cross a wall,


$$Z_+^{\zeta}(p, S) - Z_-^{\zeta}(p, S) = WC_\zeta(p, S)$$   [14]


The jump term $WC_\zeta(p, S)$ is usually called the
“wall-crossing” term.

The basic goal of Donaldson theory is to study
the properties of the generating functional [9] and
to compute it for different 4-manifolds $X$. On
the mathematical side, many results have been
obtained on $Z_{DW}$, and some of them can be found
in Donaldson and Kronheimer (1990), Friedman
and Morgan (1991), Stern (1998), and Göttsche
(1996). On the other hand, Donaldson theory can be
formulated as a topological field theory, and many
of these results can be obtained by using quantum
field theory techniques. This will be our main focus
for the rest of the article.
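
The wall and chamber structure defined in [12] and [13] can be made concrete on a small lattice. The following is a minimal sketch, not taken from the references: the function names are hypothetical, the intersection form is assumed to be given as an integer matrix in a chosen basis, and the example uses the hyperbolic form of $S^2 \times S^2$ (for which $b_2^+ = 1$) with $w_2(V) = 0$.

```python
def pairing(Q, a, b):
    """Intersection pairing (a, b) for the symmetric integer matrix Q."""
    n = len(Q)
    return sum(a[i] * Q[i][j] * b[j] for i in range(n) for j in range(n))


def defines_wall(Q, zeta, w2_lift):
    """Conditions in [12] that do not involve the period point:
    zeta = w_2(V) mod 2 (checked on an integer lift) and zeta^2 < 0."""
    congruent = all((z - w) % 2 == 0 for z, w in zip(zeta, w2_lift))
    return congruent and pairing(Q, zeta, zeta) < 0


def crosses_wall(Q, zeta, omega_a, omega_b):
    """True if (zeta, omega) changes sign between the two period points."""
    return pairing(Q, zeta, omega_a) * pairing(Q, zeta, omega_b) < 0


def period_point(t):
    """Illustrative family of period points with omega^2 = 1 (t > 0)."""
    return (t, 1.0 / (2.0 * t))


Q = [[0, 1], [1, 0]]   # hyperbolic form, assumed basis of H^2(S^2 x S^2)
zeta = (2, -2)         # zeta^2 = -8, even, so it is allowed for w_2(V) = 0

print(defines_wall(Q, zeta, (0, 0)))
print(crosses_wall(Q, zeta, period_point(0.5), period_point(1.0)))
print(crosses_wall(Q, zeta, period_point(0.5), period_point(0.6)))
```

In this example $(\zeta, \omega) = 1/t - 2t$, so the wall sits at $t = 1/\sqrt{2}$: the path from $t = 0.5$ to $t = 1$ crosses it and picks up the term in [14], while the path from $t = 0.5$ to $t = 0.6$ stays in one chamber.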


Donaldson-Witten Theory According to Witten 


Witten (1988) constructed a twisted version of
$\mathcal{N} = 2$ super Yang-Mills theory which has a nilpotent
Becchi-Rouet-Stora-Tyutin (BRST) charge
(modulo gauge transformations)


$$\bar{\mathcal{Q}} = \epsilon^{\dot\alpha A} \bar{Q}_{\dot\alpha A}$$   [15]


where $\bar{Q}_{\dot\alpha A}$ are the supersymmetric (SUSY) charges.
Here $\dot\alpha$ is a chiral spinor index and $A$ has its origin in
the SU(2) R-symmetry. The field content of the
theory is the standard twisted $\mathcal{N} = 2$ vector multiplet:


$$A, \quad \psi_\mu = \psi_{\alpha\dot\alpha}, \quad \phi; \qquad D^+_{\mu\nu}, \quad \chi^+_{\mu\nu} = \bar\psi_{(\dot\alpha\dot\beta)}; \qquad \bar\phi, \quad \eta = \bar\psi^{\dot\alpha}{}_{\dot\alpha}$$   [16]


where $(1/2) D^+_{\mu\nu}\, dx^\mu dx^\nu$ is a self-dual 2-form derived
from the auxiliary fields, etc. All fields are valued in
the adjoint representation of the gauge group. After
twisting, the theory is well defined on any Riemannian
4-manifold, since the fields are naturally
interpreted as differential forms and the $\bar{\mathcal{Q}}$ charge
is a scalar (Witten 1988).

The observables of the theory are $\bar{\mathcal{Q}}$ cohomology
classes of operators, and they can be constructed
from 0-form observables $O^{(0)}$ using the descent
procedure. This amounts to solving the equations


$$dO^{(i)} = \{\bar{\mathcal{Q}}, O^{(i+1)}\}, \qquad i = 0, \ldots, 3$$   [17]


The integration over $i$-cycles $\gamma^{(i)}$ in $X$ of the
operators $O^{(i)}$ is then an observable. These descent
equations have a canonical solution: the 1-form-valued
operator $K_{\alpha\dot\alpha} = -i\,\delta^{A}_{\alpha} Q_{\dot\alpha A}/4$ verifies


$$d = \{\bar{\mathcal{Q}}, K\}$$   [18]


as a consequence of the supersymmetry algebra. The
operators $O^{(i)} = K^i O$ solve the descent equations
[17] and are canonical representatives. When the
gauge group is SU(2), the observables are obtained
by the descent procedure from the operator


$$O = \mathrm{tr}(\phi^2)$$   [19]

The topological descendant $O^{(2)}$ is given by

$$O^{(2)} = -\frac{1}{2}\,\mathrm{tr}\left(\frac{1}{\sqrt{2}}\,\phi\,(F^-_{\mu\nu} + D^+_{\mu\nu}) - \frac{1}{4}\,\psi_\mu \psi_\nu\right) dx^\mu \wedge dx^\nu$$   [20]


and the resulting observable is


$$I_2(S) = \int_S O^{(2)}$$   [21]


$O$ and $I_2(S)$ correspond to the cohomology classes in
[5]. One of the main results of Witten (1988) is that
the semiclassical approximation in the twisted
$\mathcal{N} = 2$ Yang-Mills theory is exact. The semiclassical
evaluation of correlation functions of the observables
above leads directly to the definition of
Donaldson invariants, and the generating functional
[9] can be written as a correlation function of the
twisted theory. One then has


$$Z_{DW}^{w_2(V)}(p, v_i) = \langle \exp(pO + I_2(S)) \rangle$$   [22]


Results for the Donaldson-Witten 
Generating Function 


The basic results that have emerged from the 
physical approach to Donaldson—Witten theory are 
the following. 


1. The Donaldson-Witten generating functional
is in general the sum of two terms,


$$Z_{DW} = Z_u + Z_{SW}$$   [23]


(We have omitted the Stiefel-Whitney class for
convenience.) The first term, $Z_u$, is called the
“u-plane integral.” It is given by a complicated
integral over $\mathbb{C}$ which can be written, in turn, as an
integral over a fundamental domain of the congruence
subgroup $\Gamma^0(4)$ of $SL(2, \mathbb{Z})$. $Z_u$ depends
only on the cohomology ring of $X$, and therefore
does not contain any information beyond that
provided by classical topology. Finally, $Z_u$ vanishes
if $b_2^+(X) > 1$, and it is responsible for the wall-crossing
behavior of $Z_{DW}$ when $b_2^+(X) = 1$.

2. The second term of [23], $Z_{SW}$, is called the
Seiberg-Witten contribution. This contribution
involves the Seiberg-Witten invariants of $X$, which
are obtained by considering the moduli problem
defined by the Seiberg-Witten monopole equations
(Witten 1994b):


$$F^+_{\dot\alpha\dot\beta} + 4i\,\bar{M}_{(\dot\alpha} M_{\dot\beta)} = 0, \qquad D_L^{\alpha\dot\alpha} M_{\dot\alpha} = 0$$   [24]

In these equations, $M_{\dot\alpha}$ is a section of the spinor
bundle $S^+ \otimes L^{1/2}$, $L$ is the determinant line bundle
of a Spin$^c$ structure on $X$,
$F^+_{\dot\alpha\dot\beta} = \sigma^{\mu\nu}_{\dot\alpha\dot\beta} F^+_{\mu\nu}$ is the
self-dual part of the curvature of a U(1) connection
on $L$, and $D_L$ is the Dirac operator for the bundle
$S^+ \otimes L^{1/2}$. The solutions of these equations modulo
gauge equivalence form the moduli space $\mathcal{M}_{SW}$,
and the Seiberg-Witten invariants are defined by
integrating suitable differential forms on this moduli
space. We will label Spin$^c$ structures by the class
$\lambda = c_1(L^{1/2}) \in H^2(X, \mathbb{Z}) + w_2(X)/2$. We say that $\lambda$ is
a Seiberg-Witten basic class if the corresponding
Seiberg-Witten invariants are not all zero. If $\mathcal{M}_{SW}$
is zero dimensional, the Seiberg-Witten invariant
depends only on the Spin$^c$ structure associated to
$\lambda = c_1(L^{1/2})$, and is denoted by $SW(\lambda)$.

3. A manifold $X$ is said to be of Seiberg-Witten
simple type if all the Seiberg-Witten basic
classes have a zero-dimensional moduli space. For
simply connected 4-manifolds of Seiberg-Witten
simple type and with $b_2^+(X) > 1$, Witten determined
the Seiberg-Witten contribution and proposed the
following “magic formula” for $Z_{DW}$ (Witten 1994b):


$$Z_{DW}^{w_2(V)} = 2^{1 + \frac{7\chi}{4} + \frac{11\sigma}{4}} \sum_\lambda e^{2\pi i(\lambda_0^2 + \lambda_0 \cdot \lambda)} \left\{ e^{2p + S^2/2}\, e^{2(S,\lambda)} + i^{\chi_h - w_2(V)^2}\, e^{-2p - S^2/2}\, e^{-2i(S,\lambda)} \right\} SW(\lambda)$$   [25]


In this equation, $\chi$, $\sigma$ are the Euler characteristic and
signature of $X$, respectively, $\chi_h = (\chi + \sigma)/4$ is the
holomorphic Euler characteristic of $X$, and $\lambda_0$
is an integer lifting of $w_2(V)$. This formula generalizes
previous results by Witten (1994a) for
Kähler manifolds. It also follows from this formula
that the Donaldson-Witten generating function of
simply connected 4-manifolds of Seiberg-Witten
simple type and with $b_2^+(X) > 1$ satisfies


$$\left(\frac{\partial^2}{\partial p^2} - 4\right) Z_{DW}^{w_2(V)} = 0$$


which is the Donaldson simple type condition
introduced by Kronheimer and Mrowka (1994).

4. Using the u-plane integral, one can find explicit
expressions for $Z_{DW}$ in more general situations (like
non-simply-connected manifolds or manifolds which
are not of Seiberg-Witten simple type).


In the next section we explain the formalism of 
the u-plane integral introduced by Moore and 
Witten (1998), which makes possible a detailed 
derivation of the above results. 


The u-Plane Integral 
Definition of the u-Plane Integral 


The evaluation of the Donaldson-Witten generating 
function can be made by using the results of Seiberg 
and Witten (1994 a, b) on the low-energy dynamics 
of SU(2), N =2 Yang-Mills theory. In their work, 
Seiberg and Witten determined the exact low-energy 
effective action of the model up to two derivatives. 
From a physical point of view, there are certainly
corrections to this effective action which are difficult 
to evaluate. Fortunately, the computation in the 
twisted version of the theory can be done by just 
considering the Seiberg-Witten effective action. This 
is because the correlation functions in the twisted 
theory are invariant under rescalings of the metric, 
so we can evaluate them in the limit of large 
distances or equivalently of very low energies. The 
effective action up to two derivatives is sufficient for 
that purpose. 

One way of describing the main result of the work
of Seiberg and Witten is that the moduli space of
$\bar{\mathcal{Q}}$-fixed points of the twisted SO(3) $\mathcal{N} = 2$ theory on
a compact 4-manifold has two branches, which we
refer to as the Coulomb and Seiberg-Witten
branches. On the Coulomb branch the expectation
value


$$u = \frac{\langle \mathrm{Tr}\,\phi^2 \rangle}{16\pi^2}$$

breaks $SO(3) \to U(1)$ via the standard Higgs
mechanism. The Coulomb branch is simply a copy
of the complex u-plane. The low-energy effective
theory on this branch is simply the abelian $\mathcal{N} = 2$
gauge theory. However, at two points, $u = \pm 1$, there
is a singularity where the moduli space meets a
second branch, the Seiberg-Witten branch. At these
points, the effective action is given by the magnetic
dual of the U(1), $\mathcal{N} = 2$ gauge theory coupled to a
monopole matter hypermultiplet. Therefore, this
branch consists of solutions to the Seiberg-Witten
equations [24].

Since the manifold $X$ is compact, the partition
function of the twisted theory is a sum over “all”
vacuum states. Equation [23] then follows. In this
equation, $Z_u$ comes from “integrating over the
u-plane,” while $Z_{SW}$ corresponds to the points
$u = \pm 1$. As we stated before, $Z_u$ vanishes for
manifolds of $b_2^+(X) > 1$, but once this piece has
been determined, an argument originally presented
in Moore and Witten (1998) allows one to derive
the form of $Z_{SW}$ as well for arbitrary $b_2^+(X) \geq 1$.

The computation of $Z_u$ is presented in detail in
Moore and Witten (1998). The starting point of
the computation is the untwisted low-energy
theory, which has been described in detail in
Seiberg and Witten (1994 a, b) and Witten
(1995). It is an $\mathcal{N} = 2$ theory characterized by a
prepotential $\mathcal{F}$ which depends on an $\mathcal{N} = 2$ vector
multiplet. The effective gauge coupling is given by
$\tau(a) = \mathcal{F}''(a)$, where $a$ is the scalar component of
the vector multiplet. The Euclidean Lagrange
density for the u-plane theory can be obtained
simply by twisting the physical theory. It can be
written as


$$\frac{i}{6\pi} K^4 \mathcal{F}(a) + \frac{1}{16\pi} \{\bar{\mathcal{Q}}, \bar{\mathcal{F}}'' \chi (D + F_+)\} - \frac{i\sqrt{2}}{32\pi} \{\bar{\mathcal{Q}}, \bar{\mathcal{F}}'\, d * \psi\} - \frac{\sqrt{2}\, i}{3 \cdot 2^5 \pi} \{\bar{\mathcal{Q}}, \bar{\mathcal{F}}''' \chi_{\mu\nu} \chi^{\nu\lambda} \chi_\lambda{}^{\mu}\} \sqrt{g}\, d^4x + A(u)\, \mathrm{tr} R \wedge {*R} + B(u)\, \mathrm{tr} R \wedge R$$   [26]


where $A(u)$, $B(u)$ describe the coupling to gravity,
and after integration of the corresponding differential
forms we obtain terms proportional to the
signature $\sigma$ and Euler characteristic $\chi$ of $X$. The
data of the low-energy effective action can be
encoded in an elliptic curve of the form


$$y^2 = x^3 - u x^2 + \tfrac{1}{4} x$$   [27]


and $\tau$ is the modulus of the curve. The monodromy
group of this curve is $\Gamma^0(4)$. All the quantities
involved in the action can be obtained by integrating
a certain meromorphic differential on the curve, and
they can be expressed in terms of modular forms.
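
The statement that the Coulomb branch is singular exactly at $u = \pm 1$ can be read off from the curve [27]. The sketch below is a small numerical check, with hypothetical function names: it factors the right-hand side as $x(x^2 - ux + 1/4)$ and computes the discriminant of the cubic as the squared product of root differences, which is an assumption about normalization (the $\Delta$ used in the text may differ by a constant factor).

```python
import cmath


def curve_roots(u):
    """Roots of x^3 - u x^2 + x/4 = x (x^2 - u x + 1/4)."""
    s = cmath.sqrt(u * u - 1)
    return (0j, (u + s) / 2, (u - s) / 2)


def cubic_discriminant(u):
    """Squared product of root differences; equals (u^2 - 1)/16 here."""
    e1, e2, e3 = curve_roots(u)
    return ((e1 - e2) * (e1 - e3) * (e2 - e3)) ** 2


for u in (1, -1, 2, 1j):
    print(u, cubic_discriminant(u))
```

Two roots collide, and the curve degenerates, precisely when $u^2 = 1$; these are the monopole and dyon points where the Seiberg-Witten branch meets the u-plane.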

As for the operators, we have $u = O(P)$ by
definition. We may then obtain the 2-observables
from the descent procedure. The result is that
$I(S) \to \tilde{I}(S) = \int_S K^2 u = \int_S (du/da)(D_+ + F_-) + \cdots$. Here $D_+$
is the auxiliary field. Although one has $I(S) \to \tilde{I}(S)$ in
going from the microscopic theory to the effective
theory, it does not necessarily follow that
$I(S_1) I(S_2) \to \tilde{I}(S_1) \tilde{I}(S_2)$ because there can be contact
terms. If $S_1$ and $S_2$ intersect, then in passing to the
low-energy theory we integrate out massive modes.
This can induce delta function corrections to the
operator product expansion modifying the mapping
to the low-energy theory as follows:


$$\exp(I(S)) \to \exp(\tilde{I}(S) + S^2 T(u))$$   [28]


where $T(u)$ is the contact term. Such contact terms
were observed in Witten (1994a) and studied in
detail in Losev et al. (1998). It can be shown that


$$T(u) = -\frac{1}{24} E_2(\tau) \left(\frac{du}{da}\right)^2 + \frac{1}{3} u$$   [29]


where $E_2(\tau)$ is Eisenstein’s series and $da/du$ is one of
the periods of the elliptic curve [27].

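The final result of Moore and Witten is the
following expression:


$$Z_u(p, S) = \int_{\mathbb{C}} \frac{du\, d\bar{u}}{y^{1/2}}\, \mu(\tau)\, e^{2pu + S^2 \hat{T}(u)}\, \Psi$$   [30]


Here, $\hat{T}(u) = T(u) + (du/da)^2/(8\pi y)$ is the
modular-invariant completion of the contact term [29], and


$$\mu(\tau) = -\frac{\sqrt{2}}{72} \frac{d\bar\tau}{d\bar{u}} \left(\frac{da}{du}\right)^{1 - \frac{1}{2}\chi} \Delta^{\sigma/8}$$   [31]


where $y = \mathrm{Im}\,\tau$ and $\Delta$ is the discriminant of the
curve [27]. The quantity $\Psi$ is essentially a Narain-Siegel
theta function associated to the lattice
$H^2(X, \mathbb{Z})$. Notice that this lattice is Lorentzian and
has signature $(1, (-1)^{b_2 - 1})$ (since $b_2^+(X) = 1$). The
self-dual projection of a 2-form $\lambda$ can be done with
the period point $\omega$ as $\lambda_+ = (\lambda, \omega)\omega$. The lattice is
shifted by half the second Stiefel-Whitney class of
the bundle, $w_2(V)$, that is,


$$\Gamma = H^2(X, \mathbb{Z}) + \tfrac{1}{2} w_2(V)$$


and

$$\Psi = \exp\left[-\frac{1}{8\pi y} \left(\frac{du}{da}\right)^2 S_-^2\right] e^{2\pi i \lambda_0^2} \sum_{\lambda \in \Gamma} (-1)^{(\lambda - \lambda_0) \cdot w_2(X)} \left[(\lambda, \omega) + \frac{i}{4\pi y} \frac{du}{da} (S, \omega)\right] \exp\left[-i\pi\bar\tau (\lambda_+)^2 - i\pi\tau (\lambda_-)^2 - i \frac{du}{da} (S, \lambda_-)\right]$$   [32]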

Here, $w_2(X)$ is the second Stiefel-Whitney class
of $X$, and $\lambda_0$ is a choice of lifting of $w_2(V)$ to
$H^2(X, \mathbb{Z})$. This expression can be extended to the
non-simply-connected case (see Mariño and
Moore (1999) and Moore and Witten (1998)).
The study of the u-plane integral leads to a
systematic derivation of many important results
in Donaldson-Witten theory. We will discuss in
detail two such applications, Göttsche's wall-crossing
formula and Witten's “magic formula.”


Wall-Crossing Formula 


As shown by Moore and Witten, the u-plane integral
is well defined and does not depend on the period
point (hence on the metric on $X$) except for discontinuous
behavior at walls. There are two kinds of walls,
associated, respectively, to the singularities at $u = \infty$
(the semiclassical region of the underlying Yang-Mills
theory) and at $u = \pm 1$, given by


$$u = \infty: \quad \lambda_+ = 0, \quad \lambda \in H^2(X, \mathbb{Z}) + \tfrac{1}{2} w_2(V)$$
$$u = \pm 1: \quad \lambda_+ = 0, \quad \lambda \in H^2(X, \mathbb{Z}) + \tfrac{1}{2} w_2(X)$$   [33]


The first type of walls is precisely the one that
appears in Donaldson theory on manifolds of
$b_2^+(X) = 1$. The discontinuity of the u-plane integral
at these walls can be easily computed from
eqn [33]:


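$$WC_{\zeta = 2\lambda}(p, S) = -\frac{i}{2} (-1)^{(\lambda - \lambda_0,\, w_2(X))} e^{2\pi i \lambda_0^2} \left[ q^{-\lambda^2/2}\, h_\infty(\tau)^{-2}\, \vartheta_4^{\sigma}\, f_\infty^{-1} \exp\left\{2p u_\infty + S^2 T_\infty - i(\lambda, S)/h_\infty\right\} \right]_{q^0}$$   [34]

This expression involves the modular forms
$h_\infty$, $f_\infty$, $u_\infty$, and $T_\infty$ (the subscript $\infty$ refers to the
fact that they are computed at the “electric” frame
which is appropriate for the Seiberg-Witten curve at
$u \to \infty$). They can be written in terms of Jacobi
theta functions $\vartheta_i(q)$, with $q = e^{2\pi i \tau}$, and their
explicit expression is


$$h_\infty(q) = \tfrac{1}{2} \vartheta_2(q) \vartheta_3(q), \qquad f_\infty(q) = \frac{\vartheta_2(q) \vartheta_3(q)}{2 \vartheta_4^8(q)}$$

$$u_\infty(q) = \frac{\vartheta_2^4(q) + \vartheta_3^4(q)}{2 (\vartheta_2(q) \vartheta_3(q))^2}, \qquad T_\infty(q) = -\frac{1}{24} \frac{E_2(q)}{h_\infty^2(q)} + \frac{1}{3} u_\infty(q)$$   [35]

The subindex $q^0$ means that in the expansion in $q$ of
the modular forms, we pick the constant term. The
formula [34] agrees with the formula of Göttsche
(1996) for the wall crossing of the Donaldson-Witten
generating functional.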
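
The modular forms in [35] are easy to evaluate numerically from truncated theta series. The sketch below is illustrative only: it assumes the convention $\vartheta_2(q) = \sum_n q^{(n+1/2)^2/2}$, $\vartheta_3(q) = \sum_n q^{n^2/2}$, $\vartheta_4(q) = \sum_n (-1)^n q^{n^2/2}$ for real $0 < q < 1$ (the text does not spell out its normalization), and the function names are hypothetical. It checks the Jacobi identity $\vartheta_3^4 = \vartheta_2^4 + \vartheta_4^4$ and the leading behavior $u_\infty \approx 1/(8 q^{1/4})$, which shows that $q \to 0$ indeed corresponds to $u \to \infty$.

```python
def theta2(q, terms=30):
    """Truncated series for theta_2 in the assumed convention."""
    return sum(q ** ((n + 0.5) ** 2 / 2) for n in range(-terms, terms + 1))


def theta3(q, terms=30):
    """Truncated series for theta_3 in the assumed convention."""
    return sum(q ** (n * n / 2) for n in range(-terms, terms + 1))


def theta4(q, terms=30):
    """Truncated series for theta_4 in the assumed convention."""
    return sum((-1) ** n * q ** (n * n / 2) for n in range(-terms, terms + 1))


def h_inf(q):
    """h_infinity = (1/2) theta_2 theta_3, as in [35]."""
    return 0.5 * theta2(q) * theta3(q)


def u_inf(q):
    """u_infinity = (theta_2^4 + theta_3^4) / (2 (theta_2 theta_3)^2)."""
    t2, t3 = theta2(q), theta3(q)
    return (t2 ** 4 + t3 ** 4) / (2 * (t2 * t3) ** 2)


q = 0.1
print(theta3(q) ** 4 - theta2(q) ** 4 - theta4(q) ** 4)  # Jacobi identity residual
print(u_inf(1e-8) * 8 * (1e-8) ** 0.25)                  # close to 1 for small q
```

Extracting the $q^0$ coefficient in [34] would require working with the series symbolically rather than numerically; that step is not attempted here.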


The Seiberg-Witten Contribution and Witten's 
Magic Formula 


At $u = \pm 1$, $Z_u$ jumps at the second type of
walls [33], which are called Seiberg-Witten (SW)
walls. In fact, these walls are labeled by classes
$\lambda \in H^2(X; \mathbb{Z}) + (1/2) w_2(X)$, which correspond to
Spin$^c$ structures on $X$. At these walls, the Seiberg-Witten
invariants have wall-crossing behavior. Since
the Donaldson polynomials do not jump at SW
walls, it must happen that the change of $Z_u$ at $u = \pm 1$
is canceled by the change of $Z_{SW}$. As shown by
Moore and Witten, this actually allows one
to obtain a precise expression for $Z_{SW}$ for general
4-manifolds of $b_2^+(X) \geq 1$.

On general grounds, $Z_{SW}$ is given by the sum of
the generating functionals at $u = \pm 1$. These involve
a magnetic U(1), $\mathcal{N} = 2$ vector multiplet coupled to a
hypermultiplet (the monopole field). The twisted
Lagrangian for such a system involves the magnetic
prepotential $\mathcal{F}_D(a_D)$, and it can be written as


$$\{\bar{\mathcal{Q}}, W\} + \frac{i}{16\pi} \tau_D\, F \wedge F + p(u)\, \mathrm{tr} R \wedge {*R} + \ell(u)\, \mathrm{tr} R \wedge R - \frac{i\sqrt{2}}{32\pi} \frac{d\tau_D}{da_D} (\psi \wedge \psi) \wedge F + \frac{i}{3 \cdot 2^{11} \pi} \frac{d^2\tau_D}{da_D^2}\, \psi \wedge \psi \wedge \psi \wedge \psi$$   [36]


where $\tau_D = \mathcal{F}_D''(a_D)$. Using the cancellation of wall
crossings, one can actually compute the functions
$\mathcal{F}_D(a_D)$, $p(u)$, $\ell(u)$ and determine the precise form of
the Seiberg-Witten contributions. One finds that a
Spin$^c$ structure $\lambda$ at $u = 1$ gives the following contribution
to the Donaldson-Witten generating functional:

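$$Z_{u=1,\lambda}(p, S) = \frac{SW(\lambda)}{16}\, e^{2\pi i(\lambda_0^2 - \lambda_0 \cdot \lambda)} \left[ q_D^{-\lambda^2/2}\, \frac{\vartheta_2^{8+\sigma}}{a_D h_M} \left(\frac{2i a_D}{h_M^2}\right)^{\chi_h} \exp\left(2p u_M + i(\lambda, S)/h_M + S^2 T_M\right) \right]_{q_D^0}$$   [37]


Here, $a_D$, $h_M$, $u_M$, and $T_M$ are modular forms
that can be expressed as well in terms of Jacobi
theta functions $\vartheta_i(q_D)$, where $q_D = \exp(2\pi i \tau_D)$.
The subscript M refers to the monopole point,
and they are related by an S-transformation
to the quantities obtained in the “electric”
frame at $u \to \infty$. Their explicit expression is


$$a_D(q_D) = -\frac{i}{6}\, \frac{2E_2(q_D) - \vartheta_3^4(q_D) - \vartheta_4^4(q_D)}{\vartheta_3(q_D) \vartheta_4(q_D)}, \qquad h_M(q_D) = \frac{1}{2i} \vartheta_3(q_D) \vartheta_4(q_D)$$

$$u_M(q_D) = \frac{1}{2}\, \frac{\vartheta_3^4(q_D) + \vartheta_4^4(q_D)}{(\vartheta_3(q_D) \vartheta_4(q_D))^2}, \qquad T_M(q_D) = -\frac{1}{24} \frac{E_2(q_D)}{h_M^2(q_D)} + \frac{1}{3} u_M(q_D)$$   [38]

The contribution at $u = -1$ is related to the
contribution at $u = 1$ by a $u \to -u$ symmetry:


$$Z_{u=-1}(p, S) = e^{-2\pi i \lambda_0^2}\, i^{\chi_h}\, Z_{u=1}(-p, -iS)$$   [39]


If the manifold has $b_2^+(X) > 1$ and is of Seiberg-Witten
simple type, [37] reduces to

$$Z_{u=1,\lambda}(p, S) = 2^{1 + \frac{7\chi}{4} + \frac{11\sigma}{4}}\, e^{2p + S^2/2}\, e^{-2(S,\lambda)}\, e^{2\pi i(\lambda_0^2 - \lambda_0 \cdot \lambda)}\, SW(\lambda)$$   [40]

This leads to Witten’s “magic formula” [25] which
expresses the Donaldson invariants in terms of
Seiberg-Witten invariants.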


Other Applications of the u-Plane Integral 


The u-plane integral makes it possible to derive other
results on the Donaldson-Witten generating
functional.

The blow-up formula. This relates the function
$Z_{DW}$ on $X$ to $Z_{DW}$ on the blown-up manifold $\hat{X}$.
The u-plane integral leads directly to the general
blow-up formula of Fintushel and Stern (1996).

Direct evaluations. The u-plane integral can be
evaluated directly in many cases, and this leads to
explicit formulas for the Donaldson-Witten generating
functional of certain 4-manifolds with $b_2^+(X) = 1$,
on certain chambers, and in terms of modular forms.
For example, there are explicit formulas for the
Donaldson-Witten generating functional of product
ruled surfaces of the form $S^2 \times \Sigma_g$, in the limiting
chambers in which $S^2$ or $\Sigma_g$ are very small (Moore
and Witten 1998, Mariño and Moore 1999). Moore
and Witten (1998) have also derived an explicit
formula for the Donaldson invariants of $\mathbb{CP}^2$ in terms
of Hurwitz class numbers.


Extensions of Donaldson-Witten Theory 


Donaldson-Witten theory is a twisted version of
SU(2), $\mathcal{N} = 2$ Yang-Mills theory. The twisting of
more general $\mathcal{N} = 2$ gauge theories, involving other
gauge groups and/or matter content, leads to other
topological field theories that give interesting
generalizations of Donaldson-Witten theory. We now
briefly list some of these extensions and their most
important properties.

Higher-rank theories. The extension of
Donaldson-Witten theory to other gauge groups has been
studied in detail in Mariño and Moore (1998b) and
Losev et al. (1998). One can study the higher-rank
generalization of the u-plane integral, and as shown
in Mariño and Moore (1998b), this leads to a fairly
explicit formula for the Donaldson-Witten generating
function in the SU(N) case, for manifolds with
$b_2^+ > 1$ and of Seiberg-Witten simple type. Mathematically,
higher-rank generalizations of Donaldson
theory turn out to be much more complicated, but
they can be studied. In particular, higher-rank
generalizations of the Donaldson invariants can be
defined and computed (Kronheimer 2004), and the
results so far agree with the predictions of Mariño
and Moore (1998b). Unfortunately, it seems that
these higher-rank generalizations do not contain
new topological information beyond that
encoded in the Seiberg-Witten invariants.

Theories with matter. Twisted SU(2), $\mathcal{N} = 2$
theories with hypermultiplets lead to generalizations
of Donaldson-Witten theory involving nonabelian
monopole equations (see Mariño (1997) and
Labastida and Mariño (2005) for a review of these
models and some of their properties). The u-plane
integral leads to explicit formulas for the generating
functionals of these theories, which for manifolds of
$b_2^+ > 1$ can be written in terms of Seiberg-Witten
invariants. Again, no new topological information
seems to be encoded in these theories. One can,
however, exploit new physical phenomena arising in
the theories with hypermultiplets (in particular, the
presence of superconformal points) to obtain new
information about the Seiberg-Witten invariants
(see Mariño et al. (1999) for these developments).

Vafa-Witten theory. The so-called Vafa-Witten
theory is a close cousin of Donaldson-Witten theory,
and was introduced by Vafa and Witten (1994) as a
topological twist of $\mathcal{N} = 4$ Yang-Mills theory. In
some cases, the partition function of this theory
counts the Euler characteristic of the moduli space of
instantons on the 4-manifold $X$. For a review of some
properties of this theory, see Lozano (1999).


See also: Duality in Topological Quantum Field Theory; 
Mathai-Quillen Formalism; Seiberg-Witten Theory; 
Topological Quantum Field Theory: Overview. 
Introduction


There have been many exciting interactions between 
physics and mathematics in the past few decades. 
A prominent role in these interactions has been 
played by certain field theories, known as topologi- 
cal quantum field theories (TQFTs). These are 
quantum field theories whose correlation functions 
are metric independent and, in fact, compute certain 
mathematical invariants (Birmingham et al. 1991, 
Cordes et al. 1996, Labastida and Lozano 1998). 

Well-known examples of TQFTs are, in two 
dimensions, the topological sigma models (Witten 
1988a), which are related to Gromov—Witten invar- 
iants and enumerative geometry; in three dimen- 
sions, Chern-Simons theory (Witten 1989), which is 
related to knot and link invariants; and in four 
dimensions, topological Yang-Mills theory (or 
Donaldson-Witten theory) (Witten 1988b), which 
is related to the Donaldson invariants. The two- and 
four-dimensional theories above are examples of 
cohomological (also Witten-type) TQFTs. As such, 
they are related to an underlying supersymmetric 
quantum field theory (the N=2 nonlinear sigma
model, and the N=2 supersymmetric Yang-Mills 
theory, respectively) and there is no difference 
between the topological and the standard version 
on flat space. However, when one considers curved 
spaces, the topological version differs from the 
supersymmetric theory on flat space in that some 
of the fields have modified Lorentz transformation 
properties (spins). This unconventional spin assign- 
ment is also known as twisting, and it comes about 
basically to preserve supersymmetry on curved 
space. In fact, the twisting gives rise to at least one 
nilpotent scalar supercharge Q, which is a certain 
linear combination of the original (spinor) super- 
symmetry generators. 


In these theories the energy-momentum tensor is
Q-exact, that is,


$$T_{\mu\nu} = \{Q, \Lambda_{\mu\nu}\}$$


for some $\Lambda_{\mu\nu}$, which (barring potential anomalies)
leads to the statement that the correlation functions
of operators in the cohomology of Q are all metric
independent. Furthermore, the corresponding path
integrals are localized to field configurations that are
annihilated by Q, and this typically leads to some
moduli problem related to the computation of
certain mathematical invariants.

On the other hand, in Chern-Simons theory, as a 
representative of the so-called Schwarz-type topolo- 
gical theories, the topological character is manifest: 
one starts with an action which is explicitly 
independent of the metric on the 3-manifold, and 
thus correlation functions of metric-independent 
operators are topological invariants as long as 
quantization does not introduce any undesired 
metric dependence. 

Even though the primary motivation for introdu- 
cing TQFTs may be to shed light onto awkward 
mathematical problems, they have proved to be a 
valuable tool to gain insight into many questions of 
interest in physics as well. One such question where 
TQFTs can (and in fact do) play a role is duality. In 
what follows, an overview of the manifestations of 
duality is provided in the context of TQFTs. 


Duality 


The notion of duality is at the heart of some of the 
most striking recent breakthroughs in physics and 
mathematics. In broad terms, a duality (in physics) 
is an equivalence between different (and often 
complementary) descriptions of the same physical 
system. The prototypical example is electric- 
magnetic (abelian) duality. Other, more sophisti- 
cated, examples are the various string-theory 
dualities, such as T-duality (and its more specialized 
realization, mirror symmetry) and strong/weak 
coupling S-duality, as well as field theory dualities 
such as Montonen-Olive duality and Seiberg- Witten 
effective duality. 

Also, the original 't Hooft conjecture, stating that 
SU(N) gauge theories are equivalent (or dual), at 
large N, to string theories, has recently been revived 
by Maldacena (1998) by explicitly identifying the 
string-theory duals of certain (supersymmetric) 
gauge theories. 

One could wonder whether similar duality sym- 
metries work for TQFTs as well. As noted in the 
following, this is indeed the case. 

In two dimensions, topological sigma models
come under two different versions, known as types
A and B, respectively, which correspond to the
two different ways in which N=2 supersymmetry
can be twisted in two dimensions. Computations
in each model localize on different moduli spaces
and, for a given target manifold, give different
results, but it turns out that if one considers
mirror pairs of Calabi-Yau manifolds,
computations in one manifold with the A-model are
equivalent to computations in the mirror manifold
with the B-model.

Also, in three dimensions, a program has been 
initiated to explore the duality between large N 
Chern-Simons gauge theory and topological strings, 
thereby establishing a link between enumerative 
geometry and knot and link invariants 
(Gopakumar and Vafa 1998). 

Perhaps the most impressive consequences of the 
interplay between duality and TQFTs have come out 
in four dimensions, on which we will focus in what 
follows. 


Duality in Twisted N=2 Theories


As mentioned above, topological Yang-Mills theory
(or Donaldson-Witten theory) can be constructed by
twisting the pure N=2 supersymmetric Yang-Mills
theory with gauge group SU(2). This theory contains
a gauge field $A$, a pair of chiral spinors $\lambda_1$, $\lambda_2$, and a
complex scalar field $B$. The twisted theory contains a
gauge field $A$, bosonic scalars $\lambda$, $\phi$, a Grassmann-odd
scalar $\eta$, a Grassmann-odd vector $\psi$, and a Grassmann-odd
self-dual 2-form $\chi$.

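On a 4-manifold $X$, and for gauge group $G$, the
twisted action has the form


$$S = \int_X d^4x \sqrt{g}\, \mathrm{tr}\left( F^{+2} - i\chi^{\mu\nu} D_\mu \psi_\nu + i\eta D_\mu \psi^\mu + \frac{1}{4} \phi \{\chi_{\mu\nu}, \chi^{\mu\nu}\} + \frac{i}{4} \lambda \{\psi_\mu, \psi^\mu\} - \lambda D_\mu D^\mu \phi + \frac{i}{2} \phi \{\eta, \eta\} + \frac{1}{8} [\phi, \lambda]^2 \right)$$   [1]


where $F^+$ is the self-dual part of the Yang-Mills field
strength $F$. The action [1] is invariant under the
transformations generated by the scalar supercharge Q:


$$\{Q, A_\mu\} = \psi_\mu, \qquad \{Q, \chi^+_{\mu\nu}\} = F^+_{\mu\nu}$$
$$\{Q, \psi_\mu\} = D_\mu \phi, \qquad \{Q, \eta\} = i[\lambda, \phi]$$   [2]
$$\{Q, \phi\} = 0, \qquad \{Q, \lambda\} = \eta$$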


In these transformations, $Q^2$ is a gauge transformation
with gauge parameter $\phi$, modulo field equations.
Observables are, therefore, related to the
G-equivariant cohomology of Q (i.e., the cohomology
of Q restricted to gauge invariant operators).
Auxiliary fields can be introduced so that the
action [1] is Q-exact, that is,


$$S = \{Q, \Lambda\}$$   [3]


for $\Lambda$ a certain functional of the fields of the theory
which comes under the name of gauge fermion, a
BRST-inspired terminology which reflects the formal
resemblance of topological cohomological field
theories with some aspects of the BRST approach
to the quantization of gauge theories. Before
constructing the topological observables of the theory,
we begin by pointing out that for each independent
Casimir of the gauge group $G$ it is possible to
construct an operator $W_0$, from which operators $W_i$
can be defined recursively through the descent
equations $\{Q, W_i\} = dW_{i-1}$. For example, for the
quadratic Casimir,


$$W_0 = \frac{1}{8\pi^2} \mathrm{tr}(\phi^2)$$   [4]


which generates the following family of operators:


$$W_1 = \frac{1}{4\pi^2} \mathrm{tr}(\phi\psi)$$
$$W_2 = \frac{1}{4\pi^2} \mathrm{tr}\left(\tfrac{1}{2} \psi \wedge \psi + \phi \wedge F\right)$$   [5]
$$W_3 = \frac{1}{4\pi^2} \mathrm{tr}(\psi \wedge F)$$

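Using these one defines the following observables:

$$O^{(\gamma_k)} = \int_{\gamma_k} W_k$$   [6]


where $\gamma_k \in H_k(X)$ is a k-cycle on the 4-manifold $X$.
The descent equations imply that they are Q-closed
and depend only on the homology class of $\gamma_k$.
Topological invariants are constructed by taking
vacuum expectation values of products of the
operators $O^{(\gamma_k)}$:


$$\langle O^{(\gamma_{k_1})} O^{(\gamma_{k_2})} \cdots O^{(\gamma_{k_p})} \rangle = \int O^{(\gamma_{k_1})} O^{(\gamma_{k_2})} \cdots O^{(\gamma_{k_p})}\, e^{-S/e^2}$$   [7]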


where the integration has to be understood on the
space of field configurations modulo gauge transformations,
and $e$ is a coupling constant. Standard
arguments show that due to the Q-exactness of the
action $S$, the quantities obtained in [7] are independent
of $e$. This implies that the observables of the
theory can be obtained either in the weak-coupling
limit $e \to 0$ (also short-distance or ultraviolet regime,
since the N=2 theory is asymptotically free), where
perturbative methods apply, or in the strong-coupling
(also long-distance or infrared) limit $e \to \infty$, where
one is forced to consider a nonperturbative approach.

In the weak-coupling limit one proves that
the correlation functions [7] descend to polynomials
in the product cohomology of the moduli space
of anti-self-dual (ASD) instantons
$H^{k_1}(\mathcal{M}_{\mathrm{ASD}}) \times H^{k_2}(\mathcal{M}_{\mathrm{ASD}}) \times \cdots \times H^{k_p}(\mathcal{M}_{\mathrm{ASD}})$, which are precisely
the Donaldson polynomial invariants of $X$. However,
the weak-coupling analysis does not add any
new ingredient to the problem of the actual
computation of the invariants. The difficulties that
one has to face in the field theory representation are
similar to those in ordinary Donaldson theory.

Nevertheless, the field theory connection is very 
important since in this theory the strong- and weak- 
coupling limits are exact, and therefore the door is 
open to find a strong-coupling description which 
could lead to a new, simpler representation for the 
Donaldson invariants. 

This alternative strategy was pursued by Witten 
(1994a), who found the strong-coupling realization 
of the Donaldson—Witten theory after using the 
results on the strong-coupling behavior of N =2 
supersymmetric gauge theories which he and Seiberg 
(Seiberg and Witten 1994a-c) had discovered. The 
key ingredient in Witten's derivation was to assume
that the strong-coupling limit of Donaldson-Witten
theory is equivalent to the “sum” over the twisted
effective low-energy descriptions of the corresponding
N = 2 physical theory. This “sum” is not entirely
a sum, as in general it has a part which contains a
continuous integral. The “sum” is now known as
integration over the u-plane after the work of Moore
and Witten (1998). Witten's (1994a) assumption
can be simply stated as saying that the weak-/strong-coupling
limit and the twist commute. In
other words, to study the strong-coupling limit of 
the topological theory, one first untwists, then 
works out the strong-coupling limit of the physical 
theory and, finally, one twists back. From such a 
viewpoint, the twisted effective (strong-coupling)
theory can be regarded as a TQFT dual to the
original one. In addition, one could ask for the dual
moduli problem associated to this dual TQFT. It
turns out that in many interesting situations
($b_2^+(X) > 1$) the dual moduli space is an abelian
system corresponding to the Seiberg-Witten or 
monopole equations (Witten 1994a). The topologi- 
cal invariants associated with this new moduli space 
are the celebrated Seiberg-Witten invariants. 

Generalizations of Donaldson-Witten theory, with 
either different gauge groups and/or additional matter 
content (such as, e.g., twisted N=2 Yang-Mills 
multiplets coupled to twisted N=2 matter multi- 
plets) are possible, and some of the possibilities have 
in fact been explored (see Moore and Witten (1998) 
and references therein). The main conclusion that 
emerges from these analyses is that, in all known 
cases, the relevant topological information is captured
by the Seiberg-Witten invariants, irrespective
of the gauge group and matter content of the theory
under consideration. These cases are not reviewed
here; rather, attention is turned to the twisted
theories which emerge from N = 4 supersymmetric
gauge theories.


Duality in Twisted N = 4 Theories


Unlike the N = 2 supersymmetric case, the N = 4
supersymmetric Yang-Mills theory in four dimensions
is unique once the gauge group G is fixed.
The microscopic theory contains a gauge or gluon
field, four chiral spinors (the gluinos) and six real
scalars. All these fields are massless and take
values in the adjoint representation of the gauge
group. The theory is finite and conformally
invariant, and is conjectured to have a duality
symmetry exchanging strong and weak coupling
and exchanging electric and magnetic fields, which
extends to a full SL(2, Z) symmetry acting on the
microscopic complexified coupling $\tau$ (Montonen and
Olive 1977).

As in the N = 2 case, the N = 4 theory can be
twisted to obtain a topological model, except that, in
this case, the topological twist can be performed in
three inequivalent ways, giving rise to three different
TQFTs (Vafa and Witten 1994). A natural question
to answer is whether the duality properties of the
N = 4 theory are shared by its twisted counterparts
and, if so, whether one can take advantage of the
calculability of topological theories to shed some
light on the behavior and properties of duality.

The answer is affirmative, but it is instructive to
clarify a few points. First, as mentioned above, the
topological observables in twisted N = 2 theories are
independent of the coupling constant e, so the
question arises as to how the twisted N = 4 theories
come to depend on the coupling constant. As it turns
out, twisted N = 2 supersymmetric gauge theories
have an off-shell formulation such that the TQFT
action can be expressed as a Q-exact expression,
where Q is the generator of the topological
symmetry. Actually, this is true only up to a
topological $\theta$-term $\int_X \mathrm{tr}(F \wedge F)$,

$$S = \frac{1}{e^2} \int_X \sqrt{g}\, d^4x\, \{Q, \Lambda\} + (\theta\text{-term})$$ [9]

for some $\Lambda$. However, the N = 2 supersymmetric
gauge theories possess a global U(1) chiral symmetry
which is generically anomalous, so one can actually
get rid of the $\theta$-term with a chiral rotation. As a
result of this, the observables in the topological
theory are insensitive to $\theta$-terms (and hence to $\tau$
and e) up to a rescaling.

On the other hand, in N = 4 supersymmetric
gauge theories $\theta$-terms are observable. There is no
chiral anomaly and these terms cannot be shifted
away as in the N = 2 case. This means that in the
twisted theories one might have a dependence on the
coupling constant $\tau$, and that, up to anomalies,
this dependence should be holomorphic (resp.
antiholomorphic if one reverses the orientation of
the 4-manifold). In fact, on general grounds, one
would expect the partition functions of the
twisted theories on a 4-manifold X and for gauge
group G to take the generic form

$$Z_X(G) = q^{-c} \sum_k q^k\, \chi(\mathcal{M}_k)$$ [10]

where $q = e^{2\pi i \tau}$, c is a universal constant (depending
on X and G), $k = (1/16\pi^2) \int_X \mathrm{tr}(F \wedge F)$ is the
instanton number, and $\chi(\mathcal{M}_k)$ encodes the topological
information corresponding to a sector of the
moduli space of the theory with instanton number k.

Now we can be more precise as to how we expect
to see the Montonen-Olive duality in the twisted
N = 4 theories. First, under $\tau \to -1/\tau$ the gauge
group G gets exchanged with its dual group
$\hat G$. Correspondingly, the partition functions should
behave as modular forms

$$Z_G(-1/\tau) = \kappa(X, G)\, \tau^{w}\, Z_{\hat G}(\tau)$$ [11]

where $\kappa$ is a constant (depending on X and G), and
the modular weight w should depend on X in such a
way that it vanishes on flat space.

In addition to this, in the N = 4 theory all the
fields take values in the adjoint representation of G.
Hence, if $H^2(X, \pi_1(G)) \neq 0$, it is possible to consider
nontrivial G/Center(G) gauge configurations with
discrete magnetic 't Hooft flux through the 2-cycles
of X. In fact, G/Center(G) bundles on X are
classified by the instanton number and a characteristic
class $v \in H^2(X, \pi_1(G))$. For example, if
G = SU(2), we have $\hat G = SU(2)/Z_2 = SO(3)$ and v is
the second Stiefel-Whitney class $w_2(E)$ of the gauge
bundle E. This Stiefel-Whitney class can be represented
in de Rham cohomology by a class in
$H^2(X, Z)$ defined modulo 2, that is, $w_2(E)$ and
$w_2(E) + 2\omega$, with $\omega \in H^2(X, Z)$, represent the same
't Hooft flux, so if $w_2(E) = 2\lambda$ for some
$\lambda \in H^2(X, Z)$, then the gauge configuration is trivial in
SO(3) (it has no 't Hooft flux).

Similarly, for G = SU(N) (for which $\hat G = SU(N)/Z_N$),
one can fix fluxes in $H^2(X, Z_N)$ (the
corresponding Stiefel-Whitney class is defined modulo
N). One has, therefore, a family of partition
functions $Z_v(\tau)$, one for each magnetic flux v. The
SU(N) partition function is obtained by considering
the zero flux partition function (up to a constant
factor), while the (dual) $SU(N)/Z_N$ partition function
is obtained by summing over all v, and both are
to be exchanged under $\tau \to -1/\tau$. The action of
SL(2, Z) on the $Z_v$ should be compatible with this
exchange, and thus the $\tau \to -1/\tau$ operation mixes
the $Z_v$ by a discrete Fourier transform which, for
G = SU(N), reads

$$Z_v(-1/\tau) = \kappa(X, SU(N))\, \tau^{w} \sum_u e^{2\pi i\, u \cdot v/N}\, Z_u(\tau)$$ [12]


We are now in a position to examine the (three) 
twisted theories in some detail. For further details 
and references, the reader is referred to Lozano 
(1999). 
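
Before turning to the individual theories, the compatibility between the flux-wise discrete Fourier transform and the exchange of SU(N) with SU(N)/Z_N described above can be checked numerically. The following is a minimal sketch under loudly stated assumptions: the flux lattice is modeled as (Z_N)^b with the plain dot product standing in for the intersection pairing on $H^2(X, Z_N)$, the values of $Z_u$ are random placeholders rather than actual partition functions, and the overall prefactor (constant and power of $\tau$) is dropped. The function name `flux_fourier_matrix` is hypothetical.

```python
import itertools
import numpy as np

def flux_fourier_matrix(N, b):
    """Return all fluxes in (Z_N)^b and the matrix F[v, u] = exp(2*pi*i*(u.v)/N).

    Assumption: u.v is the ordinary dot product mod N, a stand-in for the
    intersection pairing on H^2(X, Z_N).
    """
    fluxes = np.array(list(itertools.product(range(N), repeat=b)))
    pairing = (fluxes @ fluxes.T) % N
    return fluxes, np.exp(2j * np.pi * pairing / N)

N, b = 3, 2
fluxes, F = flux_fourier_matrix(N, b)

# Placeholder values of Z_u at some fixed tau (not real partition functions).
rng = np.random.default_rng(0)
Z = rng.normal(size=len(fluxes))

# Flux-wise transform, up to the omitted prefactor.
Z_dual = F @ Z

# Index 0 is the zero flux (0, ..., 0), the first tuple from itertools.product.
zero_flux_after = Z_dual[0]      # equals the sum over all fluxes before
sum_after = Z_dual.sum()         # equals N**b times the zero-flux value before
```

In this toy setting the zero-flux component after the transform is the sum over all fluxes before it, and vice versa up to the factor $N^b$, which is the pattern expected if the zero-flux sector (SU(N)) and the sum over fluxes (SU(N)/Z_N) are exchanged under $\tau \to -1/\tau$.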

The first twisted theory considered here possesses
only one scalar supercharge (and hence comes
under the name of “half-twisted theory”). It is a
nonabelian generalization of the Seiberg-Witten
abelian monopole theory, but with the monopole
multiplets taking values in the adjoint representation
of the gauge group. The theory can be
perturbed by giving masses to the monopole multiplets
while still retaining its topological character.
The resulting theory is the twisted version of the
mass-deformed N = 4 theory, which preserves
N = 2 supersymmetry and whose low-energy effective
description is known. This connection with
N = 2 theories, and its topological character,
makes it possible to go to the long-distance limit
and compute in terms of the twisted version of the
low-energy effective description of the supersymmetric
theory. Below, we review how the u-plane
approach works for gauge group SU(2).

The twisted theory for gauge group SU(2) has a
U(1) global symmetry (the ghost number) which
has an anomaly $-3(2\chi + 3\sigma)/4$ on gravitational
backgrounds (i.e., on curved manifolds). Nontrivial
topological invariants are thus obtained by considering
the vacuum expectation value of products
of observables with ghost numbers adding up to
$-3(2\chi + 3\sigma)/4$. The relevant observables for this
theory and gauge group SU(2) or SO(3) are
precisely the same as in the Donaldson-Witten
theory (eqns [4] and [5]). In addition to this, it is
possible to enrich the theory by including sectors
with nontrivial nonabelian electric and magnetic 't
Hooft fluxes which, as pointed out above, should
behave under SL(2, Z) duality in a well-defined
fashion.

The generating function for these correlation
functions is given as an integration over the moduli
space of vacua of the physical theory (the u-plane),
which, for generic values of the mass parameter,
forms a one-dimensional complex compact manifold
(described by a complex variable customarily
denoted by u, hence the name), which parametrizes
a family of elliptic curves that encodes all the
relevant information about the low-energy effective
description of the theory. At a generic point in the
moduli space of vacua, the only contribution to the
topological correlation functions comes from a
twisted N = 2 abelian vector multiplet. Additional
contributions come from points in the moduli space
where the low-energy effective description is singular
(i.e., where the associated elliptic curve
degenerates).

Therefore, the total contribution to the generating
function consists of an integration over the
moduli space with the singularities removed (which
is nonvanishing only for $b_2^+(X) = 1$; Moore and Witten
1998) plus a discrete sum over the contributions
of the twisted effective theories at each of the
three singularities of the low-energy effective
description (Seiberg and Witten 1994a, b, c). The
effective theory at a given singularity contains,
together with the appropriate dual photon multiplet,
one charged hypermultiplet, which corresponds to
the state becoming massless at the singularity. The
complete effective action for these massless states
also contains certain measure factors and contact
terms among the observables, which reproduce the
effect of the massive states that have been integrated
out as well as incorporate the coupling to gravity
(i.e., explicit nonminimal couplings to the metric of
the 4-manifold). How to determine these a priori
unknown functions was explained in Moore and
Witten (1998). The idea is as follows. At points on
the u-plane where the (imaginary part of the)
effective coupling diverges, the integral is discontinuous
at anti-self-dual abelian gauge configurations.
This is commonly referred to as “wall crossing.”
Wall crossing can take place at the singularities of
the moduli space (the appropriate local effective
coupling $\tau_{\rm eff}$ diverges there) and, in the case of the
asymptotically free theories, at the point at infinity
(the effective electric coupling diverges owing to
asymptotic freedom).

On the other hand, the final expression for the
invariants can exhibit a wall-crossing behavior at
most at $u \to \infty$, so the contribution to wall crossing
from the integral at the singularities at finite values
of u must cancel against the contributions coming
from the effective theories there, which also display
wall-crossing discontinuities. Imposing this
cancellation fixes almost completely the unknown
functions in the contributions to the topological
correlation functions from the singularities. The
final result for the contributions from the singularities
(which give the complete answer for the
correlation functions when $b_2^+(X) > 1$) is written
explicitly and completely in terms of the fundamental
periods da/du (written in the appropriate
local variables) and the discriminant of the elliptic
curve comprising the Seiberg-Witten solution for
the physical theory. For simply connected spin
4-manifolds of simple type the generating function
is given by

where x is a Seiberg-Witten basic class (and $n_x$ is
the corresponding Seiberg-Witten invariant), m is
the mass parameter of the theory, $\nu = (\chi + \sigma)/4$,
$v \in H^2(X, Z_2)$ is a 't Hooft flux, S is the formal
sum $S = \sum_a \alpha_a \Sigma_a$ (and, correspondingly,
$I(S) = \sum_a \alpha_a I(\Sigma_a)$, with $I(\Sigma_a) = \int_{\Sigma_a} W_2$),
where $\{\Sigma_a\}_{a=1,\ldots,b_2(X)}$
form a basis of $H_2(X)$ and $\alpha_a$ are constant parameters,
while $\eta(\tau)$ is the Dedekind function,
$\kappa_j = (du/dq_{\rm eff})_{u=u_j}$ (with $q_{\rm eff} = \exp(2\pi i \tau_{\rm eff})$, and $\tau_{\rm eff}$ is the
ratio of the fundamental periods of the elliptic curve),
and the contact terms $T_j$ have the form


1 /du\? ; m. 
i= = (=) 4- E3(7) 4+ = Ea(7) [14] 


with $E_2$ and $E_4$ the Eisenstein series of weights 2 and
4, respectively. Evaluating the quantities in [13]
gives the final result as a function of the physical
parameters $\tau$ and m, and of topological data of X such as
the Euler characteristic $\chi$, the signature $\sigma$ and the
basic classes x. The expression [13] has to be
understood as a formal power series in p and $\alpha_a$,
whose coefficients give the vacuum expectation
values of products of $\mathcal{O} = W_0$ and $I(\Sigma_a)$.


The generating function [13] has nice properties
under the modular group. For the partition function $Z_v$,

$$Z_v(\tau + 1) = (-1)^{\sigma/8}\, i^{-v^2}\, Z_v(\tau)$$
$$Z_v(-1/\tau) = 2^{-b_2/2}\, (-1)^{\sigma/8}\, \tau^{-\chi/2} \sum_u (-1)^{u \cdot v}\, Z_u(\tau)$$ [15]

Also, with $Z_{SU(2)} = 2^{-1} Z_{v=0}$ and $Z_{SO(3)} = \sum_v Z_v$,

$$Z_{SU(2)}(\tau + 1) = (-1)^{\sigma/8}\, Z_{SU(2)}(\tau)$$
$$Z_{SO(3)}(\tau + 2) = Z_{SO(3)}(\tau)$$ [16]
$$Z_{SU(2)}(-1/\tau) = (-1)^{\sigma/8}\, 2^{-\chi/2}\, \tau^{-\chi/2}\, Z_{SO(3)}(\tau)$$

Notice that the last of these three equations
corresponds precisely to the strong-weak coupling
duality transformation conjectured by Montonen
and Olive (1977).


As for the correlation functions, one finds the
following behavior under the inversion of the coupling:

Therefore, as expected, the partition function of
the twisted theory transforms as a modular form,
while the topological correlation functions turn out
to transform covariantly under SL(2, Z), following a
pattern which can be reproduced with a far
simpler topological abelian model.

The second example, considered next, is the Vafa-Witten
(1994) theory. This theory possesses two
scalar supercharges, and has the unusual feature that
the virtual dimension of its moduli space is exactly
zero (it is an example of a balanced TQFT), and
therefore the only nontrivial topological observable 
is the partition function itself. Furthermore, the 
twisted theory does not contain spinors, so it is well 
defined on any compact, oriented 4-manifold. 

Now this theory computes, with the subtleties
explained in Vafa and Witten (1994), the Euler
characteristic of instanton moduli spaces. In fact, in
this case in the generic partition function [10],

$$Z_X(G) = q^{-c} \sum_k q^k\, \chi(\mathcal{M}_k)$$ [18]

$\chi(\mathcal{M}_k)$ is the Euler characteristic of a suitable
compactification of the kth instanton moduli space
$\mathcal{M}_k$ of gauge group G in X.

As in the previous example, it is possible to consider
nontrivial gauge configurations in G/Center(G) and
compute the partition function for a fixed value of the
't Hooft flux $v \in H^2(X, \pi_1(G))$. In this case, however,
the Seiberg-Witten approach is not available, but, as
conjectured by Vafa and Witten, one can nevertheless
carry out computations in terms of the vacuum degrees
of freedom of the N = 1 theory which results from
giving bare masses to all three chiral multiplets of
the N = 4 theory. It should be noted that a similar
approach was introduced by Witten (1994b) to obtain
the first explicit results for the Donaldson-Witten
theory just before the far more powerful Seiberg-Witten
approach was available.

As explained in detail by Vafa and Witten (1994),
the twisted massive theory is topological on Kähler
4-manifolds with $h^{(2,0)} \neq 0$, and the partition function
is actually invariant under the perturbation. In
the long-distance limit, the partition function is
given as a finite sum over the contributions of the
discrete massive vacua of the resulting N = 1 theory.
In the case at hand, it turns out that, for G = SU(N), the
number of such vacua is given by the sum of the
positive divisors of N. The contribution of each
vacuum is universal (because of the mass gap), and
can be fixed by comparing with known mathematical
results (Vafa and Witten 1994). However, this
is not the end of the story. In the twisted theory, the
chiral superfields of the N = 4 theory are no longer
scalars, so the mass terms cannot be invariant under
the holonomy group of the manifold unless one of
the mass parameters is a holomorphic 2-form $\omega$.
(Incidentally, this is the origin of the constraint
$h^{(2,0)} \neq 0$ mentioned above.) This spatially dependent
mass term vanishes where $\omega$ does, and we will
assume as in Vafa and Witten (1994) and Witten
(1994b) that $\omega$ vanishes with multiplicity 1 on
a union of disjoint, smooth complex curves
$C_j$, $j = 1, \ldots, n$, of genus $g_j$ which represent the
canonical divisor K of X. The vanishing of
$\omega$ introduces corrections involving K whose precise
form is not known a priori. In the G = SU(2)
case, each of the N = 1 vacua bifurcates along each
of the components $C_j$ of the canonical divisor
into two strongly coupled massive vacua. This
vacuum degeneracy is believed to stem from the
spontaneous breaking of a $Z_2$ chiral symmetry
which is unbroken in bulk (see, e.g., Vafa and 
Witten (1994) and Witten (1994b)). 
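
The vacuum counting quoted above (the number of massive vacua for G = SU(N) equals the sum of the positive divisors of N) is easy to tabulate. The snippet below is a small illustrative sketch; the function name `count_massive_vacua` is hypothetical and the code does nothing beyond evaluating the divisor sum stated in the text.

```python
def count_massive_vacua(N):
    """Sum of the positive divisors of N.

    According to the text, this is the number of discrete massive vacua of
    the mass-deformed theory with gauge group SU(N).
    """
    if N < 1:
        raise ValueError("N must be a positive integer")
    return sum(d for d in range(1, N + 1) if N % d == 0)

# Tabulate the count for a few small gauge groups.
vacua_table = {N: count_massive_vacua(N) for N in range(2, 8)}
```

For prime N the divisor sum is simply N + 1, which is one reason why the prime case, considered in what follows, is the most tractable.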

The structure of the corrections for G = SU(N)
(see [19] below) suggests that the mechanism at
work in this case is not chiral symmetry breaking.




Indeed, near any of the $C_j$'s, there is an N-fold
bifurcation of the vacuum. A plausible explanation
for this degeneracy could be found in the
spontaneous breaking of the center of the gauge
group (which for G = SU(N) is precisely $Z_N$). In
any case, the formula for SU(N) can be computed 
(at least when N is prime) along the lines 
explained by Vafa and Witten (1994) and assum- 
ing that the resulting partition function satisfies a 
set of nontrivial constraints which are described 
below. 

Then, for a given 't Hooft flux $v \in H^2(X, Z_N)$, the
partition function for gauge group SU(N) (with
prime N) is given by

where $\alpha = \exp(2\pi i/N)$, $G(q) = \eta(q)^{24}$ (with $\eta(q)$ the
Dedekind function), $\chi_\lambda$ are the SU(N) characters at
level 1 and $\chi_{m,\lambda}$ are certain linear combinations
thereof. $[C_j]_N$ is the reduction modulo N of the
Poincaré dual of $C_j$, and

where $\epsilon_j = 0, 1, \ldots, N - 1$ are chosen independently.
Equation [19] has the expected properties under
the modular group:

and also, with $Z_{SU(N)} = N^{b_1 - 1} Z_0$ and
$Z_{SU(N)/Z_N} = \sum_v Z_v$,

which is, up to some correction factors that vanish in
flat space, the original Montonen-Olive conjecture!
There is a further property to be checked, which
concerns the behavior of [19] under blow-ups. This
property was heavily used by Vafa and Witten
(1994), and demanding it in the present case was
essential in deriving the above formula. Blowing up
a point on a Kähler manifold X replaces it with a
new Kähler manifold $\hat X$ whose second cohomology
lattice is $H^2(\hat X, Z) = H^2(X, Z) \oplus I^-$, where $I^-$ is the
one-dimensional lattice spanned by the Poincaré
dual of the exceptional divisor B created by the
blow-up. Any allowed $Z_N$ flux $\hat v$ on $\hat X$ is of the form
$\hat v = v \oplus r$, where v is a flux in X and $r = \lambda B$,
$\lambda = 0, 1, \ldots, N - 1$. The main result concerning
[19] is that under blowing up a point on a Kähler
4-manifold with canonical divisor as above, the
partition functions for fixed 't Hooft fluxes have a
factorization as

$$Z_{\hat X, \hat v}(\tau) = Z_{X, v}(\tau)\, \frac{\chi_\lambda(\tau)}{\eta(\tau)}$$ [23]

Precisely the same behavior under blow-ups of the 
partition function [19] has been proved for the 
generating function of Euler characteristics of 
instanton moduli spaces on Kähler manifolds. This 
should not come as a surprise since, as mentioned 
above, on certain 4-manifolds, the partition function 
of Vafa-Witten theory computes the Euler charac- 
teristics of instanton moduli spaces. Therefore, [19] 
can be seen as a prediction for the Euler numbers of 
instanton moduli spaces on those 4-manifolds. 
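
To make the idea of a generating function of Euler characteristics concrete, here is a short sketch that expands the infinite product $\prod_{n \ge 1} (1 - q^n)^{-24}$, that is, $q/\eta(q)^{24}$, as a power series. The choice of example is an assumption made here for illustration and is not worked out in this article: by a standard result (Göttsche's formula), the coefficients of this product are the Euler characteristics of the Hilbert schemes of points on a K3 surface, which is the kind of series that appears in checks of the Vafa-Witten partition function. The function name `inverse_eta24_coefficients` is hypothetical.

```python
def inverse_eta24_coefficients(n_terms):
    """Coefficients a_k (k < n_terms) of prod_{n>=1} (1 - q^n)^(-24) = sum_k a_k q^k.

    The product is built factor by factor: multiplying a truncated series by
    1/(1 - q^n) amounts to the in-place update c[k] += c[k - n] for k >= n.
    """
    coeffs = [0] * n_terms
    if n_terms > 0:
        coeffs[0] = 1
    for n in range(1, n_terms):
        for _ in range(24):              # the factor 1/(1 - q^n) appears 24 times
            for k in range(n, n_terms):
                coeffs[k] += coeffs[k - n]
    return coeffs

first_terms = inverse_eta24_coefficients(5)
```

The leading coefficients are 1, 24, 324, 3200; the second of these is the Euler characteristic of K3 itself. A series of this type has exactly the shape of [18], with the power of q in front fixed by the leading behavior of the Dedekind function.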

Finally, the third twisted N = 4 theory also possesses
two scalar supercharges, and is believed to be a
certain deformation of the four-dimensional BF
theory, and as such it describes essentially intersection
theory on the moduli space of complexified gauge
connections. In addition to this, the theory is
“amphicheiral,” which means that it is invariant under a reversal
of the orientation of the spacetime manifold. The
terminology is borrowed from knot theory, where an
oriented knot is said to be amphicheiral if, crudely
speaking, it is equivalent to its mirror image. From this
property, it follows that the topological invariants of
the theory are completely independent of the complexified
coupling constant $\tau$.


See also: Donaldson-Witten Theory; Electric-Magnetic
Duality; Hopf Algebras and q-Deformation Quantum
Groups; Large-N and Topological Strings; Seiberg-Witten
Theory; Topological Quantum Field Theory:
Overview.

Introduction


The relations between thermodynamics and 
dynamics are dealt with by statistical mechanics. 
For a given dynamical system of Hamiltonian type 
in a classical framework, it is usually assumed that a 
dynamical foundation for equilibrium statistical 
mechanics, namely for the use of the familiar 
Gibbs ensembles, is guaranteed if one can prove 
that the system is ergodic, that is, has no integrals of 
motion apart from the Hamiltonian itself. One of 
the main consequences is then that classical 
mechanics fails to explain thermodynamics at
low temperatures (e.g., the specific heats of crystals 
or of polyatomic molecules at low temperatures, or 
the black body problem), because the classical 
equilibrium ensembles lead to equipartition of 
energy for a system of weakly coupled oscillators, 
against Nernst’s third principle. This is actually the 
problem that historically led to the birth of quantum 
mechanics, equipartition being replaced by Planck’s 
law. At a given temperature T, the mean energy of
an oscillator of angular frequency $\omega$ is not $k_B T$ ($k_B$
being the Boltzmann constant), and thus is not
independent of frequency (equipartition), but
decreases to zero exponentially fast as frequency 
increases. 
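
The contrast between equipartition and Planck's law can be made quantitative with a few lines of code. The sketch below uses the standard Planck expression for the mean energy of an oscillator, $\hbar\omega / (e^{\hbar\omega/k_B T} - 1)$ (zero-point energy omitted), in units where $\hbar = k_B = 1$; the choice of units and the function names are assumptions made here for illustration.

```python
import math

def equipartition_energy(T, k_B=1.0):
    """Classical mean energy of a harmonic oscillator: k_B * T at any frequency."""
    return k_B * T

def planck_mean_energy(omega, T, hbar=1.0, k_B=1.0):
    """Planck mean energy hbar*omega / (exp(hbar*omega / (k_B*T)) - 1).

    math.expm1 keeps the low-frequency limit numerically accurate.
    """
    x = hbar * omega / (k_B * T)
    return hbar * omega / math.expm1(x)

T = 1.0
low = planck_mean_energy(1e-6, T)    # close to the equipartition value k_B * T
high = planck_mean_energy(50.0, T)   # exponentially suppressed
```

At low frequency the Planck value approaches $k_B T$, while for $\hbar\omega \gg k_B T$ it is suppressed roughly as $\hbar\omega\, e^{-\hbar\omega/k_B T}$, which is the exponential decay referred to above.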

Thus, the problem of a dynamical foundation for 
classical statistical mechanics would be reduced to 
ascertaining whether the Hamiltonian systems of 
physical interest are ergodic or not. It is in this
spirit that many recent mathematical works have
aimed at proving ergodicity for systems of hard
spheres, or more generally for systems which are 
expected to be not only ergodic but even hyperbolic. 
However, a new perspective was opened in the year 
1955, with the celebrated paper of Fermi, Pasta, and 
Ulam (FPU), which constituted the last scientific 
work of Fermi. 

The FPU paper was concerned with numerical 
computations on a system of N (actually, 32 or 64) 
equal particles on a line, each interacting with the 
two adjacent ones through nonlinear springs, certain 
boundary conditions having been assigned (fixed 
ends). The model mimics a one-dimensional crystal 
(or also a string), and can be described in the 
familiar way as a perturbation of a system of N 
normal modes, which diagonalize the corresponding 
linearized system. The initial conditions corre- 
sponded to the excitation of only a few low- 
frequency modes, and it was expected that energy 
would rather quickly flow to the high-frequency 
modes, thus establishing equipartition of energy, in 
agreement with the predictions of classical equilibrium
statistical mechanics. But this did not occur
within the available computation times, and the
energy rather appeared to remain confined within a
packet of low-frequency modes having a certain
width, as if being in a state of apparent equilibrium
of a nonstandard type. This fact can be called “the
FPU paradox.” In the words of Ulam, written as a
comment in Fermi’s Collected Papers, this is 
described as follows: “The results of the computa- 
tions were interesting and quite surprising to Fermi. 
He expressed the opinion that they really constituted 
a little discovery in providing intimations that the 
prevalent beliefs in the universality of mixing and 
thermalization in nonlinear systems may not be 
always justified." 
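
The kind of numerical experiment described above can be reproduced in a few dozen lines. The following is a minimal sketch, not the original FPU code: it assumes the potential $V(r) = r^2/2 + \alpha r^3/3 + \beta r^4/4$ and the normal-mode frequencies $\omega_j = 2\sin[j\pi/2(N+1)]$ given later in this article, unit masses and fixed ends; the velocity-Verlet integrator, the time step, and the default value of $\alpha$ are choices made here for illustration, and all function names are hypothetical.

```python
import numpy as np

def fpu_force(x, alpha, beta):
    """Force on each particle for V(r) = r^2/2 + alpha*r^3/3 + beta*r^4/4, fixed ends."""
    xp = np.concatenate(([0.0], x, [0.0]))     # x_0 = x_{N+1} = 0
    r = np.diff(xp)                            # r_i = x_i - x_{i-1}, i = 1..N+1
    dV = r + alpha * r**2 + beta * r**3        # V'(r_i)
    return dV[1:] - dV[:-1]                    # -dH/dx_i = V'(r_{i+1}) - V'(r_i)

def mode_energies(x, p):
    """Harmonic energies E_j = (P_j^2 + omega_j^2 Q_j^2) / 2 of the normal modes."""
    N = len(x)
    j = np.arange(1, N + 1)
    # Orthogonal, symmetric sine transform diagonalizing the linearized chain.
    S = np.sqrt(2.0 / (N + 1)) * np.sin(np.pi * np.outer(j, j) / (N + 1))
    omega = 2.0 * np.sin(j * np.pi / (2 * (N + 1)))
    Q, P = S @ x, S @ p
    return 0.5 * (P**2 + (omega * Q)**2)

def simulate(N=32, alpha=0.25, beta=0.0, energy=0.05, dt=0.05, steps=2000):
    """Put all the energy in the lowest mode, integrate, return final mode energies."""
    i = np.arange(1, N + 1)
    omega1 = 2.0 * np.sin(np.pi / (2 * (N + 1)))
    amplitude = np.sqrt(2.0 * energy) / omega1         # so that E_1 = energy at t = 0
    x = amplitude * np.sqrt(2.0 / (N + 1)) * np.sin(np.pi * i / (N + 1))
    p = np.zeros(N)
    f = fpu_force(x, alpha, beta)
    for _ in range(steps):                             # velocity-Verlet (symplectic)
        p = p + 0.5 * dt * f
        x = x + dt * p
        f = fpu_force(x, alpha, beta)
        p = p + 0.5 * dt * f
    return mode_energies(x, p)

linear_run = simulate(alpha=0.0, beta=0.0)     # no coupling between modes
nonlinear_run = simulate(alpha=0.25, beta=0.0)
```

With the nonlinearity switched off the modes decouple and the energy stays in mode 1, which is a useful sanity check of the integrator. Observing the FPU paradox itself requires much longer runs (larger `steps`) and recording time averages of the mode energies along the way rather than only the final values.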

The FPU paper immediately had a very strong 
impact on the theory of dynamical systems, because 
it motivated all the modern theory of infinite- 
dimensional integrable systems and solitons (KdV 
equation), starting from the works of Zabusky and 
Kruskal (1965). But in this way the FPU paradox 
was somehow enhanced, because the FPU system 
turned out to be associated to the class of 
integrable systems, namely the systems having a 
number of integrals of motion equal to the number 
of degrees of freedom, which are in a sense the 
most antithermodynamic systems. The merit of 
establishing a bridge towards ergodicity goes to 
Izrailev and Chirikov (1966). Making reference to 
the most advanced results then available in the 
perturbation theory for nearly integrable systems 
(KAM theory), these authors pointed out that 
ergodicity, and thus equipartition, would be recov- 
ered if one took initial data with a sufficiently large 
energy. And this was actually found to be the case. 
Moreover, it turned out that their work, and its 
subsequent completion by Shepelyanski, was often 
interpreted as supporting the conjecture that the 
FPU paradox would disappear in the thermody- 
namic limit (infinitely many particles, with finite 
density and energy density). The opposite conjec- 
ture was advanced in the year 1970 by Bocchieri, 
Scotti, Bearzi, and Loinger, and its relevance for the 
relations between classical and quantum mechanics 
was immediately pointed out by Cercignani,
Galgani, and Scotti. A long debate then followed. 
Possibly, some misunderstandings occurred, because 
in the discussions concerning the dynamical aspects 
of the problem reference was generally made to 
notions involving infinite times. In fact, it had not 
yet been conceived that the FPU equilibrium might 
actually be an apparent one, corresponding to some 
type of intermediate metaequilibrium state. This 
was for the first time suggested by researchers in 
Parisi’s group in the year 1982. The analogy of 
such a situation with that occurring in glasses was 
pointed out more recently. 


In the present article, the state of the art of the 
FPU problem is discussed. The thesis of the present 
authors is that the FPU phenomenon survives in the 
thermodynamic limit, in the last mentioned sense, 
namely that at sufficiently low temperatures there 
exists a kind of metaequilibrium state surviving for 
extremely long times. The corresponding thermo- 
dynamics turns out to be different from the standard 
one predicted by the equilibrium ensembles, inas- 
much as it presents qualitatively some quantum-like 
features (typically, specific heats in agreement with 
Nernst’s third principle). The key point, with respect 
to equilibrium statistical mechanics, is that the 
internal thermodynamic energy should be identified 
not with the whole mechanical energy, but only with 
a suitable fraction of it, to be identified through its 
dynamical properties, as was suggested more than a 
century ago by Boltzmann himself, and later by 
Nernst. 

Here, it is first discussed why nearly integrable 
systems can be expected to present the FPU phenom- 
enon. Then the latter is illustrated. Finally, some hints 
are given for the corresponding thermodynamics. 


Nearly Integrable versus Hyperbolic 
Systems, and the Question of the Rates 
of Thermalization 


As mentioned above, it is usually assumed that the 
problem of providing a dynamical foundation to 
classical statistical mechanics is reduced to the 
mathematical problem of ascertaining whether the 
Hamiltonian systems of physical interest are ergodic 
or not. However, there remains open a subtler 
problem. Indeed, the notion of ergodicity involves 
the limit of an infinite time (time averages should 
converge to ensemble averages as $t \to \infty$), while
intermediate times might be relevant. In this 
connection it is convenient to distinguish between 
two classes of dynamical systems, namely the 
hyperbolic and the nearly integrable ones. 

The first class, in a sense the prototype of chaotic 
systems, should include the systems of hard spheres 
(extensively studied after the classical works of 
Sinai), or more generally the systems of mass points 
with mutual repulsive interactions. For such systems 
it can be expected that the time averages of the 
relevant dynamical quantities in an extremely short 
time converge to the corresponding ensemble 
averages, so that the classical equilibrium ensembles 
could be safely used. 

A completely different situation occurs for the
dynamical systems such as the FPU systems, which
are nearly integrable, that is, are perturbations of
systems having a number of integrals of motion
equal to the number of degrees of freedom. Indeed, 
in such a case ergodicity means that the addition of 
an interaction, no matter how small, makes an 
integrable system lose all of its integrals of motion, 
apart from the Hamiltonian itself. And, in fact, this 
quite remarkable property was already proved to be 
generic by Poincaré, through a set of considerations 
which had a fundamental impact on the theory of 
dynamical systems itself. In view of its importance 
for the foundations of statistical mechanics, the 
proof given by Poincaré was reconsidered by Fermi, 
who added a subtle contribution concerning the role 
of single invariant surfaces. It is just to such a paper 
that Ulam makes reference in his comment to the 
FPU work mentioned above, when he says: “Fermi’s 
earlier interest in the ergodic theory is one motive” 
for the FPU work. 

The point is that the picture which looks at the 
ergodicity induced on an integrable system by the 
addition of a perturbation, no matter how small, 
somehow lacks continuity. One might expect that, 
in situations in which the nonlinear interaction 
which destroys the integrals of motion is very small 
(i.e., at low temperatures), the underlying integrable 
structure should somehow be still appreciable, in 
some continuous way. In fact, continuity should be 
recovered by making a question of times, namely by 
considering the rates of thermalization (to use the 
very FPU phrase), or equivalently the relaxation 
times, namely the times needed for the time averages 
of the relevant dynamical quantities to converge to 
the corresponding ensemble averages. By continuity, 
one clearly expects that the relaxation times diverge 
as the perturbation tends to zero. But more 
complicated situations might occur, as, for example, 
the existence of two (or more) relevant timescales. 
The point of view that timescales of different orders 
of magnitude might occur in dynamical systems 
(with the exhibition of an interesting example) and 
that this might be relevant for statistical mechanics, 
was discussed by Poincaré himself in the year 1906. 
Indeed, he denotes as “first-order very large time” a
time which is sufficient for a system to reach a
“provisional equilibrium,” whereas he denotes as
“second-order very large time” a time which is
necessary for the system to reach its “definitive
equilibrium.”


The FPU Phenomenon: Historical and 
Conceptual Developments 


We now illustrate the FPU phenomenon, following 
essentially its historical development. We will make
reference to Figures 1-8, which are the results of
numerical integrations of the FPU dynamical system.
If $x_1, \ldots, x_N$ denote the positions of the particles (of
unit mass), or more precisely the displacements
from their equilibrium positions, and $p_i$ the corresponding
momenta, the Hamiltonian is


$H = \sum_{i=1}^{N} \frac{p_i^2}{2} + \sum_{i=1}^{N+1} V(r_i)$


where $r_i = x_i - x_{i-1}$ and one has taken a potential
$V(r) = r^2/2 + \alpha r^3/3 + \beta r^4/4$ depending on two
positive parameters $\alpha$ and $\beta$. Boundary conditions
with fixed ends, namely $x_0 = x_{N+1} = 0$, are considered.
We recall that the angular frequencies
of the corresponding normal modes are
$\omega_j = 2\sin\left[\frac{j\pi}{2(N+1)}\right]$, with $j = 1, \ldots, N$; it is thus convenient
to take as time unit the value $\pi$, which is
essentially, for any N, the period of the fastest
normal mode.

Figure 1 The FPU paradox: normal-mode energies $E_j$ versus
time (left) and energy spectrum, namely time average of $E_j$
versus $j$ (right) for three different timescales. The energy, initially
given to the lowest-frequency mode, does not flow to the high-frequency
modes within the accessible observation time. Here,
N = 32 and E = 0.05.

Figure 2 The FPU paradox: time averages of the energies of
the modes 1, 2, ..., 8 (from top to bottom) versus time for the
same run as Figure 1. The spectrum has reached an apparent
equilibrium, different from that of equipartition predicted by
classical equilibrium statistical mechanics. An exponential
decay of the tail is clearly exhibited.
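
The model just defined is simple enough to be integrated in a few lines. The following Python sketch is only an illustration, not the code behind Figures 1-8: the velocity-Verlet (leapfrog) integrator, the time step and all function names are assumptions made here, while the values N = 32, $\alpha = \beta = 1/4$ and E = 0.05, with the energy given initially to the first mode in purely kinetic form, are those quoted for Figure 1.

```python
import math


def mode_frequencies(n):
    """Angular frequencies omega_j = 2 sin(j*pi / (2(N+1))), j = 1..N."""
    return [2.0 * math.sin(j * math.pi / (2 * (n + 1))) for j in range(1, n + 1)]


def forces(x, alpha, beta):
    """Forces for H = sum p_i^2/2 + sum V(x_i - x_{i-1}); x[0], x[-1] are the fixed ends."""
    dv = [0.0] * len(x)  # dv[i] = V'(r_i), with V'(r) = r + alpha r^2 + beta r^3
    for i in range(1, len(x)):
        r = x[i] - x[i - 1]
        dv[i] = r + alpha * r * r + beta * r ** 3
    f = [0.0] * len(x)   # the two end particles feel no force and stay at rest
    for i in range(1, len(x) - 1):
        f[i] = dv[i + 1] - dv[i]
    return f


def total_energy(x, p, alpha, beta):
    """Value of the full Hamiltonian H."""
    kinetic = sum(pi * pi for pi in p) / 2.0
    potential = 0.0
    for i in range(1, len(x)):
        r = x[i] - x[i - 1]
        potential += r * r / 2.0 + alpha * r ** 3 / 3.0 + beta * r ** 4 / 4.0
    return kinetic + potential


def mode_energies(x, p):
    """Harmonic energies E_j = (P_j^2 + omega_j^2 Q_j^2) / 2 of the normal modes."""
    n = len(x) - 2
    norm = math.sqrt(2.0 / (n + 1))
    energies = []
    for j, w in enumerate(mode_frequencies(n), start=1):
        s = [math.sin(i * j * math.pi / (n + 1)) for i in range(1, n + 1)]
        q = norm * sum(xi * si for xi, si in zip(x[1:-1], s))
        pj = norm * sum(pi * si for pi, si in zip(p[1:-1], s))
        energies.append(0.5 * (pj * pj + w * w * q * q))
    return energies


def first_mode_initial_data(n, energy):
    """All the energy in mode 1, in kinetic form (vanishing potential energy)."""
    amp = math.sqrt(2.0 * energy) * math.sqrt(2.0 / (n + 1))
    x = [0.0] * (n + 2)
    p = [0.0] + [amp * math.sin(i * math.pi / (n + 1)) for i in range(1, n + 1)] + [0.0]
    return x, p


def leapfrog(x, p, dt, steps, alpha, beta):
    """Velocity-Verlet integration of the equations of motion."""
    f = forces(x, alpha, beta)
    for _ in range(steps):
        p = [pi + 0.5 * dt * fi for pi, fi in zip(p, f)]
        x = [xi + dt * pi for xi, pi in zip(x, p)]
        f = forces(x, alpha, beta)
        p = [pi + 0.5 * dt * fi for pi, fi in zip(p, f)]
    return x, p


if __name__ == "__main__":
    n, alpha, beta, energy = 32, 0.25, 0.25, 0.05
    x, p = first_mode_initial_data(n, energy)
    x, p = leapfrog(x, p, dt=0.05, steps=2000, alpha=alpha, beta=beta)
    print(mode_energies(x, p)[:8])  # energies of the eight lowest modes
```

Averaging the output of `mode_energies` along a long run would give spectra of the kind discussed below; the short run in the example is only meant to show the calling sequence.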

The original FPU result is illustrated in Figures 1
and 2. Here N = 32, $\alpha = \beta = 1/4$, and the total
energy is E = 0.05; the energy was given initially to
the first normal mode (with vanishing potential
energy). Three timescales (increasing from top to
bottom) are considered, the top one corresponding
to the timescale of the original FPU paper. In the
boxes on the left the energies $E_j(t)$ of modes $j$ are
reported versus time ($j = 1, \ldots, 8$ at top, $j = 1$ at
center and bottom). In the boxes on the right we
report the corresponding spectra, namely the time
average (up to the respective final times) of the
energy of mode $j$ versus $j$, for $1 \le j \le N$. In Figure 2
we report, for the same run of Figure 1, the time
averages of the energies of the various modes versus
time; this figure corresponds to the last one of the
original FPU work. The facts to be noticed in
connection with these two figures are the following:
(1) the spectrum (namely the distribution of energy 
among the modes, in time average) appears to have 
relaxed very quickly to some form, which remains 
essentially unchanged up to the maximum observed 
time; (2) there is no global equipartition, but only a 
partial one, because the energy remains confined 
within a group of low-frequency modes, which form 
a small packet of a certain definite width; and (3) 
the time evolutions of the mode energies appear to 
be of quasiperiodic type, since longer and longer 
quasiperiods can be observed as the total time 
increases.

Figure 3 The Izrailev-Chirikov contribution: for a fixed observation
time, equipartition is attained if the initial energy E is high
enough. Here, from top to bottom, E = 0.1, 1, 10.

Figure 4 The Izrailev-Chirikov contribution: time averages of
the mode energies versus time for the same run as at bottom of
Figure 3.


After the works of Zabusky and Kruskal, by 
which the FPU system was somehow assimilated to 
an integrable system, the bridge toward ergodicity 
was made by Izrailev and Chirikov (1966), through 
the idea that there should exist a stochasticity 
threshold. Making reference to KAM theory, which 
had just been formulated in the framework of 
perturbation theory for nearly integrable systems, 
their main remark was as follows. It is known that 
KAM theory, which essentially guarantees a beha- 
vior similar to that of an integrable system, applies 
only if the perturbation is smaller than a certain 
threshold; on the other hand, in the FPU model the 
natural perturbation parameter is the energy E of 
the system. Thus, the FPU phenomenon can be 
expected to disappear above a certain threshold 
energy $E_c$. This is indeed the case, as illustrated in
Figures 3 and 4. The parameters $\alpha$, $\beta$ and the class of
initial data are as in Figure 1. In Figure 3 the total
time is kept fixed (at 10 000 units), whereas the
energy E is increased in passing from top to bottom,
actually from E = 0.1 to E = 1 and E = 10. One sees
that at E = 10 equipartition is attained within the
given observation time; correspondingly, the motion 
of the modes visually appears to be nonregular. The 
approach to equipartition at E=10 is clearly 
exhibited in Figure 4, where the time averages of 
the energies are reported versus time. 

There naturally arose the problem of the dependence
of the threshold $E_c$ on the number N of degrees
of freedom (and also on the class of initial data).
Certain semianalytical considerations of Izrailev and
Chirikov were generally interpreted as suggesting
that the threshold should vanish in the thermodynamic
limit for initial excitations of high-frequency
modes. Recently, Shepelyansky completed the analysis
by showing that the threshold should vanish also
for initial excitations of the low-frequency modes, as
in the original FPU work (see, however, the
subsequent paper by Ponno mentioned below). If
this were true, the FPU phenomenon would dis- 
appear in the thermodynamic limit. In particular, 
the equipartition principle would be dynamically 
justified at all temperatures. 

The opposite conjecture was advanced by 
Bocchieri et al. (1970). This was based on numerical 
calculations, which indicated that the energy threshold
should be proportional to N, namely that the FPU
phenomenon persists in the thermodynamic limit
provided the specific energy $\varepsilon = E/N$ is below a
critical value $\varepsilon_c$, which should be definitely nonvanishing.
Actually, the computations were performed
on a slightly different model, in which nearby
particles were interacting through a more physical
Lennard-Jones potential. By taking concrete values
having a physical significance, namely the values
commonly assumed for argon, for the threshold of
the specific energy they found the value $\varepsilon_c \approx 0.04\,V_0$,
where $V_0$ is the depth of the Lennard-Jones potential
well. This corresponds to a critical temperature of the
order of a few kelvin. The relevance of such a 
conjecture (persistence of the FPU phenomenon in the 
thermodynamic limit) was soon strongly emphasized 
by Cercignani, Galgani, and Scotti, who also tried to 
establish a connection between the FPU spectrum and 
Planck's distribution. 

Up to this point, the discussion was concerned 
with the alternative whether the FPU system is 
ergodic or not, and thus reference was made to
properties holding in the limit $t \to \infty$. Correspondingly,
one was making reference to KAM theory,
namely to the possible existence of surfaces (N- 
dimensional tori) which should be dynamically 
invariant (for all times). The first paper in which 
attention was drawn to the problem of estimating 
the relaxation times to equilibrium was by Fucito 
et al. (1982). The model considered was actually a 
different one (the so-called $\phi^4$ model), but the results
can also be extended to the FPU model. Analytical 
and numerical indications were given for the 
existence of two timescales. In a short time the 
system was found to relax to a state characterized by 
an FPU-like spectrum, with a plateau at the low 
frequencies, followed by an exponential tail. This, 
however, appeared as being a sort of metastable 
state. In their words: “The nonequilibrium spectrum
may persist for extremely long times, and may be
mistaken for a stationary state if the observation
time is not sufficiently long.” Indeed, on a second
much larger timescale the slope of the exponential 
tail was found to increase logarithmically with time, 
with a rate which decreases to zero with the energy. 
This is an indication that the time for equipartition 
should increase as an exponential with the inverse of 
the energy. 

This is indeed the picture that the present authors 
consider to be essentially correct, being supported 
by very recent numerical computations, and by 
analytical considerations. Curiously enough, how- 
ever, such a picture was not fully appreciated until 
quite recently. Possibly, the reason is that the 
scientific community had to wait until becoming 
acquainted with two relevant aspects of the theory 
of dynamical systems, namely Nekhoroshev theory 
and the relations between KdV equation and 
resonant normal-form theory. 

The first step was the passage from KAM theory 
to Nekhoroshev theory. Let us recall that, whereas 
in KAM theory one looks for surfaces which are 
invariant (for all times), in Nekhoroshev theory one
looks instead for a kind of weak stability involving
finite times, albeit “extremely long” ones, as they 
are found to increase as stretched exponentials with 
the inverse of the perturbative parameter. Thus, one 
meets with situations in which one can have 
instability over infinite times, while having a kind 
of practical stability up to exponentially long times. 
Notice that Nekhoroshev’s theory was formulated 
only in the year 1974, and that it started to be 
known in the West only in the early 1980s, just 
because of its interest for the FPU problem. Another 
interesting point is that just in those years one 
started to become acquainted with a related histor- 
ical fact. Indeed, the idea that equipartition might 
require extremely long times, so that one would be 
confronted with situations of a practical lack of 
equipartition, has in fact a long tradition in 
statistical mechanics, going back to Boltzmann and 
Jeans, and later (in connection with sound disper- 
sion in gases of polyatomic molecules) to Landau 
and Teller. 

In this way the idea of the existence of extremely 
long relaxation times to equipartition came to be 
accepted. The ingredient that was still lacking is the 
idea of a quick relaxation to a metastable state. The 
importance of this should not be overlooked. 
Indeed, without it one cannot at all have a 
thermodynamics different from the standard equili- 
brium one corresponding to equipartition. This was 
repeatedly emphasized, against Jeans, by Poincaré 
on general grounds and by Nernst on empirical 
grounds. The full appreciation of this latter ingre- 
dient was obtained quite recently (although it had 
been clearly stated by Fucito et al. (1982)). A first 
hint in this direction came from the realization 
(see Figure 5) of a deep analogy between the FPU 
phenomenon and the phenomenology of glasses. 
Then there came a strong numerical indication by 
Berchialla, Galgani, and Giorgilli. Finally, from the 
analytical point of view, there was a suitable 
revisitation (by Ponno) of the traditional connection 
between the FPU system and the KdV equation 
with its solitons. The relevant points are the 
following: (1) the KdV equation describes well the 
solutions of the FPU problem (for initial data of FPU 
type) only on a “short” timescale, which increases as
a power of $1/\varepsilon$, and so describes only a first process
of quick relaxation; (2) the corresponding spectrum
has a very definite analytical form, the energy being
spread up to a maximal frequency $\bar\omega(\varepsilon) \sim \varepsilon^{1/4}$ and
then decaying exponentially; and (3) the relevant
formulas contain the energy only through the
specific energy $\varepsilon$, and thus can be expected to hold
also in the thermodynamic limit. It should be
mentioned, however, that all the results of an
analytic type mentioned above have a purely formal
character, because up to now none of them was
proved, in the thermodynamic limit, in the sense of
rigorous perturbation theory. This requires a suitable
readaptation of the known techniques, which is
currently being obtained both in connection with
Nekhoroshev’s theorem (in order to explain the
extreme slowness of a possible final approach to
equilibrium) and in connection with the normal-form
theory for partial differential equations (in
order to explain the fast relaxation to the metaequilibrium
state).

Figure 5 Analogy with glasses: the specific energy $u$ of an
FPU system is plotted versus temperature $T$ for a cooling
process (upper curve) and a heating process (lower curve).
The FPU system is kept in contact with a heat reservoir, whose
temperature is changed at a given rate. At low temperatures the
system does not have time to reach the equilibrium curve $u = T$
(with $k_B = 1$).

In conclusion, for the case of initial conditions of 
the FPU type (excitation of a few low-frequency 
modes) the situation seems to be as follows. The first 
phenomenon that occurs in a “short” time (of the
order of $(1/\varepsilon)^{3/4}$) is a quick relaxation to the
formation of what can be called a “natural packet”
of low-frequency modes extending up to a certain
maximal frequency $\bar\omega \sim \varepsilon^{1/4}$. This is a phenomenon
which has nothing to do with any diffusion in phase
space. In fact, it shows up also for an integrable
system such as a Toda lattice (as will be illustrated
below), and should be described by a suitable
resonant normal form related to the KdV equation.
One has then to take into account the fact that the
domain of the frequencies in the FPU model is
bounded ($\omega < 2$ in the chosen units). Now, as the
function $\bar\omega(\varepsilon)$ is monotonic, this fact leads to the
existence of a critical value $\varepsilon_c$ of the specific energy
$\varepsilon$, defined by $\bar\omega(\varepsilon_c) = 2$. Indeed, for $\varepsilon > \varepsilon_c$ the quick
relaxation process leads altogether to equipartition.
Below the threshold, instead, the same quick process
leads to the formation of an FPU-like spectrum,
involving only modes of sufficiently low frequency.
This should, however, be a metastable state (which
might be mistaken for a stationary one), which
should be followed, on a second timescale, by a
relaxation to the final equilibrium, through a sort
of Arnol'd diffusion requiring extremely long
Nekhoroshev-like times. This is actually the way in
which the old idea of a threshold, originally
conceived in terms of KAM tori, is now recovered
even for ergodic systems, in terms of timescales.

Figure 6 Time needed to form a packet versus specific energy
for the FPU model (bottom) and the corresponding Toda model
(top). Different symbols refer to packets of different width. The
existence of two timescales below a critical specific energy in the
FPU model is exhibited.

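
The way the threshold arises from the bounded frequency domain can be made concrete with a small Python sketch. It assumes a packet cutoff of the pure power-law form $\bar\omega(\varepsilon) = c\,\varepsilon^{1/4}$; the prefactor `c` is a hypothetical placeholder, since only the exponent 1/4 is stated in the text, and the function names are invented for the illustration.

```python
import math


def packet_cutoff(eps, c=1.0):
    """Maximal frequency of the natural packet, assuming omega_bar = c * eps**(1/4).

    The prefactor c is a hypothetical placeholder, not a value from the text.
    """
    return c * eps ** 0.25


def critical_specific_energy(c=1.0, omega_max=2.0):
    """Solve omega_bar(eps_c) = omega_max for eps_c under the same assumption."""
    return (omega_max / c) ** 4


def modes_in_packet(eps, n, c=1.0):
    """Number of normal modes with omega_j = 2 sin(j*pi/(2(n+1))) below the cutoff."""
    cutoff = packet_cutoff(eps, c)
    if cutoff >= 2.0:
        return n  # the packet fills the whole spectrum: equipartition
    return int(math.floor(2 * (n + 1) / math.pi * math.asin(cutoff / 2.0)))


if __name__ == "__main__":
    for eps in (1e-4, 1e-2, 1.0, 20.0):
        print(eps, modes_in_packet(eps, n=1023))
```

Above the critical value the count saturates at the total number of modes, which is the sense in which the quick relaxation alone already leads to equipartition.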
The existence of a process of quick relaxation,
and of a threshold in the above-mentioned sense, is
illustrated in Figures 6 and 7. In Figure 6 the lower
part refers to the FPU model, while the upper one
refers to a corresponding Toda model. The latter is
in a sense the prototype of an integrable nonlinear
system; with respect to the FPU case, the difference
is that the potential $V(r)$ is now exponential. The
parameters of the exponential were chosen so that
the two models coincide up to cubic terms in the
potential. With the energy given to the lowest-frequency
mode, the figure shows the time needed in
order that energy spreads up to a mode $k$, for several
values of $k$, as a function of $\varepsilon$. It is seen that in the
Toda model (top) a packet of rather well-defined width
is formed, and that this occurs
within a relaxation time increasing as a power of
$1/\varepsilon$. An analogous phenomenon occurs for the FPU
model (bottom). The only difference is that, below a
critical specific energy $\varepsilon_c \approx 0.1$, there exists a
subsequent relaxation time to equipartition, which
involves a time growing faster than any inverse
power of $\varepsilon$. Such a second phenomenon is due to the
nonintegrable character of the FPU model. In
Figure 7 the width of the natural packet for the
FPU model is exhibited, by reporting the frequency
$\bar\omega$ of its highest mode as a function of $\varepsilon$. As one sees,
the numerical results clearly indicate the existence of
a relation $\bar\omega \sim \varepsilon^{1/4}$, which holds for a number of
degrees of freedom N ranging from 8 to 1023. This
is actually the law which is predicted by resonant
normal-form theory.

Figure 7 Width of the natural packet versus specific energy,
for N ranging from 8 to 1023. Reproduced from Berchialla L,
Galgani L, and Giorgilli A (2004) Localization of energy in FPU
chains, Discr. Cont. Dyn. Systems B 11: 855-866, with
permission from American Institute of Mathematical Sciences.


Boltzmann and Nernst Revisited 


All the results illustrated above refer to initial data 
of FPU type, namely with an excitation of a few 
low-frequency modes. However, from the point of 
view of statistical mechanics, such initial data are 
exceptional, and one should rather consider initial 
data extracted from the Gibbs distribution at a 
certain temperature. One can then couple the FPU 
system to a heat bath at a slightly different 
temperature, and look at the spectrum of the FPU 
system after a certain time. The result, for the case
of a heat bath at a higher temperature, is shown in
Figure 8. Clearly, here one has a situation similar to that
occurring for initial data of FPU type, because only a
packet of low-frequency modes exhibits a reaction, each
of its modes actually adapting itself to the temperature
of the bath, whereas the high-frequency modes do not
react at all, that is, remain essentially frozen.

Figure 8 A case of an FPU system initially at equilibrium and
thus in equipartition. Spectrum of the FPU system after it was
kept in contact with a heat reservoir at a higher temperature.

This capability of reacting to external disturbances 
(which seems to pertain only to a fraction of the 
mechanical energy initially inserted into the system) 
can be characterized in a quantitative way through an 
estimate of the fluctuations of the total energy of the 
FPU system. This is indeed the sense of the fluctuation- 
dissipation theorem, the precursor of which is perhaps 
the contribution of Einstein to the first Solvay 
conference (1911). Through such a method, the 
specific heat of the FPU system is estimated (apart 
from a numerical factor) by the time average of $[E(t) - E(0)]^2$,
where $E(t)$ is the energy, at time $t$, of the FPU
system in dynamical contact with a heat bath (at the 
same temperature from which the initial data are 
extracted). Usually, in the spirit of ergodic theory, one 
looks at the infinite-time limit of such a quantity. But 
in the spirit of the metastable picture described above, 
one can check whether the time average presents a 
previous stabilization to some value smaller than the 
one predicted at equilibrium. Such a result, which is in 
qualitative agreement with the third principle, has
indeed been obtained (by Carati and Galgani) recently.
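
The quantity monitored in such a check can be sketched in a few lines of Python. The function below is only an illustration: it assumes that the energy $E(t)$ has already been recorded at equally spaced times (the input list is hypothetical, and no heat bath is simulated here), and it returns the running time average of $[E(t) - E(0)]^2$, whose possible early stabilization is what one would look for.

```python
def running_mean_square_deviation(energies):
    """Running time average of [E(t) - E(0)]^2 for equally spaced samples E(t_0), E(t_1), ...

    The k-th entry of the result is the average over the first k samples.
    """
    if not energies:
        return []
    e0 = energies[0]
    averages = []
    accumulated = 0.0
    for count, e in enumerate(energies, start=1):
        accumulated += (e - e0) ** 2
        averages.append(accumulated / count)
    return averages


if __name__ == "__main__":
    # Purely illustrative numbers, not data from any simulation.
    print(running_mean_square_deviation([1.0, 2.0, 3.0]))
```

A plateau of this running average at a value below the equilibrium prediction, reached well before the final relaxation, would be the signature of the metastable picture described above.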

In conclusion, in situations of metaequilibrium such 
as those existing in the FPU model at low tempera- 
tures, a thermodynamics can still be formulated. 
Indeed, by virtue of the quick relaxation process 
described above, the time averages of the relevant 
quantities are found to stabilize in rather short times. 
In this way, one overcomes the critique of Poincaré to
Jeans, namely that one cannot have a thermodynamics
at all if reference is made only to the existence of
extremely long relaxation times to the final equilibrium.
A relaxation to a “provisional equilibrium”
within a “first-order very large time” (to quote
Poincaré) is required. The difference with respect to
the standard equilibrium thermodynamics lies now
in the mechanical interpretation of the first principle.
Indeed, the internal thermodynamic energy is identi- 
fied not with the whole mechanical energy, but just 
with that fraction of it which is capable of reacting in 
short times to the external perturbations. 

This is the way in which the old idea of Boltzmann 
(and Jeans) might perhaps be presently implemented. 
As for the fraction of the mechanical energy
which is not included in the thermodynamic internal
energy, as not being able to react in relatively short
times, this should somehow play the role of a zero-point
energy. This was suggested in the year 1971 by C.
Cercignani. But in fact, such a concept was put forward
by Nernst himself in an extremely speculative work in 
1916, where he also advanced the concept that, for a 
system of oscillators of a given frequency, there should 
exist both dynamically ordered (geordnete) and dyna- 
mically chaotic (ungeordnete) motions, the latter being 
prevalent above a certain energy threshold. According to 
him, this fact should be relevant for a dynamical 
understanding of the third principle and of Planck's law. 

It is well known that the modern theory of 
dynamical systems has led to familiarity with the 
(sometimes abused) notions of order and chaos and of 
a transition between them. One might say that the FPU 
work just forced the scientific community to take into 
account such notions in connection with the principle 
of equipartition of energy. It is really fascinating to see 
that the same notions, with the same terminology, had 
already been introduced much earlier on purely 
thermodynamic grounds, in connection with the 
relations between classical and quantum mechanics.


Introduction


The purpose of this article is to describe some basic 
problems related to the interplay between dynamical 
systems and mathematical physics. Since it is 
impossible to be exhaustive in these topics, the 
focus here is on water-wave models. These mathe- 
matical models are described by partial differential 
equations that can be understood as dynamical 
systems in a suitable infinite-dimensional phase 
space. 

We will not address the original equations for 
two-dimensional (2D) surface water waves, even if 
we know that dynamical-system methods can help 
to exhibit some solitary waves for the equations. 
The reader is referred to relevant articles in this 
encyclopedia for details. Another approach is to 
seek these 2D surface water waves as saddle points 
for some Hamiltonians, which too is discussed 
elsewhere in this work. 

This article presents these arguments on some 
asymptotical models for the propagation of surface 
water waves. 


Asymptotical Models in Hydrodynamics 


To begin with, consider an irrotational fluid in a
canal that is governed by the Euler equations and
that is subject to gravitational forces. For a canal of
finite depth, Boussinesq (1877) and Korteweg-de
Vries (KdV) (1895) obtained the following model
for unidirectional long waves:


$u_t + u_x + u_{xxx} + u u_x = 0$ [1]


Sometimes we drop the $u_x$ term on the left-hand side
of [1], thanks to a suitable change of coordinates. 
Alternatively, we can also deal with the so-called 
generalized KdV equation, which reads 


$u_t + u_{xxx} + u^k u_x = 0$ [2]


where k is a positive integer. There are also other 
models designed to represent long waves in shallow 
water. Let us introduce the regularized long-wave 
equation (also referred to as the Benjamin-Bona- 
Mahony equation) that reads 


$u_t - u_{txx} + u_x + u u_x = 0$ [3]

or the Camassa-Holm equation

$u_t - u_{txx} + 3 u u_x = 2 u_x u_{xx} + u u_{xxx}$ [4]


For deep water, a well-known model was intro- 
duced by Zakharov (1968) 


$i u_t + u_{xx} + \varepsilon |u|^2 u = 0$ [5]


which describes the slow modulations of wave 
packets. Here the unknown u(x, t) takes values in 
$\mathbb{C}$, and this nonlinear Schrödinger equation is in
fact a system. In these equations, $\varepsilon$ is either $1$ or
$-1$; throughout this article, we shall refer to the
former case as the focusing case and to the latter
as the defocusing case. We may also substitute
$|u|^{2p}u$ in the nonlinear term in [5] to obtain
alternate models.

The variable t represents the time and the space 
variable x belongs either to R or to a finite interval 
when we are dealing with periodic flows. 

The above models are intended to describe the 
propagation of unidirectional waves. For two-way 
waves, see Bona et al. (2002). 

Actually, these equations feature particular solu- 
tions, the so-called traveling waves. Let us recall, for 
instance, that for generalized KdV equation [2] these 
solutions are 


$u(t,x) = Q_c(x - ct)$ [6]
$Q_c(x) = c^{1/k}\,Q(\sqrt{c}\,x)$ [7]
$Q(x) = \left(3\,\mathrm{ch}^{-2}(px)\right)^{1/k}$ [8]


These so-called solitons (Figure 1) move to the right 
without changing their shape; c is the speed of 
propagation. In real life, this phenomenon was 
observed by Russell (1834). Riding his horse, he
was able to follow for miles the propagation of such 
a wave on the canal from Edinburgh to Glasgow. 
On the other hand, Camassa-Holm equations are 
designed to describe the propagation of peaked 
solitons as shown in Figure 2. 
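
The traveling-wave property can be checked numerically in the simplest case. The Python sketch below assumes equation [2] with k = 1, that is $u_t + u_{xxx} + u u_x = 0$, and uses the standard soliton profile $u(t,x) = 3c\,\mathrm{sech}^2(\sqrt{c}\,(x - ct)/2)$ for that case; this explicit profile, the finite-difference step and the function names are assumptions of the illustration rather than formulas taken from [6]-[8].

```python
import math


def soliton(x, t, c):
    """Assumed k = 1 soliton: u(t, x) = 3c sech^2( sqrt(c) (x - c t) / 2 )."""
    z = 0.5 * math.sqrt(c) * (x - c * t)
    return 3.0 * c / math.cosh(z) ** 2


def kdv_residual(x, t, c, h=1e-3):
    """Central finite-difference estimate of u_t + u_xxx + u * u_x at the point (t, x)."""
    def u(xx, tt):
        return soliton(xx, tt, c)

    u_t = (u(x, t + h) - u(x, t - h)) / (2.0 * h)
    u_x = (u(x + h, t) - u(x - h, t)) / (2.0 * h)
    u_xxx = (u(x + 2 * h, t) - 2.0 * u(x + h, t)
             + 2.0 * u(x - h, t) - u(x - 2 * h, t)) / (2.0 * h ** 3)
    return u_t + u_xxx + u(x, t) * u_x


if __name__ == "__main__":
    for x in (-2.0, 0.0, 0.5, 3.0):
        print(x, kdv_residual(x, t=0.3, c=1.0))
```

The residual vanishes up to discretization error at every sampled point, which is the numerical counterpart of the statement that the wave moves to the right with speed c without changing its shape.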

Focusing nonlinear Schrödinger equations also
feature solitary waves that read $u_\omega(t,x) = \exp(i\omega t)\,Q(x)$,
where $Q$ is a solution to

$Q_{xx} - \omega Q + Q^3 = 0$ [9]


Figure 1 A soliton. 


Figure 2 Peaked soliton. 


There are numerous examples of equations or 
systems of equations that model 2D surface water 
waves. Among all these models, a first issue is to 
identify the relevant models insofar as the dynamical 
properties are concerned. Indeed, we address here 
the question of stability of solitary waves (up to the 
symmetries of the equation). For instance, the 
orbital stability for cubic Schrödinger reads: for
any $\varepsilon > 0$, there exists a neighborhood $\mathcal{O}$ of $u_\omega(x,0)$
such that any trajectory starting from $\mathcal{O}$ satisfies


$\sup_t \inf_\theta \inf_y \|u(t) - \exp(i\theta)\,u_\omega(t, \cdot - y)\|_{H^1} \le \varepsilon$ [10]


Another issue consists in the interaction of N 
solitons. Schneider and Wayne (2000) have 
addressed the issue of the validity of water-wave 
models when this interaction is concerned. 

Assume now that the validity of these models is 
granted. To consider [1] or [5] as a dynamical 
system, the next issue is then to consider the initial- 
value problem. 


The Initial-Value Problem 


Let us supplement these equations with initial
data $u_0$ in some Sobolev space. We shall consider
either


$H^s(\mathbb{R}) = \{u;\ \int_{\mathbb{R}} (1 + |\xi|^2)^s\,|\hat{u}(\xi)|^2\,d\xi < +\infty\}$ [11]


in the case where x belongs to the whole line, or the
corresponding Sobolev space with periodic boundary
conditions. It should be examined whether these
equations provide a continuous flow $S(t): u_0 \mapsto u(t)$
in these functional spaces (at least locally in time).
We would like to point out that for each Sobolev 
space under consideration, we may have a different 
flow. This fact is at the heart of infinite-dimensional 
dynamical systems. 

The initial-value problem was a challenge for 
decades for low norms, that is for small s. The last 
breakthrough was performed by Bourgain (1993). 
Let us present the method for the KdV equation.
Consider $U(t)u_0$, the solution of the Airy equation

$u_t + u_{xxx} = 0, \quad u(0) = u_0$ [12]

Without going into further details, the idea is to
apply a fixed-point argument to the Duhamel
form of the equation,


$u(t) = U(t)u_0 - \frac{1}{2}\int_0^t U(t-s)\,\partial_x(u^2(s))\,ds$ [13]

in a suitable mixed-spacetime Banach space whose
norm reads $\|U(-t)u(t,x)\|_{H^b_t H^s_x}$. This relies on fine
properties in harmonic analysis. Thanks to this
method, we know that the Schrödinger equation
[5] and the KdV equation [1] are well-posed in,
respectively, $H^s(\mathbb{R})$, $s \ge 0$ and $H^s(\mathbb{R})$, $s > -3/4$,
locally in time. For the periodic case, the results
are slightly different. We would like to point out
that both KdV and nonlinear Schrödinger equations
provide semigroups $S(t)$ that do not feature any smoothing
effect. A trajectory that starts from $H^s$ remains
in $H^s$; indeed, we can also solve these partial
differential equations backward in time.
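
The absence of smoothing and the reversibility in time are already visible on the free (Airy) flow. The Python sketch below works in the periodic setting, an assumption made to keep the example finite: on a $2\pi$-periodic grid, $U(t)$ multiplies the Fourier coefficient of wavenumber $\xi$ by $\exp(i\xi^3 t)$, so every Sobolev norm is left unchanged and $U(-t)$ inverts $U(t)$. The grid size, the naive discrete Fourier transform and the function names are choices of the illustration.

```python
import cmath
import math


def dft(values):
    """Fourier coefficients (normalized by 1/n) of samples on a 2*pi-periodic grid."""
    n = len(values)
    return [sum(v * cmath.exp(-2j * math.pi * k * j / n) for j, v in enumerate(values)) / n
            for k in range(n)]


def idft(coeffs):
    """Inverse of dft: grid values from Fourier coefficients."""
    n = len(coeffs)
    return [sum(c * cmath.exp(2j * math.pi * k * j / n) for k, c in enumerate(coeffs))
            for j in range(n)]


def wavenumbers(n):
    """Integer wavenumbers in DFT order: 0, 1, ..., n//2, then the negative ones."""
    return [k if k <= n // 2 else k - n for k in range(n)]


def airy_flow(u0, t):
    """Apply U(t) for u_t + u_xxx = 0: multiply each coefficient by exp(i xi^3 t)."""
    xi = wavenumbers(len(u0))
    return idft([c * cmath.exp(1j * k ** 3 * t) for c, k in zip(dft(u0), xi)])


def sobolev_norm(u, s):
    """Discrete H^s norm: sqrt( sum_xi (1 + xi^2)^s |u_hat(xi)|^2 )."""
    xi = wavenumbers(len(u))
    return math.sqrt(sum((1 + k * k) ** s * abs(c) ** 2 for c, k in zip(dft(u), xi)))


if __name__ == "__main__":
    n = 32
    u0 = [math.exp(math.cos(2 * math.pi * j / n)) for j in range(n)]
    u1 = airy_flow(u0, 0.7)
    print(sobolev_norm(u0, 1), sobolev_norm(u1, 1))  # the two norms agree
```

Since each Fourier coefficient only changes by a phase, the $H^s$ norm after the flow is neither smaller nor larger than before, for every s.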

The next issue is to determine if these flows are 
defined for all times. Loosely speaking, the following
alternative holds true: either the local flow in $H^s$
extends to a global one, or some blow-up phenomenon
occurs, that is, $\|S(t)u_0\|_{H^s}$ blows up in finite
time.

To this end, let us observe that, for instance, the 
mass $\int_{\mathbb{R}} |u(x)|^2\,dx$ is conserved for both KdV and
nonlinear Schrödinger flows. Therefore, one can
prove that the solutions in $L^2$ are global in time. It is
worthwhile to observe that the Bourgain method 
also provides some global existence results below 
the energy norm. 

Consider now the flow of the solutions in $H^1$. The
second invariant for nonlinear Schrödinger equations
reads


$\int_{\mathbb{R}} |u_x(x)|^2\,dx - \frac{\varepsilon}{p+1}\int_{\mathbb{R}} |u(x)|^{2p+2}\,dx$ [14]


Therefore, the local solutions in $H^1$ extend to global
ones in the defocusing case ($\varepsilon = -1$). In the focusing
case, the situation is more contrasted. The solution
is global if the nonlinearity is less than an $H^1$-critical
value ($p = 2$ for Schrödinger, and $k = 4$ for generalized
KdV equation). This critical value depends on
some Sobolev embeddings as


$\int_{\mathbb{R}} |u(x)|^{2p+2}\,dx \le C\,\|u\|_{L^2}^{p+2}\,\|u_x\|_{L^2}^{p}$ [15]


Therefore, since the mass is constant, the second
invariant controls the $H^1$ norm of the solution if
$p < 2$. Note that the critical power of the nonlinearity
depends also on the dimension of the space; it is
the cubic Schrödinger that is critical in $H^1(\mathbb{R}^2)$. It is
well known that, for some initial data, blow-up
phenomena can occur for 2D cubic Schrödinger
equations. Moreover, the behavior of blow-up
solutions is more or less understood. This analysis
was performed using the conformal invariance of
the equation. For quintic Schrödinger equation,
which is critical in 1D, this conformal invariance
states that if $u(t,x)$ is solution, then


is also solution. 

On the other hand, for the generalized KdV 
equation, there is no conformal invariance and the 
blow-up issue had been open for years. There was 
some numerical evidence that blow-up can occur for
$k = 4$. Recently, Martel and Merle (2002) have given
a complete description of the blow-up profile for 
this equation. Their methods are quite complex and 
rely on an ejection of mass at infinity in a suitable 
coordinate system. 

In the discussion so far we have presented some
quantities that are invariant by the flow of the
solutions. This is related to the Hamiltonian
structure of the dynamical systems under
consideration.


Hamiltonian Systems in Hydrodynamics 


The study of Hamiltonian systems has developed 
beyond celestial mechanics (the famous n-body 
problems) to other fields in mathematical physics. 
We focus here on dynamical systems that read 


$\dot{u} = J\,\nabla H(u)$ [17]


where $H$ is the Hamiltonian and $J$ some skew-symmetric
operator. For instance, [1] is a Hamiltonian
system with $J = \partial_x$ (i.e., an unbounded skew-symmetric
operator) and


$H(u) = \frac{1}{2}\int (u_x^2 - u^2)\,dx - \frac{1}{6}\int u^3\,dx$ [18]
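
That this structure reproduces [1] can be verified by a short computation. The following LaTeX lines are a sketch under the assumption that the Hamiltonian has the form written in the first line (the form consistent with [1] when $J = \partial_x$) and that $\nabla H$ denotes the variational derivative with respect to the $L^2$ inner product.

```latex
\begin{align*}
% Assumed Hamiltonian:
H(u) &= \frac{1}{2}\int \left(u_x^2 - u^2\right) dx - \frac{1}{6}\int u^3\,dx \\
% Variational derivative (the u_x^2 term is integrated by parts once):
\nabla H(u) &= -u_{xx} - u - \frac{1}{2}u^2 \\
% Applying J = \partial_x:
\dot{u} = \partial_x \nabla H(u) &= -u_{xxx} - u_x - u\,u_x
\end{align*}
```

The last line is $u_t + u_x + u_{xxx} + u u_x = 0$, that is, equation [1].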


There is a subclass of Hamiltonian systems that 
are integrable by inverse-scattering methods. For 
instance, [1] belongs to this class. Indeed, these 
methods give a complete description of the asymptotics
when $t \to +\infty$. It is well known (Deift and
Zhou 1993) that, asymptotically, any solution to 
KdV equation consists of a wave train moving to the 
right in the physical space up to a dispersive part 
moving to the left. 

On the other hand, a generic Hamiltonian system 
is not integrable. The study of the asymptotics and 
of the dynamical properties of such a system 
deserves another analysis. We say that a system
features asymptotic completeness if there exist $u_+$
and $u_-$ such that the solution $u(t)$ of [17] supplemented
with initial data $u_0$ satisfies


$$\|u(t) - U(t)u_+\| \to 0$$ [19]


$$\|u(t) - U(t)u_-\| \to 0$$ [20]


when, respectively, $t \to +\infty$ or $t \to -\infty$. Here
$U(t)u_0$ is the solution of the free equation, that is,
the associated linear equation, supplemented with
initial data $u_0$; for instance, the Airy equation is the
free equation related to the KdV equation. The
operators $u_0 \mapsto u_\pm$ are called wave operators.
This is related to Bohr's transition in
quantum mechanics. Loosely speaking, we are able
to prove these scattering properties for high powers
in the nonlinearity for subcritical defocusing Schrödinger
equations.

The asymptotics of trajectories can be more 
complicated. Let us recall that the stability of 
traveling waves is also an important issue in under- 
standing the dynamical properties of these models. 
For instance, let us point out that Martel and Merle 
proved the asymptotic stability of the sum of N 
solitons for KdV in the subcritical case. 

Beyond these asymptotics we are interested in the 
case where the permanent regime is chaotic (or 
turbulent). A scenario is that there exist quasiperiodic
solutions of arbitrary order N for the system
under consideration. The next challenge about these 
Hamiltonian systems is to apply the Kolmogorov- 
Arnol'd-Moser theory to exhibit this type of 
solutions to systems like [17]. Here we restrict our 
discussion to the case of bounded domains, with 
either periodic or homogeneous Dirichlet conditions. 
Then, let us introduce the following definition: a
solution is quasiperiodic if there exists a finite
number $N$ of frequencies $\omega_k$ such that


$$u(t,x) = \sum_{k=1}^{N} u_k(x)\exp(i\omega_k t)$$ [21]


This extends the case of periodic solutions 
(N=1), which are isomorphic to the torus. To 
prove the existence of such structures, one idea is 
then to imbed N-dimensional invariant tori into the 
phase space of solutions. One may approximate the 
infinite-dimensional Hamiltonian by a sequence of 
finite ones and consider the convergence of iterated 
symplectic transformations, or one solves directly 
some nonlinear functional equation. Actually, the 
difficulty is that resonances can occur. Resonances 
occur when there are some linear combinations of 
the frequencies that vanish (or that are arbitrarily 
close to 0). This introduces a small divisor problem 


in a phase space that has infinite dimension. To 
overcome these difficulties, a Nash-Moser scheme 
can be implemented (Craig 1996). There are 
numerous such open problems. For instance, let us 
observe that known results are essentially only for 
the case where the dimension of the ambient space 
is 1. On the other hand, quasiperiodic solutions 
correspond to N-dimensional invariant tori for the 
flow of solutions; one may seek Lagrangian
invariant tori that correspond to the case where
$N = +\infty$. Current research is directed towards
extending this analysis. 

Another issue is to seek invariant measures for these 
Hamiltonian dynamical systems, as in statistical 
mechanics. Bourgain was successful in performing 
this analysis for some nonlinear Schrödinger equations 
either in the case of periodic boundary conditions or in 
the whole space. This result is an important step in the 
ergodic analysis of our Hamiltonian dynamical sys- 
tems. This could explain the Poincaré recurrence 
phenomena observed numerically for these types of 
equations: some particular solutions seem to come 
back to their initial state after a transient time. This 
point will not be developed here. 

All these results are properties of conservative 
dynamical systems. We now address the case when 
some dissipation takes place. 


Dissipative Water-Wave Models 


To model the effect of viscosity on 2D surface water 
waves, we go back to a flow governed by the 
Navier-Stokes equations and we proceed to obtain 
damped equations (Ott and Sudan 1970, Kakutani 
and Matsuuchi 1975). In fact, the damping in KdV
equations can be either a diffusion term, which leads to
the equation


$$u_t + u_{xxx} + u u_x = \nu u_{xx}$$ [22]


where $\nu$ is a positive number analogous to the
viscosity, or a zero-order term $-\nu u$ on the right-hand
side of [22]. In the first case, we obtain a
KdV-Burgers equation that has some smoothing
effect in time. In the second case, we have a zero-order
dissipation term. A nonlocal term would be
$\nu\mathcal{F}^{-1}(|\xi|^{2\beta}\hat{u}(\xi))$ for $\beta \in [0,1]$, where $\mathcal{F}(u) = \hat{u}$
denotes the Fourier transform of $u$.

A first issue concerning damped water-wave
equations is to estimate the decay rate of the
solutions towards the equilibrium (no decay) when
$t \to +\infty$. For [22] the ultimate result is that, for
initial data $u_0 \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$, the $L^2$ norm of the
solution decays like $t^{-1/4}$ (Amick et al. 1989).
Energy methods have been developed to handle
these problems, such as Schonbek's splitting method.


The center manifold theory is another approach
that is employed in dynamical systems. The aim is 
to prove the existence of a finite-dimensional 
manifold that is invariant (in a neighborhood of 
the origin) by the flow of the solutions and that 
attracts the other trajectories with high speed. 
Therefore, this manifold, and the trajectories 
therein, monitor the decay rate of the solutions 
towards the origin. The construction of such a 
manifold relies on splitting properties of the 
spectrum of the associated linearized operator 
(Gallay and Wayne 2002). Using a suitable change 
of variables (that moves the continuous spectrum 
away from the origin), Gallay and Wayne were able 
to construct such a manifold in an infinite-dimen- 
sional phase space. 

Another issue is the understanding of the dynamics
for damped-forced water-wave equations such as


$$u_t + u u_x + u_{xxx} + \nu u = f(x)$$ [23]


The dynamical system approach is the attractor 
theory (Temam 1997). Equations such as [23] 
provide dissipative semigroups S(t) in some energy 
spaces. The theory has developed for years and we 
know that these dynamical systems feature global 
attractors. A global attractor is a compact subset in 
the energy space under consideration which is 
invariant by the flow of the solutions and that 
attracts all the trajectories when $t \to +\infty$. Moreover,
if we deal with periodic boundary conditions,
this global attractor has finite fractal (or Hausdorff)
dimension. This dimension depends on the data
concerning $\nu$ and $f$.

Actually, eqn [23] provides semigroups in
$L^2(\mathbb{R})$, $H^1(\mathbb{R})$, or $H^2(\mathbb{R})$. These three dynamical
systems feature global attractors $A_0$, $A_1$, $A_2$. From
the viewpoint of physics, the attractors describe the
permanent regime of the flow. One may wonder if
this permanent regime depends on the space chosen
for the mathematical study. The latest
result on this issue establishes that $A_0 = A_1 = A_2$. This
property is equivalent to proving the asymptotic
smoothing effect for the associated semigroup: even
if $S(t)$ is not a smoothing operator for finite $t$,
all solutions converge to a smooth set when $t$ goes to
infinity.

All these results are for subcritical nonlinearities.
As already noted, dissipation provides smoothing at
infinity. Nevertheless, damping does not prevent
blow-up. Let us illustrate this by the following
result due to Tsutsumi (1984). The damped Schrödinger
equation


$$iu_t + i\nu u + u_{xx} + |u|^{2p}u = 0$$ [24]


features blow-up solutions in $H^1(\mathbb{R})$ for $p > 2$, even
if all solutions are damped in $L^2(\mathbb{R})$ with exponential
speed.
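
The exponential damping in $L^2$ can be checked by a short computation. The following is a sketch that assumes a smooth solution $u$ of the damped Schrödinger equation [24] decaying at spatial infinity, with $\nu > 0$ the damping coefficient and $u_0$ the initial datum. Multiplying the equation by $\bar{u}$, integrating over $\mathbb{R}$, and taking the imaginary part removes the dispersive and nonlinear terms, which are real after integration by parts:

```latex
\begin{align*}
\operatorname{Im}\int_{\mathbb{R}} \bigl(i u_t + i\nu u + u_{xx} + |u|^{2p}u\bigr)\,\bar{u}\,dx
  &= \operatorname{Re}\int_{\mathbb{R}} u_t\,\bar{u}\,dx + \nu\int_{\mathbb{R}} |u|^2\,dx = 0, \\
\frac{1}{2}\frac{d}{dt}\,\|u(t)\|_{L^2}^2 + \nu\,\|u(t)\|_{L^2}^2 &= 0, \\
\|u(t)\|_{L^2} &= e^{-\nu t}\,\|u_0\|_{L^2}.
\end{align*}
```

The decay holds for as long as the solution exists, so it does not by itself exclude blow-up of the $H^1$ norm in finite time.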

This completes the discussion of damped-forced 
water-wave equations. We now consider equations 
that are forced with a random forcing term. 


Stochastic Water-Wave Models 


During the modeling process that led to KdV or
Schrödinger equations from the Euler equations, we have
neglected some low-order terms. We now model
these terms by a noise and we are led to a new
randomly forced dynamical system that reads


$$u_t + u_x + u_{xxx} + u u_x = \gamma\dot{\xi}$$ [25]


Here one may assume that $\dot{\xi}(x,t)$ is a Gaussian
process with correlations


$$E\bigl(\dot{\xi}(x,t)\,\dot{\xi}(y,s)\bigr) = \delta_{x-y}\,\delta_{t-s}$$ [26]


that is, a spacetime white noise. The parameter $\gamma$
is the amplitude of the process. Unfortunately, due
to the lack of smoothing effect of KdV or
Schrödinger equations, it is more convenient to
work with a noise that is correlated in space,
satisfying


$$E\bigl(\dot{\xi}(x,t)\,\dot{\xi}(y,s)\bigr) = c(x-y)\,\delta_{t-s}$$ [27]


Here $c(x-y)$ is some smooth ansatz for $\delta_{x-y}$, defined
from some Hilbert-Schmidt kernel $K$ as


$$c(x-y) = \int K(x,z)K(y,z)\,dz$$
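
A spatial correlation built from a kernel in this way is easy to explore numerically. The sketch below is only an illustration: it approximates $c(x,y) = \int K(x,z)K(y,z)\,dz$ by a Riemann sum, and the Gaussian kernel, its width, and the grids are hypothetical choices that do not come from the text. It confirms that the resulting covariance matrix is symmetric and nonnegative and, for a convolution kernel, depends only on $x - y$.

```python
import numpy as np

def gaussian_kernel(x, z, width=0.5):
    # Hypothetical smooth kernel K(x, z); any square-integrable kernel would do.
    return np.exp(-((x - z) ** 2) / (2.0 * width ** 2))

def spatial_covariance(x, z, kernel):
    """Approximate c(x_i, x_j) = integral of K(x_i, z) K(x_j, z) dz by a Riemann sum."""
    dz = z[1] - z[0]
    K = kernel(x[:, None], z[None, :])  # K[i, m] = K(x_i, z_m)
    return K @ K.T * dz

x = np.linspace(-2.0, 2.0, 41)    # points where the noise is sampled
z = np.linspace(-8.0, 8.0, 801)   # integration grid, wide enough to cover the kernel
C = spatial_covariance(x, z, gaussian_kernel)

# For this Gaussian kernel the integral is known in closed form:
# c(x - y) = width * sqrt(pi) * exp(-(x - y)^2 / (4 width^2)).
exact_diagonal = 0.5 * np.sqrt(np.pi)
```

A correlated noise sample at one time step could then be drawn as `K @ w` with `w` a vector of independent Gaussians, which is one common way to realize such a covariance on a grid.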


We also consider random perturbations of the focusing
Schrödinger equation, which reads either


$$u_t + iu_{xx} + i|u|^{2p}u = u\dot{\xi}$$ [28]

(which represents a multiplicative noise) or

$$u_t + iu_{xx} + i|u|^{2p}u = i\gamma\dot{\xi}$$ [29]


(which is an additive noise). In the former case, the 
noise acts as a potential, while in the latter case it 
represents a forcing term. These equations also 
model the propagation of waves in an inhomoge- 
neous medium. 

Research is in progress to study these stochastic 
dynamical systems. To begin with, the theory of the 
initial-value problem has to be established in this
new context (see, e.g., de Bouard and Debussche 
(2003)). 

One challenge is to understand the effect of noise 
on dynamical properties of the particular solutions 
described above, for instance, the solitary waves for 
the Schrödinger equation, either in the subcritical case
p < 2 or in the critical case p=2 and beyond. 

Results obtained both theoretically and numeri- 
cally on the influence of the noise on blow-up 
phenomena (random process) for generalized Schrödinger
equations are likely almost-sure results.

On the one hand, if the noise is additive and the 
power supercritical, p > 1, there is some numerical 
evidence that a spacetime white noise can delay or 
even prevent the blow-up. However, if the noise is 
not so irregular (as for the correlated in space noise 
described above) it seems that any solution blows up 
in finite time. 

de Bouard and Debussche have proved that for 
either an additive or a multiplicative noise, any 
smooth and localized (in space) initial data give rise 
to a trajectory that collapses in arbitrarily small 
time with a positive probability. This contrasts 
with the deterministic case, where only particular 
initial data could lead to blow-up trajectories. 
Actually, the noise enforces that any trajectory 
must pass through this blow-up region, with a 
positive probability. 
Introduction


Effective field theories (EFTs) are the counterpart of 
the “theory of everything.” They are the field 
theoretical implementation of the quantum ladder: 
heavy degrees of freedom need not be included 
among the quantum fields of an EFT for a 
description of low-energy phenomena. For example, 
we do not need quantum gravity to understand the 
hydrogen atom nor does chemistry depend upon the 
structure of the electromagnetic interaction of 
quarks. 

EFTs are approximations by their very nature. 
Once the relevant degrees of freedom for the 
problem at hand have been established, the corre- 
sponding EFT is usually treated perturbatively. It 
does not make much sense to search for an exact 
solution of the Fermi theory of weak interactions. In 
the same spirit, convergence of the perturbative 
expansion in the mathematical sense is not an issue. 
The asymptotic nature of the expansion becomes 
apparent once the accuracy is reached where effects 
of the underlying “fundamental” theory cannot be 
neglected any longer. The range of applicability of 
the perturbative expansion depends on the separa- 
tion of energy scales that define the EFT. 

EFTs pervade much of modern physics. The
effective nature of the description is evident in 
atomic and condensed matter physics. The present 
article will be restricted to particle physics, where 
EFTs have become important tools during the last 
25 years. 


Classification of EFTs 


A first classification of EFTs is based on the
structure of the transition from the “fundamental”
(energies $> \Lambda$) to the “effective” level (energies $< \Lambda$).


1. Complete decoupling. The fundamental theory
contains heavy and light degrees of freedom.
Under very general conditions (decoupling theorem,
Appelquist and Carazzone 1975) the effective
Lagrangian for energies $\ll \Lambda$, depending only on
light fields, takes the form


$$\mathcal{L}_{\rm eff} = \mathcal{L}_{d\le 4} + \sum_{d>4}\frac{1}{\Lambda^{d-4}}\sum_{i_d} g_{i_d}\,O_{i_d}$$ [1]

The heavy fields with masses $> \Lambda$ have
been “integrated out” completely. $\mathcal{L}_{d\le 4}$ contains
the potentially renormalizable terms with operator
dimension $d \le 4$ (in natural mass units where Bose
and Fermi fields have $d = 1$ and $3/2$, respectively),
the $g_{i_d}$ are coupling constants and the $O_{i_d}$ are
monomials in the light fields with operator dimension
$d$. In a slightly misleading notation, $\mathcal{L}_{d\le 4}$
consists of relevant and marginal operators, whereas
the $O_{i_d}$ ($d > 4$) are denoted irrelevant operators. The
scale $\Lambda$ can be the mass of a heavy field (e.g., $M_W$ in
the Fermi theory of weak interactions) or it reflects
the short-distance structure in a more indirect way.

2. Partial decoupling. In contrast to the previous
case, the heavy fields do not disappear completely 
from the EFT but only their high-momentum modes 
are integrated out. The main area of application is 
the physics of heavy quarks. The procedure involves 
one or several field redefinitions introducing a frame 
dependence. Lorentz invariance is not manifest but 
implies relations between coupling constants of the 
EFT (reparametrization invariance). 

3. Spontaneous symmetry breaking. The transition
from the fundamental to the effective level
occurs via a phase transition due to spontaneous 
symmetry breaking generating (pseudo-)Goldstone 
bosons. A spontaneously broken symmetry relates 
processes with different numbers of Goldstone 
bosons. Therefore, the distinction between renormalizable
($d \le 4$) and nonrenormalizable ($d > 4$) parts
in the effective Lagrangian [1] becomes meaningless. 
The effective Lagrangian of type 3 is generically 
nonrenormalizable. Nevertheless, such Lagrangians 
define perfectly consistent quantum field theories at 
sufficiently low energies. Instead of the operator 
dimension as in [1], the number of derivatives of 
the fields and the number of symmetry-breaking 
insertions distinguish successive terms in the Lagrangian.
The general structure of effective Lagrangians
with spontaneously broken symmetries is largely 
independent of the specific physical realization 
(universality). There are many examples in 
condensed matter physics, but the two main 
applications in particle physics are electroweak 
symmetry breaking and chiral perturbation theory 
(both discussed later) with the spontaneously broken 
global chiral symmetry of quantum chromodynamics
(QCD).


Another classification of EFTs is related to the 
status of their coupling constants. 


A. Coupling constants can be determined by match- 
ing the EFT with the underlying theory at short 
distances. The underlying theory is known and 
Green functions can be calculated perturbatively 
at energies $\sim \Lambda$ both in the fundamental and in
the effective theory. Identifying a minimal set of
Green functions fixes the coupling constants $g_{i_d}$
in eqn [1] at the scale $\Lambda$. Renormalization group
equations can then be used to run the couplings 
down to lower scales. The nonrenormalizable 
terms in the Lagrangian [1] can be fully included 
in the perturbative analysis. 

B. Coupling constants are constrained by symme- 
tries only. 

• The underlying theory and therefore also the
EFT coupling constants are unknown. This is
the case of the standard model (SM) (see the
next section). A perturbative analysis beyond
leading order only makes sense for the known
renormalizable part $\mathcal{L}_{d\le 4}$. The nonrenormalizable
terms suppressed by powers of $\Lambda$ are
considered at tree level only. The associated
coupling constants $g_{i_d}$ serve as bookmarks for
new physics. Usually, but not always (cf., e.g.,
the subsection “Noncommutative spacetime”),
the symmetries of $\mathcal{L}_{d\le 4}$ are assumed to
constrain the couplings.

• The matching cannot be performed in perturbation
theory even though the underlying theory
is known. This is the generic situation for EFTs 
of type 3 involving spontaneous symmetry 
breaking. The prime example is chiral perturba- 
tion theory as the EFT of QCD at low energies. 


The SM as an EFT 


With the possible exception of the scalar sector to be
discussed in the subsection “Electroweak symmetry
breaking”, the SM is very likely the renormalizable
part of an EFT of type 1B. Except for nonzero
neutrino masses, the SM Lagrangian $\mathcal{L}_{d\le 4}$ in [1]
accounts for physics up to energies of roughly
the Fermi scale $G_F^{-1/2} \simeq 300$ GeV.

Since the SM works exceedingly well up to the
Fermi scale, where the electroweak gauge symmetry
is spontaneously broken, it is natural to assume that
the operators $O_{i_d}$ with $d > 4$, made up from fields
representing the known degrees of freedom and
including a single Higgs doublet in the SM proper,
should be gauge invariant with respect to the full
SM gauge group $SU(3)_c \times SU(2)_L \times U(1)_Y$. An
almost obvious constraint is Lorentz invariance,
which will be lifted in the next subsection, however.

These requirements limit the Lagrangian with
operator dimension $d = 5$ to a single term (except
for generation multiplicity), consisting only of a left-handed
lepton doublet $L_L$ and the Higgs doublet $\Phi$:


$$O_{d=5} = \epsilon_{ij}\epsilon_{kl}\,L_{iL}^{\top} C^{-1} L_{kL}\,\Phi_j\Phi_l + \text{h.c.}$$ [2]


This term violates lepton number and generates
nonzero Majorana neutrino masses. For a neutrino
mass of 1 eV, the scale $\Lambda$ would have to be of the
order of $10^{13}$ GeV if the associated coupling constant
in the EFT Lagrangian [1] is of order 1.
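
The order of magnitude of this scale can be reproduced with a one-line estimate. The sketch below assumes that the dimension-5 operator yields a Majorana mass $m_\nu \sim g\,v^2/\Lambda$ after electroweak symmetry breaking, with $v \approx 246$ GeV the Higgs vacuum expectation value. Both the value of $v$ and the neglect of numerical factors of order one are assumptions made here, not statements from the text.

```python
# Estimate of the new-physics scale from m_nu ~ coupling * v^2 / Lambda.
V_HIGGS_GEV = 246.0   # assumed Higgs vacuum expectation value in GeV
EV_IN_GEV = 1.0e-9    # 1 eV expressed in GeV

def new_physics_scale_gev(m_nu_ev, coupling=1.0, v_gev=V_HIGGS_GEV):
    """Return Lambda in GeV for a neutrino mass given in eV."""
    return coupling * v_gev ** 2 / (m_nu_ev * EV_IN_GEV)

scale = new_physics_scale_gev(1.0)
print(f"Lambda ~ {scale:.1e} GeV")
```

With these inputs the estimate lands between $10^{13}$ and $10^{14}$ GeV; a different normalization of $v$ or of the operator shifts it by a factor of order one only.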

In contrast to the simplicity for $d = 5$, the list of
gauge-invariant operators with $d = 6$ is enormous.
Among them are operators violating baryon or
lepton number that must be associated with a scale
much larger than 1 TeV. To explore the territory
close to present energies, it therefore makes sense to
impose baryon and lepton number conservation on
the operators with $d = 6$. Those operators have all
been classified (Buchmüller and Wyler 1986) and the 
number of independent terms is of the order of 80. 
They can be grouped in three classes. 

The first class consists of gauge and Higgs fields 
only. The corresponding EFT Lagrangian has been 
used to parametrize new physics in the gauge sector 
constrained by precision data from LEP. The second 
class consists of operators bilinear in fermion fields, 
with additional gauge and Higgs fields to generate 
d=6. Finally, there are four-fermion operators 
without other fields or derivatives. Some of the 
operators in the last two groups are also constrained 
by precision experiments, with a certain hierarchy of 
limits. For lepton and/or quark flavor conserving 
terms, the best limits on A are in the few TeV range, 
whereas the absence of neutral flavor changing 
processes yields lower bounds on A that are several 
orders of magnitude larger. If there is new physics in
the TeV range, flavor changing neutral transitions
must be strongly suppressed, a powerful constraint
on model building.

It is amazing that the most general renormalizable 
Lagrangian with the given particle content accounts 
for almost all experimental results in such an
impressive manner. Finally, we recall that many of 
the operators of dimension 6 are also generated in the 
SM via radiative corrections. A necessary condition 
for detecting evidence for new physics is therefore 
that the theoretical accuracy of radiative corrections 
matches or surpasses the experimental precision. 


Noncommutative Spacetime 


Noncommutative geometry arises in some string 
theories and may be expected on general grounds 
when incorporating gravity into a quantum field 
theory framework. The natural scale of noncommutative
geometry would be the Planck scale in this
case, without observable consequences at presently
accessible energies. However, as in theories with
large extra dimensions, the characteristic scale $\Lambda_{\rm NC}$
could be significantly smaller. In parallel to theoretical
developments to define consistent noncommutative
quantum field theories (short for quantum
field theories on noncommutative spacetime), a
number of phenomenological investigations have
been performed to put lower bounds on $\Lambda_{\rm NC}$.
Noncommutative geometry is a deformation of
ordinary spacetime where the coordinates, represented
by Hermitian operators $\hat{x}_\mu$, do not commute:


$$[\hat{x}_\mu, \hat{x}_\nu] = i\theta_{\mu\nu}$$ [3]


The antisymmetric real tensor $\theta_{\mu\nu}$ has dimensions
length$^2$ and it can be interpreted as parametrizing
the resolution with which spacetime can be probed.
In practically all applications, $\theta_{\mu\nu}$ has been assumed
to be a constant tensor and we may associate an
energy scale $\Lambda_{\rm NC}$ with its nonzero entries:


$$\Lambda_{\rm NC}^{-2} \sim \theta_{\mu\nu}$$ [4]


There is to date no unique form for the noncommutative
extension of the SM. Nevertheless, possible
observable effects of noncommutative geometry have
been investigated. Not unexpectedly from an EFT point
of view, for energies $\ll \Lambda_{\rm NC}$, noncommutative field
theories are equivalent to ordinary quantum field
theories in the presence of nonstandard terms containing
$\theta_{\mu\nu}$ (Seiberg-Witten map). Practically all applications
have concentrated on effects linear in $\theta_{\mu\nu}$.

Kinetic terms in the Lagrangian are in general
unaffected by the noncommutative structure. New
effects arise therefore mainly from renormalizable
$d = 4$ interaction terms. For example, the Yukawa
coupling $g_Y\bar{\psi}\psi\phi$ generates the following interaction
linear in $\theta_{\mu\nu}$:


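$$\mathcal{L}_Y^{\rm NC} = g_Y\theta_{\mu\nu}\left(\partial^\mu\bar{\psi}\,\partial^\nu\psi\,\phi + \partial^\mu\bar{\psi}\,\psi\,\partial^\nu\phi + \bar{\psi}\,\partial^\mu\psi\,\partial^\nu\phi\right)$$ [5]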


These interaction terms have operator dimension 6 and
they are suppressed by $\theta_{\mu\nu} \sim \Lambda_{\rm NC}^{-2}$. The major difference
to the previous discussion on physics beyond the
SM is that there is an intrinsic violation of Lorentz
invariance due to the constant tensor $\theta_{\mu\nu}$. In contrast to
the previous analysis, the terms with dimension $d > 4$
do not respect the symmetries of the SM.

If $\theta_{\mu\nu}$ is indeed constant over macroscopic
distances, many tests of Lorentz invariance can be
used to put lower bounds on $\Lambda_{\rm NC}$. Among the exotic
effects investigated are modified dispersion relations 
for particles, decay of high-energy photons, charged 
particles producing Cerenkov radiation in vacuum, 
birefringence of radiation, a variable speed of light, 
etc. A generic signal of noncommutativity is the 
violation of angular momentum conservation that 
can be searched for at the Large Hadron Collider 
(LHC) and at the next linear collider. 

Lacking a unique noncommutative extension of
the SM, unambiguous lower bounds on $\Lambda_{\rm NC}$ are
difficult to establish. However, the range
$\Lambda_{\rm NC} \lesssim 10$ TeV is almost certainly excluded. An
estimate of the induced electric dipole moment of
the electron (noncommutative field theories violate
CP in general to first order in $\theta_{\mu\nu}$) yields
$\Lambda_{\rm NC} \gtrsim 100$ TeV. On the other hand, if the SM were
CP invariant, noncommutative geometry would be
able to account for the observed CP violation in
$K^0$-$\bar{K}^0$ mixing for $\Lambda_{\rm NC} \sim 2$ TeV.


Electroweak Symmetry Breaking 


In the SM, electroweak symmetry breaking is 
realized in the simplest possible way through 
renormalizable interactions of a scalar Higgs doub- 
let with gauge bosons and fermions, a gauged 
version of the linear $\sigma$ model.

The EFT version of electroweak symmetry 
breaking (EWEFT) uses only the experimentally 
established degrees of freedom in the SM (fermions 
and gauge bosons). Spontaneous gauge symmetry 
breaking is realized nonlinearly, without introducing 
additional scalar degrees of freedom. It is a low- 
energy expansion where energies and masses are 
assumed to be small compared to the symmetry- 
breaking scale. From both  perturbative and 
nonperturbative arguments we know that this scale 
cannot be much bigger than 1 TeV. The Higgs 
model can be viewed as a specific example of an 
EWEFT as long as the Higgs boson is not too light 
(heavy-Higgs scenario). 

The lowest-order effective Lagrangian takes the 
following form: 


$$\mathcal{L}^{(2)}_{\rm EWSB} = \mathcal{L}_B + \mathcal{L}_F$$ [6]

where $\mathcal{L}_F$ contains the gauge-invariant kinetic terms for
quarks and leptons including mass terms. In addition to
the kinetic terms for the gauge bosons $W_\mu$, $B_\mu$, the
bosonic Lagrangian $\mathcal{L}_B$ contains the characteristic
lowest-order term for the would-be-Goldstone bosons:


$$\mathcal{L}_B = \mathcal{L}^{\rm kin}_{\rm gauge} + \frac{v^2}{4}\langle D_\mu U^\dagger D^\mu U\rangle$$ [7]

with the gauge-covariant derivative


$$D_\mu U = \partial_\mu U - ig\hat{W}_\mu U + ig'U\hat{B}_\mu, \qquad \hat{W}_\mu = \frac{\vec{\tau}}{2}\cdot\vec{W}_\mu, \qquad \hat{B}_\mu = \frac{\tau_3}{2}B_\mu$$

where $\langle\ldots\rangle$ denotes a (two-dimensional) trace. The
matrix field $U(\phi)$ carries the nonlinear representation
of the spontaneously broken gauge group and
takes the value $U = 1$ in the unitary gauge. The
Lagrangian [6] is invariant under local $SU(2)_L \times
U(1)_Y$ transformations:


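$$\hat{W}_\mu \to g_L\hat{W}_\mu g_L^\dagger + \frac{i}{g}\,g_L\partial_\mu g_L^\dagger, \qquad \hat{B}_\mu \to \hat{B}_\mu + \frac{i}{g'}\,g_R\partial_\mu g_R^\dagger$$

$$f_L \to g_L f_L, \qquad f_R \to g_R f_R, \qquad U \to g_L U g_R^\dagger$$

with

$$g_L(x) = \exp\bigl(i\vec{\alpha}_L(x)\cdot\vec{\tau}/2\bigr), \qquad g_R(x) = \exp\bigl(i\alpha_Y(x)\tau_3/2\bigr)$$

and $f_{L(R)}$ are quark and lepton fields grouped in
doublets.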

As is manifest in the unitary gauge $U = 1$, the lowest-order
Lagrangian of the EWEFT just implements the
tree-level masses of gauge bosons ($M_W = M_Z\cos\theta_W =
vg/2$, $\tan\theta_W = g'/g$) and fermions but does not carry
any further information about the underlying mechanism
of spontaneous gauge symmetry breaking. This
information is first encoded in the couplings $a_i$ of the
next-to-leading-order Lagrangian


$$\mathcal{L}^{(4)}_{\rm EWSB} = \sum_{i=0}^{14} a_i O_i$$ [10]


with monomials $O_i$ of $O(p^4)$ in the low-energy
expansion. The Lagrangian [10] is the most general
CP and $SU(2)_L \times U(1)_Y$ invariant Lagrangian of $O(p^4)$.
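
The tree-level mass relations of the lowest-order Lagrangian, $M_W = M_Z\cos\theta_W = vg/2$ and $\tan\theta_W = g'/g$, can be evaluated numerically. The sketch below is only an illustration; the inputs $v = 246$ GeV, $g = 0.652$, and $g' = 0.357$ are approximate values assumed here and are not taken from the text.

```python
import math

def gauge_boson_masses(v, g, g_prime):
    """Tree-level masses from M_W = v g / 2 and tan(theta_W) = g'/g."""
    m_w = v * g / 2.0
    theta_w = math.atan2(g_prime, g)   # weak mixing angle
    m_z = m_w / math.cos(theta_w)      # from M_W = M_Z cos(theta_W)
    return m_w, m_z, theta_w

# Assumed, approximate inputs: v in GeV, dimensionless gauge couplings.
m_w, m_z, theta_w = gauge_boson_masses(v=246.0, g=0.652, g_prime=0.357)
print(f"M_W ~ {m_w:.1f} GeV, M_Z ~ {m_z:.1f} GeV")
```

Since $\cos\theta_W = g/\sqrt{g^2 + g'^2}$, the same relations give $M_Z = (v/2)\sqrt{g^2 + g'^2}$, which is a convenient cross-check.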

Instead of listing the full Lagrangian, we display 
three typical examples: 


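$$O_0 = \frac{v^2}{4}\langle T V_\mu\rangle^2, \qquad O_3 = g\langle\hat{W}_{\mu\nu}[V^\mu, V^\nu]\rangle, \qquad O_5 = \langle V_\mu V^\mu\rangle^2$$ [11]


where


$$T = U\tau_3 U^\dagger, \qquad V_\mu = D_\mu U\,U^\dagger, \qquad \hat{W}_{\mu\nu} = \frac{i}{g}\bigl[\partial_\mu - ig\hat{W}_\mu,\ \partial_\nu - ig\hat{W}_\nu\bigr]$$ [12]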


In the unitary gauge, the monomials O; reduce to 
polynomials in the gauge fields. The three examples 
in eqn [11] start with quadratic, cubic, and quartic 
terms in the gauge fields, respectively. The strongest 
constraints exist for the coefficients of quadratic 
contributions from the Large Electron-Positron
collider LEP1, less restrictive ones for the cubic 
self-couplings from LEP2, and none so far for the 
quartic ones. 


Heavy-Quark Physics 


EFTs in this section are derived from the SM and 
they are of type 2A in the classification introduced 
previously. In a first step, one integrates out W, Z,
and the top quark. Evolving down from $M_W$ to $m_b$,
large logarithms $\alpha_s(m_b)\ln(M_W^2/m_b^2)$ are resummed
into the Wilson coefficients. At the scale of the
b-quark, QCD is still perturbative, so that at least a
part of the amplitudes is calculable in perturbation
theory. To separate the calculable part from the rest,
the EFTs below perform an expansion in $1/m_Q$,
where $m_Q$ is the mass of the heavy quark.

Heavy-quark EFTs offer several important 
advantages. 


1. Approximate symmetries that are hidden in full
QCD appear in the expansion in $1/m_Q$.

2. Explicit calculations simplify in general, for
example, the summing of large logarithms via 
renormalization group equations. 

3. The systematic separation of hard and soft effects 
for certain matrix elements (factorization) can be 
achieved much more easily. 


Heavy-Quark Effective Theory 


Heavy-quark effective theory (HQET) is reminiscent
of the Foldy-Wouthuysen transformation (nonrelativistic
expansion of the Dirac equation). It is a
systematic expansion in $1/m_Q$, when $m_Q \gg \Lambda_{\rm QCD}$,
the scale parameter of QCD. It can be applied to
processes where the heavy quark remains essentially
on shell: its velocity $v$ changes only by small
amounts $\sim \Lambda_{\rm QCD}/m_Q$. In the hadron rest frame, the
heavy quark is almost at rest and acts as a
quasistatic source of gluons.

More quantitatively, one writes the heavy-quark
momentum as $p^\mu = m_Q v^\mu + k^\mu$, where $v$ is the
hadron 4-velocity ($v^2 = 1$) and $k$ is a residual
momentum of $O(\Lambda_{\rm QCD})$. The heavy quark field $Q(x)$
is then decomposed with the help of energy
projectors $P_v^\pm = (1 \pm \not{v})/2$ and employing a field
redefinition:


$$Q(x) = e^{-im_Q v\cdot x}\bigl(h_v(x) + H_v(x)\bigr), \qquad h_v(x) = e^{im_Q v\cdot x}P_v^+ Q(x), \qquad H_v(x) = e^{im_Q v\cdot x}P_v^- Q(x)$$ [13]


In the hadron rest frame, $h_v(x)$ and $H_v(x)$ correspond
to the upper and lower components of $Q(x)$,
respectively. With this redefinition, the heavy-quark
Lagrangian is expressed in terms of a massless field
$h_v$ and a “heavy” field $H_v$:


$$\mathcal{L}_Q = \bar{Q}(i\not{D} - m_Q)Q = \bar{h}_v\,iv\cdot D\,h_v - \bar{H}_v(iv\cdot D + 2m_Q)H_v + \text{mixed terms}$$ [14]


At the semiclassical level, the field $H_v$ can
be eliminated by using the QCD field equation
$(i\not{D} - m_Q)Q = 0$, yielding the nonlocal expression


$$\mathcal{L}_Q = \bar{h}_v\,iv\cdot D\,h_v + \bar{h}_v\,i\not{D}_\perp\frac{1}{iv\cdot D + 2m_Q - i\epsilon}\,i\not{D}_\perp h_v$$ [15]


with $D_\perp^\mu = (g^{\mu\nu} - v^\mu v^\nu)D_\nu$. The field redefinition in
[13] ensures that, in the heavy-hadron rest frame,
derivatives of $h_v$ give rise to small momenta of
$O(\Lambda_{\rm QCD})$ only. The Lagrangian [15] is the starting
point for a systematic expansion in $1/m_Q$.
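
The role of the energy projectors $(1 \pm \not{v})/2$ in this decomposition can be checked with explicit matrices. The sketch below is an illustration under stated assumptions: it uses the Dirac representation of the gamma matrices and the metric signature $(+,-,-,-)$, and the boost rapidity is an arbitrary value chosen here. It verifies that the two projectors are idempotent, mutually orthogonal, and complete for any velocity with $v^2 = 1$, and that in the rest frame they select the upper and lower spinor components.

```python
import numpy as np

# Dirac representation of the gamma matrices (a conventional, assumed choice).
I2, Z2 = np.eye(2), np.zeros((2, 2))
PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
GAMMA = [np.block([[I2, Z2], [Z2, -I2]]).astype(complex)]
GAMMA += [np.block([[Z2, s], [-s, Z2]]) for s in PAULI]
METRIC = np.diag([1.0, -1.0, -1.0, -1.0])

def slash(v):
    """Return gamma^mu v_mu for a contravariant 4-vector v."""
    v_lower = METRIC @ np.asarray(v, dtype=float)
    return sum(c * g for c, g in zip(v_lower, GAMMA))

def energy_projectors(v):
    """Return (P_plus, P_minus) = ((1 + v-slash)/2, (1 - v-slash)/2)."""
    v_slash = slash(v)
    return (np.eye(4) + v_slash) / 2, (np.eye(4) - v_slash) / 2

# Rest frame, and a frame boosted along z with an arbitrary rapidity.
P_plus_rest, P_minus_rest = energy_projectors([1.0, 0.0, 0.0, 0.0])
eta = 0.7
P_plus, P_minus = energy_projectors([np.cosh(eta), 0.0, 0.0, np.sinh(eta)])
```

In the rest frame `P_plus_rest` is the diagonal matrix with entries (1, 1, 0, 0), which is the statement that the light field keeps the upper components of the heavy-quark spinor.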

To leading order in $1/m_Q$ ($Q = b, c$), the Lagrangian


$$\mathcal{L} = \bar{b}_v\,iv\cdot D\,b_v + \bar{c}_v\,iv\cdot D\,c_v$$ [16]


exhibits two important approximate symmetries of
HQET: the flavor symmetry $SU(2)_F$ relating heavy
quarks moving with the same velocity and the
heavy-quark spin symmetry generating an overall
$SU(4)$ spin-flavor symmetry. The flavor symmetry is
obvious and the spin symmetry is due to the absence
of Dirac matrices in [16]: both spin degrees of
freedom couple to gluons in the same way. The
simplest spin-symmetry doublet consists of a pseudoscalar
meson $H$ and the associated vector meson $H^*$.
Denoting the doublet by $\mathcal{H}$, the matrix elements of the
heavy-to-heavy transition current are determined to
leading order in $1/m_Q$ by a single form factor, up to
Clebsch-Gordan coefficients:


$$\langle\mathcal{H}(v')|\bar{h}_{v'}\Gamma h_v|\mathcal{H}(v)\rangle \sim \xi(v\cdot v')$$ [17]


$\Gamma$ is an arbitrary combination of Dirac matrices and
the form factor $\xi$ is the so-called Isgur-Wise
function. Moreover, since $\bar{h}_v\gamma^\mu h_v$ is the Noether
current of heavy-flavor symmetry, the Isgur-Wise
function is fixed in the no-recoil limit $v' = v$ to be
$\xi(v\cdot v' = 1) = 1$. The semileptonic decays $B \to D l\nu_l$
and $B \to D^* l\nu_l$ are therefore governed by a single
normalized form factor to leading order in $1/m_Q$,
with important consequences for the determination
of the Cabibbo-Kobayashi-Maskawa (CKM) matrix
element $V_{cb}$.

The HQET Lagrangian is superficially frame 
dependent. Since the SM is Lorentz invariant, the 
HQET Lagrangian must be independent of the 
choice of the frame vector v. Therefore, a shift in v 
accompanied by corresponding shifts of the fields $h_v$
and of the covariant derivatives must leave the
Lagrangian invariant. This reparametrization invariance
is unaffected by renormalization and it relates
coefficients with different powers in $1/m_Q$.


Soft-Collinear Effective Theory 


HQET is not applicable in heavy-quark decays
where some of the light particles in the final state
have momenta of $O(m_Q)$, for example, for inclusive
decays like $B \to X_s\gamma$ or exclusive ones like $B \to \pi\pi$.
In recent years, a systematic heavy-quark expansion 
for heavy-to-light decays has been set up in the form 
of soft-collinear effective theory (SCET). 

SCET is more complicated than HQET because 
now the low-energy theory involves more than one 
scale. In the SCET Lagrangian, a light quark or 
gluon field is represented by several effective fields. 
In addition to the soft fields $h_v$ in [15], the so-called
collinear fields enter that have large energy and 
carry large momentum in the direction of the light 
hadrons in the final state. 

In addition to the frame vector $v$ of HQET
($v = (1,0,0,0)$ in the heavy-hadron rest frame),
SCET introduces a lightlike reference vector $n$ in
the direction of the jet of energetic light particles
(for inclusive decays), for example, $n = (1,0,0,1)$.
All momenta $p$ are decomposed in terms of light-cone
coordinates $(p_+, p_-, p_\perp)$ with

$$p^\mu = \frac{n \cdot p}{2}\,\bar n^\mu + \frac{\bar n \cdot p}{2}\,n^\mu + p_\perp^\mu = p_+^\mu + p_-^\mu + p_\perp^\mu$$  [18]

where $\bar n = 2v - n = (1,0,0,-1)$. For large energies,
the three light-cone components are widely
separated, with $p_- = O(m_Q)$ being large while $p_\perp$
and $p_+$ are small. Introducing a small parameter
$\lambda \sim p_\perp/p_-$, the light-cone components of (hard-)collinear
particles scale like $(p_+, p_-, p_\perp) = m_Q(\lambda^2, 1, \lambda)$.
Thus, there are three different scales in the problem
compared to only two in HQET. For exclusive
decays, the situation is even more involved.
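
The light-cone decomposition above can be checked numerically. The following is a minimal sketch, assuming the metric signature (+, -, -, -) and a hypothetical massless collinear momentum in units of $m_Q$ with $\lambda = 0.1$; the function names are illustrative only.

```python
import numpy as np

# Metric signature (+, -, -, -); an assumption of this sketch.
METRIC = np.diag([1.0, -1.0, -1.0, -1.0])

def dot(a, b):
    """Minkowski product a.b of two four-vectors."""
    return float(a @ METRIC @ b)

def light_cone_decompose(p, n, nbar):
    """Split p into p_plus + p_minus + p_perp along the lightlike vectors n, nbar."""
    p_plus = 0.5 * dot(n, p) * nbar
    p_minus = 0.5 * dot(nbar, p) * n
    p_perp = p - p_plus - p_minus
    return p_plus, p_minus, p_perp

v = np.array([1.0, 0.0, 0.0, 0.0])     # heavy-hadron rest frame
n = np.array([1.0, 0.0, 0.0, 1.0])     # lightlike jet direction
nbar = 2.0 * v - n                      # = (1, 0, 0, -1)

lam = 0.1                               # hypothetical expansion parameter
# Massless collinear momentum with n.p = lam^2, nbar.p = 1, |p_perp| = lam
p = np.array([0.5 * (1 + lam**2), lam, 0.0, 0.5 * (1 - lam**2)])

p_plus, p_minus, p_perp = light_cone_decompose(p, n, nbar)
print("n.p    =", dot(n, p))
print("nbar.p =", dot(nbar, p))
print("p_perp =", p_perp)
```

In this example the three components reproduce the scaling $(\lambda^2, 1, \lambda)$ quoted in the text, and the transverse part is orthogonal to both $n$ and $\bar n$.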

The SCET Lagrangian is obtained from the full
theory by an expansion in powers of $\lambda$. In addition
to the heavy quark field $h_v$, one introduces soft as
well as collinear quark and gluon fields by field
redefinitions so that the various fields have momentum
components that scale appropriately with $\lambda$.

Similar to HQET, the leading-order Lagrangian of 
SCET exhibits again approximate symmetries that 
can lead to a reduction of form factors describing 
heavy-to-light decays. As in HQET, reparametriza- 
tion invariance implements Lorentz invariance and 
results in stringent constraints on subleading correc- 
tions in SCET. 

An important result of SCET is the proof of
factorization theorems to all orders in $\alpha_s$. For
inclusive decays, the differential rate is of the form

$$d\Gamma \sim H\, J \otimes S$$  [19]

where $H$ contains the hard corrections. The
so-called jet function $J$, sensitive to the collinear
region, is convoluted with the shape function $S$
representing the soft contributions. At leading order,
the shape function drops out in the ratio of weighted
decay spectra for $B \to X_u l \nu_l$ and $B \to X_s \gamma$, allowing
for a determination of the CKM matrix element $V_{ub}$.
Factorization theorems have become available for an
increasing number of processes, most recently also
for exclusive decays of B into two light mesons.


Nonrelativistic QCD 


In HQET the kinetic energy of the heavy quark
appears as a small correction of $O(\Lambda_{QCD}/m_Q)$. For
systems with more than one heavy quark, the kinetic
energy cannot be treated as a perturbation in
general. For instance, the virial theorem implies
that the kinetic energy in quarkonia $Q\bar Q$ is of the
same order as the binding energy of the bound state.

NRQCD, the EFT for heavy quarkonia, is an 
extension of HQET. The Lagrangian for NRQCD 
coincides with HQET in the bilinear sector of the 
heavy-quark fields but it also includes quartic 
interactions between quarks and antiquarks. The 
relevant expansion parameter in this case is the
relative velocity between $Q$ and $\bar Q$. In contrast to
HQET, there are at least three widely separate scales
in heavy quarkonia: in addition to $m_Q$, the relative
momentum of the bound quarks $p \sim m_Q v$ with $v \ll 1$
and the typical kinetic energy $E \sim m_Q v^2$. The main
challenges are to derive the quark-antiquark potential
directly from QCD and to describe quarkonium
production and decay at collider experiments. In the
abelian case, the corresponding EFT for quantum
electrodynamics (QED) is called NRQED, which has
been used to study electromagnetically bound systems 
like the hydrogen atom, positronium, muonium, etc. 
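
To make the separation of scales concrete, the sketch below tabulates $m_Q$, $m_Q v$ and $m_Q v^2$. The quark masses and the values of $v^2$ used here are rough illustrative inputs assumed for this example; they are not taken from the text.

```python
def nrqcd_scales(m_q, v):
    """Return the hard, soft and ultrasoft scales (m_Q, m_Q v, m_Q v^2)."""
    if not 0.0 < v < 1.0:
        raise ValueError("expected a nonrelativistic velocity 0 < v < 1")
    return m_q, m_q * v, m_q * v * v

# Illustrative inputs only (assumptions of this sketch):
# heavy-quark mass in GeV and a typical relative velocity v.
EXAMPLES = {
    "charmonium": (1.5, 0.3 ** 0.5),    # v^2 of about 0.3
    "bottomonium": (4.8, 0.1 ** 0.5),   # v^2 of about 0.1
}

for name, (m_q, v) in EXAMPLES.items():
    hard, soft, ultrasoft = nrqcd_scales(m_q, v)
    print(f"{name}: m_Q = {hard:.2f}, m_Q v = {soft:.2f}, m_Q v^2 = {ultrasoft:.2f} GeV")
```

With these inputs the hierarchy is more pronounced for the heavier system, which is one way to see why charmonium is the harder case for a purely perturbative treatment.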

In NRQCD only the hard degrees of freedom with
momenta $\sim m_Q$ are integrated out. Therefore,
NRQCD is not enough for a systematic computation
of heavy-quarkonium properties. Because the nonrelativistic
fluctuations of order $m_Q v$ and $m_Q v^2$ have
not been separated, the power counting in NRQCD
is ambiguous in higher orders.

To overcome those deficiencies, two approaches
have been put forward: potential NRQCD
(pNRQCD) and velocity NRQCD (vNRQCD). In
pNRQCD, a two-step procedure is employed for
integrating out quark and gluon degrees of freedom:

QCD        $\Lambda > m_Q$
  ⇓
NRQCD      $m_Q > \Lambda > m_Q v$
  ⇓
pNRQCD     $m_Q v > \Lambda > m_Q v^2$

The resulting EFT derives its name from the fact 
that the four-quark interactions generated in the 
matching procedure are the potentials that can be 
used in Schrödinger perturbation theory. It is
claimed that pNRQCD can also be used in the
nonperturbative domain where $\alpha_s(m_Q v^2)$ is of order 1
or larger. The advantage would be that charmonium
also becomes accessible to a systematic EFT analysis.

The alternative approach of vNRQCD is only 
applicable in the fully perturbative regime when
$m_Q \gg m_Q v \gg m_Q v^2 \gg \Lambda_{QCD}$ is valid. It separates
the different degrees of freedom in a single step
leaving only ultrasoft energies and momenta of
$O(m_Q v^2)$ as continuous variables. The separation
of larger scales proceeds in a similar fashion as in 
HQET via field redefinitions. A systematic nonrela- 
tivistic power counting in the velocity v is 
implemented. 


The Standard Model at Low Energies 


At energies below 1 GeV, hadrons — rather than quarks 
and gluons — are the relevant degrees of freedom. 
Although the strong interactions are highly nonpertur- 
bative in the confinement region, Green functions and 
amplitudes are amenable to a systematic low-energy 
expansion. The key observation is that the QCD
Lagrangian with $N_f = 2$ or 3 light quarks,

$$\mathcal{L}_{QCD} = \bar q\,(i \not{D} - M_q)\,q - \frac{1}{4} G^a_{\mu\nu} G^{a\,\mu\nu} + \mathcal{L}_{\text{heavy quarks}}$$
$$= \bar q_L i \not{D} q_L + \bar q_R i \not{D} q_R - \bar q_L M_q q_R - \bar q_R M_q q_L - \frac{1}{4} G^a_{\mu\nu} G^{a\,\mu\nu} + \mathcal{L}_{\text{heavy quarks}}$$
$$q_{R,L} = \frac{1}{2}(1 \pm \gamma_5)\,q, \qquad q^T = (u\ d\ [s])$$  [20]

exhibits a global symmetry

$$\underbrace{SU(N_f)_L \times SU(N_f)_R}_{\text{chiral group } G} \times\, U(1)_V \times U(1)_A$$  [21]

in the limit of $N_f$ massless quarks ($M_q = 0$). At the
hadronic level, the quark number symmetry $U(1)_V$ is
realized as baryon number. The axial $U(1)_A$ is not a
symmetry at the quantum level due to the abelian
anomaly.

Although not yet derived from first principles, 
there are compelling theoretical and phenomenolo- 
gical arguments that the ground state of QCD is 
not even approximately chirally symmetric. All 
evidence, such as the existence of relatively light 
pseudoscalar mesons, points to spontaneous chiral
symmetry breaking $G \to SU(N_f)_V$, where $SU(N_f)_V$ is
the diagonal subgroup of $G$. The resulting $N_f^2 - 1$
(pseudo-)Goldstone bosons interact weakly at low
energies. In fact, Goldstone's theorem ensures that
purely mesonic or single-baryon amplitudes vanish
in the chiral limit ($M_q = 0$) when the momenta of all
pseudoscalar mesons tend to zero. This is the basis
for a systematic low-energy expansion of Green 
functions and amplitudes. The corresponding EFT 
(type 3B in our classification) is called chiral 
perturbation theory (CHPT) (Weinberg 1979, 
Gasser and Leutwyler 1984, 1985). 

Although the construction of effective Lagran- 
gians with nonlinearly realized chiral symmetry is 
well understood, there are some subtleties involved. 
First of all, there may be terms in a chiral-invariant 
action that cannot be written as the four- 
dimensional integral of an invariant Lagrangian. 
The chiral anomaly for $SU(3) \times SU(3)$ bears witness
to this fact and gives rise to the Wess-Zumino-Witten
action. A general theorem to account for
such exceptional cases is due to D'Hoker and
Weinberg (1994). Consider the most general action
for Goldstone fields with symmetry group $G$,
spontaneously broken to a subgroup $H$. The only
possible non-$G$-invariant terms in the Lagrangian
that give rise to a $G$-invariant action are in one-to-one
correspondence with the generators of the fifth
cohomology group $H^5(G/H; R)$ of the coset manifold
$G/H$. For the relevant case of chiral $SU(N)$, the
coset space $SU(N)_L \times SU(N)_R / SU(N)_V$ is itself an
$SU(N)$ manifold. For $N \geq 3$, $H^5(SU(N); R)$ has a
single generator that corresponds precisely to the
Wess-Zumino-Witten term.

At a still deeper level, one may ask whether chiral- 
invariant Lagrangians are sufficient (except for the 
anomaly) to describe the low-energy structure of 
Green functions as dictated by the chiral Ward 
identities of QCD. To be able to calculate such 
Green functions in general, the global chiral sym- 
metry of QCD is extended to a local symmetry by 
the introduction of external gauge fields. The 
following invariance theorem (Leutwyler 1994) 
provides an answer to the above question. Except
for the anomaly, the most general solution of the
Ward identities for a spontaneously broken symme- 
try in Lorentz-invariant theories can be obtained 
from gauge-invariant Lagrangians to all orders in 
the low-energy expansion. The restriction to Lorentz 
invariance is crucial: the theorem does not hold in 
general in nonrelativistic effective theories. 


Chiral Perturbation Theory 


The effective chiral Lagrangian of the SM in the
meson sector is displayed in Table 1. The lowest-order
Lagrangian for the purely strong interactions
is given by

$$\mathcal{L}_{p^2} = \frac{F^2}{4} \langle D_\mu U D^\mu U^\dagger \rangle + \frac{F^2 B}{2} \langle (s + ip)\,U^\dagger + (s - ip)\,U \rangle$$  [22]

with a covariant derivative $D_\mu U = \partial_\mu U - i(v_\mu + a_\mu)U + iU(v_\mu - a_\mu)$.
The first term has the familiar form [7]
of the gauged nonlinear $\sigma$ model, with the matrix
field $U(\phi)$ transforming as $U \to g_R U g_L^\dagger$ under chiral
rotations. External fields $v_\mu, a_\mu, s, p$ are introduced
for constructing the generating functional of Green
functions of quark currents. To implement explicit
chiral symmetry breaking, the scalar field $s$ is set
equal to the quark mass matrix $M_q$ at the end of the
calculation.
The leading-order Lagrangian has two free parameters
$F$, $B$ related to the pion decay constant and to
the quark condensate, respectively:

$$F_\pi = F\,[1 + O(m_q)], \qquad \langle 0 | \bar u u | 0 \rangle = -F^2 B\,[1 + O(m_q)]$$  [23]

The Lagrangian [22] gives rise to $M_\pi^2 = B(m_u + m_d)$
at lowest order. From detailed studies of pion-pion
scattering (Colangelo et al. 2001), we know that the
leading term accounts for at least 94% of the pion
mass. This supports the standard counting of CHPT,
with quark masses booked as $O(p^2)$ like the two-derivative
term in [22].


Table 1 The effective chiral Lagrangian of the SM in the
meson sector

$\mathcal{L}_{\text{chiral order}}$ (# of LECs)                                          Loop order

$\mathcal{L}_{p^2}(2) + \mathcal{L}^{\Delta S=1}_{G_F p^2}(2) + \mathcal{L}^{\rm em}_{e^2 p^0}(1) + \mathcal{L}^{\rm emweak}_{G_8 e^2 p^0}(1)$                L = 0

$+\ \mathcal{L}_{p^4}(10) + \mathcal{L}^{\rm odd}_{p^6}(32) + \mathcal{L}^{\Delta S=1}_{G_8 p^4}(22) + \mathcal{L}^{\Delta S=1}_{G_{27} p^4}(28)$
$+\ \mathcal{L}^{\rm em}_{e^2 p^2}(14) + \mathcal{L}^{\rm emweak}_{G_8 e^2 p^2}(14) + \mathcal{L}^{\rm leptons}_{e^2 p^2}(5)$                              L = 1

$+\ \mathcal{L}_{p^6}(90)$                                                           L = 2

The numbers in brackets refer to the number of independent
couplings for $N_f = 3$. The parameter-free Wess-Zumino-Witten
action $S_{WZW}$ that cannot be written as the four-dimensional
integral of an invariant Lagrangian must be added.
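
The lowest-order relation $M_\pi^2 = B(m_u + m_d)$ can be verified in a few lines from [22]. The sketch below assumes two flavors, the parametrization $U = \exp(i\phi/F)$ with $\phi = \phi^a \tau^a$ (Pauli matrices $\tau^a$), and external fields set to $s = M_q = \mathrm{diag}(m_u, m_d)$, $p = v_\mu = a_\mu = 0$. These choices are assumptions made for illustration; the text does not fix a parametrization.

```latex
% Uses <tau^a tau^b> = 2 delta^{ab} and phi^2 = (phi^a phi^a) * 1 for SU(2).
\begin{align*}
\frac{F^2}{4}\,\langle \partial_\mu U\, \partial^\mu U^\dagger \rangle
  &= \frac{1}{4}\,\langle \partial_\mu \phi\, \partial^\mu \phi \rangle + O(\phi^4)
   = \frac{1}{2}\,\partial_\mu \phi^a\, \partial^\mu \phi^a + O(\phi^4), \\
\frac{F^2 B}{2}\,\langle M_q\,(U^\dagger + U) \rangle
  &= F^2 B\,\langle M_q \rangle - \frac{B}{2}\,\langle M_q\, \phi^2 \rangle + O(\phi^4) \\
  &= F^2 B\,(m_u + m_d) - \frac{1}{2}\,B\,(m_u + m_d)\,\phi^a \phi^a + O(\phi^4).
\end{align*}
```

Comparing with the free-field form $\frac{1}{2}\partial_\mu\phi^a\partial^\mu\phi^a - \frac{1}{2}M_\pi^2\phi^a\phi^a$ gives $M_\pi^2 = B(m_u + m_d)$, so the squared meson mass is linear in the quark masses, in line with the counting of quark masses as $O(p^2)$.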

The effective chiral Lagrangian in Table 1
contains the following parts:

1. strong interactions: $\mathcal{L}_{p^2}$, $\mathcal{L}_{p^4}$, $\mathcal{L}^{\rm odd}_{p^6}$, $\mathcal{L}_{p^6}$ + $S_{WZW}$;
2. nonleptonic weak interactions to first order in
   the Fermi coupling constant $G_F$: $\mathcal{L}^{\Delta S=1}_{G_F p^2}$,
   $\mathcal{L}^{\Delta S=1}_{G_8 p^4}$, $\mathcal{L}^{\Delta S=1}_{G_{27} p^4}$;
3. radiative corrections for strong processes:
   $\mathcal{L}^{\rm em}_{e^2 p^0}$, $\mathcal{L}^{\rm em}_{e^2 p^2}$;
4. radiative corrections for nonleptonic weak
   decays: $\mathcal{L}^{\rm emweak}_{G_8 e^2 p^0}$, $\mathcal{L}^{\rm emweak}_{G_8 e^2 p^2}$; and
5. radiative corrections for semileptonic weak
   decays: $\mathcal{L}^{\rm leptons}_{e^2 p^2}$.

Beyond the leading order, unitarity and analyticity
require the inclusion of loop contributions. In the
purely strong sector, calculations have been performed
up to next-to-next-to-leading order. Figure 1
shows the corresponding skeleton diagrams of $O(p^6)$,
with full lowest-order tree structures to be attached
to propagators and vertices. The coupling constants
of the various Lagrangians in Table 1 absorb the
divergences from loop diagrams, leading to finite
renormalized Green functions with scale-dependent
couplings, the so-called low-energy constants (LECs).
As in all EFTs, the LECs parametrize the effect of
“heavy” degrees of freedom that are not represented
explicitly in the EFT Lagrangian. Determination of
those LECs is a major task for CHPT. In addition to
phenomenological information, further theoretical
input is needed. Lattice gauge theory has already
furnished values for some LECs. To bridge the gap
between the low-energy domain of CHPT and the
perturbative domain of QCD, large-$N_c$ motivated
interpolations with meson resonance exchange have
been used successfully to pin down some of the LECs.

Especially in cases where the knowledge of LECs is
limited, renormalization group methods provide
valuable information. As in renormalizable quantum
field theories, the leading chiral logs $(\ln M^2/\mu^2)^L$
with a typical meson mass $M$, renormalization scale $\mu$
and loop order $L$ can in principle be determined from
one-loop diagrams only. In contrast to the renormalizable
situation, new derivative structures (and quark
mass insertions) occur at each loop order, preventing a
straightforward resummation of chiral logs.


Figure 1 Skeleton diagrams of $O(p^6)$. Normal vertices are
from $\mathcal{L}_{p^2}$; crossed circles and the full square denote vertices
from $\mathcal{L}_{p^4}$ and $\mathcal{L}_{p^6}$, respectively.

Among the many applications of CHPT in the 
meson sector are the determination of quark mass 
ratios and the analysis of pion-pion scattering where 
the chiral amplitude of next-to-next-to-leading order 
has been combined with dispersion theory (Roy 
equations). Of increasing importance for precision
physics (CKM matrix elements, $(g-2)_\mu$, ...) are
isospin-violating corrections including radiative corrections,
where CHPT provides the only reliable
approach in the low-energy region. Such corrections
are also essential for the analysis of hadronic atoms
like pionium, a $\pi^+\pi^-$ bound state.

CHPT has also been applied extensively in the 
single-baryon sector. There are several differences to 
the purely mesonic case. For instance, the chiral 
expansion proceeds more slowly and the nucleon
mass $m_N$ provides a new scale that does not vanish in
the chiral limit. The formulation of heavy-baryon
CHPT was modeled after HQET, integrating out the
nucleon modes of $O(m_N)$. To improve the convergence
of the chiral expansion in some regions of phase
space, a manifestly Lorentz-invariant formulation has 
been set up more recently (relativistic baryon CHPT). 
Many single-baryon processes have been calculated to 
fourth order in both approaches, for example, pion- 
nucleon scattering. With similar methods as in the 
mesonic sector, hadronic atoms like pionic or kaonic 
hydrogen have been investigated. 


Nuclear Physics 


In contrast to the meson and single-baryon sectors, 
amplitudes with two or more nucleons do not vanish 
in the chiral limit when the momenta of Goldstone 
mesons tend to zero. Consequently, the power 
counting is different in the many-nucleon sector. 
Multinucleon processes are treated with different 
EFTs depending on whether all momenta are smaller 
or larger than the pion mass. 

In the very low energy regime $|p| \ll M_\pi$, pions or
other mesons do not appear as dynamical degrees of
freedom. The resulting EFT is called “pionless EFT”
and it describes systems like the deuteron, where the
typical nucleon momenta are $\sim \sqrt{m_N B_d} \simeq 45$ MeV
($B_d$ is the binding energy of the deuteron). The
Lagrangian for the strong interactions between two
nucleons has the form

$$\mathcal{L}_{NN} = C_0\,(N^T P_i N)^\dagger\, N^T P_i N + \cdots$$  [24]

where $P_i$ are spin-isospin projectors and higher-order
terms contain derivatives of the nucleon fields.
The existence of bound states implies that at least
part of the EFT Lagrangian must be treated
nonperturbatively. Pionless EFT is an extension of
effective-range theory that has long been used in
nuclear physics. It has been applied successfully
especially to the deuteron but also to more complicated
few-nucleon systems like the $Nd$ and $n\alpha$
systems. For instance, precise results for $Nd$ scattering
have been obtained with parameters fully
determined from $NN$ scattering. Pionless EFT has
also been applied to the so-called halo nuclei, where
a tight cluster of nucleons (like $^4$He) is surrounded
by one or more “halo” nucleons.
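
The estimate of the typical nucleon momentum in the deuteron, $\sqrt{m_N B_d} \simeq 45$ MeV, is easy to reproduce. The short sketch below uses commonly quoted values for the nucleon mass, deuteron binding energy and pion mass; these numerical inputs are assumptions of the example, not values stated in the text.

```python
import math

M_N_MEV = 938.92    # average nucleon mass in MeV (assumed input)
B_D_MEV = 2.2246    # deuteron binding energy in MeV (assumed input)
M_PI_MEV = 139.57   # charged pion mass in MeV (assumed input)

def binding_momentum(mass_mev, binding_mev):
    """Typical bound-state momentum sqrt(m * B) in MeV."""
    return math.sqrt(mass_mev * binding_mev)

gamma = binding_momentum(M_N_MEV, B_D_MEV)
print(f"typical momentum: {gamma:.1f} MeV")
print(f"ratio to pion mass: {gamma / M_PI_MEV:.2f}")
```

The ratio to the pion mass comes out at roughly one third, which illustrates why the deuteron is treated within the pionless theory.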

In the regime $|p| > M_\pi$, the pion must be included as
a dynamical degree of freedom. With some modifications
in the power counting, the corresponding EFT is
based on the approach of Weinberg (1990, 1991), who
applied the usual rules of the meson and single-nucleon
sectors to the nucleon-nucleon potential (instead of
the scattering amplitude). The potential is then to be
inserted into a Schrödinger equation to calculate
physical observables. The systematic power counting 
leads to a natural hierarchy of nuclear forces, with 
only two-nucleon forces appearing up to next-to- 
leading order. Three- and four-nucleon forces arise at 
third and fourth order, respectively. 

Significant progress has been achieved in the 
phenomenology of few-nucleon systems. The two-
and $n$-nucleon ($3 \leq n \leq 6$) sectors have been pushed to
fourth and third order, respectively, with encouraging
signs of “convergence.” Compton scattering off the
deuteron, $\pi d$ scattering, nuclear parity violation, solar
fusion, and other processes have been investigated in 
the EFT approach. The quark mass dependence of the 
nucleon-nucleon interaction has also been studied. 


See also: Anomalies; Electroweak Theory; High $T_c$
Superconductor Theory; Noncommutative Geometry
and the Standard Model; Operator Product Expansion
in Quantum Field Theory; Perturbation Theory and its
Techniques; Quantum Chromodynamics; Quantum
Electrodynamics and its Precision Tests;
Renormalization: General Theory; Seiberg-Witten
Theory; Standard Model of Particle Physics; Symmetries
and Conservation Laws; Symmetry Breaking in Field
Theory.

Introduction


This article is an introduction to eigenfunctions of 
quantum completely integrable (QCI) systems. For 
these systems, one can understand asymptotics of 
eigenfunctions better than for other systems, so it is 
natural to study them. It is useful to begin the
discussion with the most important geometric example
given by the quantum Hamiltonian, $P_1 = \sqrt{\Delta}$.
We fix a basis of eigenfunctions, $\varphi_j$, $j = 1, 2, \ldots$, with

$$\sqrt{\Delta}\,\varphi_j = \lambda_j \varphi_j, \qquad \langle \varphi_i, \varphi_j \rangle = \delta_{ij}$$

and assume that there exist functionally independent
(pseudo)differential operators $P_2, \ldots, P_n$ with the
property that

$$[P_i, P_j] = 0, \qquad i, j = 1, \ldots, n$$

In this case, $P_1$ is said to be QCI and the operators
$P_k$, $k = 1, \ldots, n$, can be simultaneously diagonalized. It
is therefore natural to study the special basis of Laplace
eigenfunctions which are joint eigenvectors of the $P_k$'s.
From now on, the $\varphi_j$'s are always assumed to be
joint eigenfunctions of the commuting operators
$P_k$, $k = 1, \ldots, n$. The classical observables corresponding
to the operators $P_k$, $k = 1, \ldots, n$, are the respective
principal symbols, $p_k \in C^\infty(T^*M)$, $k = 1, \ldots, n$. In
particular, the bicharacteristic flow of $p_1(x,\xi) = |\xi|_g$
is the classical “geodesic flow”

$$G_t : T^*M \to T^*M$$


Examples of manifolds with QCI Laplacians include 
tori and spheres of revolution, Liouville metrics on tori 
and spheres, large families of metrics on homogeneous 
spaces, as well as hyperellipsoids with distinct axes in 
arbitrary dimension. There are also many inhomoge- 
neous QCI examples (see the next section). It is of 
interest to understand the asymptotics of both eigen- 
values and eigenfunctions. There is a large literature 
devoted to eigenvalue asymptotics, including trace 
formulas and Bohr-Sommerfeld rules (see Colin de 
Verdiere (1994a, b), Helffer and Sjoestrand (1990), 
and Colin de Verdiere and Vu Ngoc (2003)). We will 
concentrate here on the corresponding problem of 
determining eigenfunction asymptotics. The key prop- 
erty of eigenfunctions in the QCI case is localization in 
phase space, T*M. This allows one to study more 
effectively the concentration and blow-up properties
than in any other setting. It is important to contrast
this with, for example, the situation in the ergodic
case. Moreover, in the QCI case, there is a particularly
strong connection between dynamics of the geodesic
flow, $G_t : T^*M \to T^*M$, and the asymptotics of individual
eigenfunctions. In the general case, one can
usually only relate the dynamics to spectral averages, 
such as in the trace formula (Duistermaat and 
Guillemin 1975). 

For the most part, the literature on eigenfunction 
asymptotics addresses the following basic problems: 


1. determining sharp upper and lower bounds for $\varphi_j$
as $\lambda_j \to \infty$ and

2. describing the link between the blow-up properties
of $\varphi_j$ as $\lambda_j \to \infty$ and the dynamics of the
geodesic flow, $G_t$.


The starting point in the study of eigenfunction
asymptotics in the QCI case is the fact that the joint
eigenfunctions, $\varphi_j$, have masses that localize on the
level sets, $\mathcal{P}^{-1}(b) := \{(x,\xi) \in T^*M : p_j(x,\xi) = b_j,\ j = 1, \ldots, n\}$.
Moreover, by the Liouville-Arnol'd theorem,
for generic levels (indexed by $b \in R^n$),

$$\mathcal{P}^{-1}(b) = \bigcup_{k=1}^{m} \Lambda_k(b)$$  [1]

where the $\Lambda_k(b) \subset T^*M$ are Lagrangian tori. The
affine symplectic coordinates in a neighborhood of
$\Lambda_k(b)$ are called “action-angle variables”
$(\theta_1^{(k)}, \ldots, \theta_n^{(k)}; I_1^{(k)}, \ldots, I_n^{(k)}) \in T^n \times R^n$. Written in terms of
these coordinates, the classical Hamilton equations
defining the geodesic flow assume the form

$$\frac{d\theta_j^{(k)}}{dt} = \omega_j^{(k)}(I), \qquad \frac{dI_j^{(k)}}{dt} = 0$$

with frequencies $\omega_j^{(k)}$ depending only on the actions,
and this system of ordinary differential equations
(ODEs) is solved by quadrature. This explains why
one refers to such systems as completely integrable.
At the quantum level, one can construct semiclassical
Lagrangian distributions,

$$\Phi_\lambda^{(k)}(x) = \int_{R^n} e^{i\lambda \phi^{(k)}(x,\eta)}\, a^{(k)}(x,\eta;\lambda)\, d\eta$$  [2]

which microlocally concentrate on $\Lambda_k(b)$ as $\lambda \to \infty$
and satisfy $P_j \Phi_\lambda^{(k)} = b_j \lambda\, \Phi_\lambda^{(k)} + O(\lambda^{-\infty})$ in $L^2(M)$. An
important fact is that the actual joint eigenfunctions,
$\varphi_j$, are approximated to $O(\lambda^{-\infty})$-accuracy in $L^2(M)$ by
suitable linear combinations of the quasimodes, $\Phi_\lambda^{(k)}$.
However, there are subtleties underlying this correspondence
which are often neglected in the physics literature:


3. The actual joint eigenfunctions $\varphi_j$ localize on the
level sets $\mathcal{P}^{-1}(b)$ which usually consist of many
connected components. Consequently, the eigenfunctions
are approximated by (sometimes large)
linear combinations of Lagrangian quasimodes 
attached to the different component tori. The 
precise splitting of mass amongst these different 
components is a difficult and, in general, 
unsolved problem in microlocal tunneling. 

4. The local torus foliation given by action-angle 
variables tends to degenerate and Lagrangian 
quasimodes are no longer approximate solutions 
to the (joint) eigenvalue equations near the 
singularities of the foliation. The singularities and 
their relative configurations can be complicated 
(Colin de Verdiere and Vu Ngoc 2003) and most 
of the interesting asymptotic blow-up properties 
of eigenfunctions tend to be associated with these 
degeneracies. The main tool for studying joint 
eigenfunctions near degeneracies is the quantum 
analog of the Eliasson normal form (Eliasson 
1984, Vu Ngoc 2000). We will refer to this as the 
“quantum Birkhoff normal form" (QBNF). 
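
As a concrete instance of action-angle variables and of their degeneration, consider the one-dimensional harmonic oscillator. This example is an illustration chosen here, not one taken from the text; it assumes $H(x,\xi) = \tfrac{1}{2}(\xi^2 + x^2)$ on $T^*R$ with symplectic form $d\xi \wedge dx$.

```latex
\begin{align*}
x &= \sqrt{2I}\,\sin\theta, \qquad \xi = \sqrt{2I}\,\cos\theta,
   \qquad (\theta, I) \in \mathbb{T}^1 \times \mathbb{R}_{>0}, \\
d\xi \wedge dx &= dI \wedge d\theta, \qquad H = I, \\
\frac{d\theta}{dt} &= \frac{\partial H}{\partial I} = 1, \qquad
\frac{dI}{dt} = -\frac{\partial H}{\partial \theta} = 0 .
\end{align*}
```

Each level set $H = E > 0$ is the circle $I = E$, a one-dimensional Lagrangian torus, and the flow is the rotation $\theta(t) = \theta_0 + t$. The coordinates break down at the equilibrium $I = 0$, which is the simplest example of the degeneration of the torus foliation described in the second item above.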


Background on QCI Systems 


Let $(M^n, g)$ be a compact, closed Riemannian
manifold and $P_1 := Op_\hbar(p_1)$ be a formally self-adjoint,
elliptic (in the classical sense) $\hbar$-pseudodifferential
operator. In local coordinates, the Schwartz
kernel of $P_1$ is of the form

$$P_1(x, y; \hbar) = (2\pi\hbar)^{-n} \int_{R^n} e^{i\langle x - y, \xi\rangle/\hbar}\, p_1(x, \xi; \hbar)\, d\xi$$

where $p_1(x,\xi;\hbar) \in S^{m,k}_{cl}(T^*M)$; that is,
$p_1(x,\xi;\hbar) \sim \hbar^{-m} \sum_{j=0}^{\infty} p_{1,j}(x,\xi)\,\hbar^j$ with
$\partial_x^\alpha \partial_\xi^\beta p_{1,j}(x,\xi) = O_{\alpha,\beta}(\langle\xi\rangle^{k-j-|\beta|})$
(Dimassi and Sjoestrand 1999). It is often convenient
to work with $\hbar$-pseudodifferential operators
rather than their classical counterparts. In the
homogeneous case, one chooses $\hbar^{-1} \in \mathrm{Spec}\,\sqrt{\Delta}$.

$P_1 \in Op_\hbar(S^{m,k}_{cl})$ is said to be QCI if there exist
self-adjoint $P_j = Op_\hbar(p_j) \in Op_\hbar(S^{m_j,k_j}_{cl})$, $j = 2, \ldots, n$, for
some $m_j, k_j$ with $[P_i, P_j] = 0$, $i, j = 1, \ldots, n$, such that
$dp_1 \wedge \cdots \wedge dp_n \neq 0$ on a dense open subset, $\Omega_{reg} \subset T^*M$,
and $P_1^2 + \cdots + P_n^2$ is elliptic in the classical
sense. There are many inhomogeneous QCI examples
including quantum Euler, Lagrange, and Kowalevsky
tops together with quantum Neumann and Rosochatius
oscillators in arbitrary dimension.

Since $\{p_i, p_j\} = 0$, the joint Hamilton flow of the
$p_j$'s induces a symplectic $R^n$-action on $T^*M$:

$$\Phi_t : T^*M \to T^*M, \qquad \Phi_t(x,\xi) = \exp t_1 H_{p_1} \circ \cdots \circ \exp t_n H_{p_n}(x,\xi), \qquad t = (t_1, \ldots, t_n) \in R^n$$

The associated moment map is just

$$\mathcal{P} : T^*M - 0 \to R^n, \qquad \mathcal{P} = (p_1, \ldots, p_n)$$

We denote the image $\mathcal{P}(T^*M - 0)$ by $B$, and the regular
values (resp. singular values) of the moment map
by $B_{reg}$ (resp. $B_{sing}$).
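
The Poisson-commutation condition on the symbols can be checked numerically in a simple case. The sketch below considers a surface of revolution with a hypothetical profile function $a(r) = 2 + \cos r$, for which $p_1^2 = \xi_r^2 + \xi_\theta^2/a(r)^2$ and the Clairaut integral $p_2 = \xi_\theta$ commute. The profile, the sample point, and the sign convention for the bracket are assumptions of this example, and $p_1^2$ is used in place of $p_1$, which does not affect whether the bracket vanishes.

```python
import numpy as np

def profile(r):
    """Hypothetical profile function a(r) > 0 of a surface of revolution."""
    return 2.0 + np.cos(r)

def p1_squared(z):
    """Square of the metric norm |xi|_g at z = (r, theta, xi_r, xi_theta)."""
    r, _theta, xi_r, xi_theta = z
    return xi_r**2 + xi_theta**2 / profile(r)**2

def clairaut(z):
    """Angular momentum xi_theta, the second integral of motion."""
    return z[3]

def gradient(f, z, eps=1e-6):
    """Central-difference gradient of f at the phase-space point z."""
    grad = np.zeros(4)
    for i in range(4):
        dz = np.zeros(4)
        dz[i] = eps
        grad[i] = (f(z + dz) - f(z - dz)) / (2.0 * eps)
    return grad

def poisson_bracket(f, g, z):
    """{f, g} = sum_i (df/dxi_i dg/dx_i - df/dx_i dg/dxi_i), z = (x, xi)."""
    df, dg = gradient(f, z), gradient(g, z)
    return float(df[2:] @ dg[:2] - df[:2] @ dg[2:])

z0 = np.array([1.0, 0.3, 0.7, 1.2])   # arbitrary sample point
print("{p1^2, xi_theta} =", poisson_bracket(p1_squared, clairaut, z0))
print("{p1^2, xi_r}     =", poisson_bracket(p1_squared, lambda z: z[2], z0))
```

The first bracket vanishes because the metric does not depend on the angle, while the second does not, so $\xi_r$ is not a conserved quantity of the geodesic flow.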

To establish bounds for the joint eigenfunctions of
$P_1, \ldots, P_n$, one imposes a “finite-complexity” assumption
(Toth and Zelditch 2002) on the classical integrable
system. This condition holds for all systems of interest in
physics. To describe it, for each $b = (b^{(1)}, \ldots, b^{(n)}) \in B$,
let $m_{cl}(b)$ denote the number of $R^n$-orbits of the joint
flow $\Phi_t$ on the level set $\mathcal{P}^{-1}(b)$. Then, the finite-complexity
condition says that for some $M_0 > 0$,

$$m_{cl}(b) < M_0 \qquad (\forall b \in B)$$

In addition, when $\mathcal{P}$ is proper,

$$\mathcal{P}^{-1}(b) = \bigcup_{k=1}^{m_{cl}(b)} \Lambda_k(b)$$  [3]

for any $b \in B_{reg}$, where the $\Lambda_k(b)$ are Lagrangian tori.
The starting point for analyzing joint eigenfunctions
is the following correspondence principle (Zelditch
1990) which makes the eigenfunction localization
alluded to in the introduction more precise:


Theorem 1 Let $Op_\hbar(a) \in Op_\hbar(S_{cl})(T^*M)$ and $P_j$, $j = 1, \ldots, n$,
be a QCI system of commuting operators.
Then, for every $b \in B_{reg}$, there exists a subsequence of
joint eigenfunctions $\varphi_\mu(x) := \varphi(x; \mu(\hbar))$ with $\hbar \in (0, \hbar_0]$
and joint eigenvalues $\mu(\hbar) = (\mu_1(\hbar), \ldots, \mu_n(\hbar)) \in \mathrm{Spec}(P_1, \ldots, P_n)$
with $|\mu(\hbar) - b| = O(\hbar)$ such that

$$\langle Op_\hbar(a)\,\varphi_\mu, \varphi_\mu \rangle = c(\hbar; b) \int_{\Lambda(b)} a(x,\xi)\, d\mu_b + O(\hbar)$$

Here, $d\mu_b$ denotes Lebesgue measure on the torus, $\Lambda(b)$.


The proof of Theorem 1 follows from the $\hbar$-microlocal,
regular quantum normal form construction near $\Lambda(b)$ (see
the section “Birkhoff normal forms”).


Blow-Up of Eigenfunctions: 
Qualitative Results 


Before discussing quantitative bounds for joint 
eigenfunctions, it is useful to prove qualitative results. 
Here, we review only the homogeneous case where
$P_1 = \hbar\sqrt{\Delta}$, although the general case can be dealt
with similarly (Toth and Zelditch 2002). Two well-known
QCI examples which exhibit extremes in
eigenfunction concentration are the round sphere and
the flat torus. In the case of the sphere, the zonal
harmonics blow up like $\lambda^{1/2}$ at the poles, whereas, in
the case of the flat torus, all the joint eigenfunctions
are uniformly bounded. The rest of the article will be
essentially devoted to understanding these extreme
blow-up properties (and intermediate ones) more
systematically. When discussing blow-up of eigenfunctions,
it is natural to start with the following:


Question Do there exist QCI manifolds (other
than the flat torus) for which all eigenfunctions are
uniformly bounded in $L^\infty$?
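
The two extremes mentioned above can be seen numerically. The following sketch, which assumes the unit 2-sphere and the flat torus $(R/2\pi Z)^2$ with $L^2$-normalized eigenfunctions, compares the sup norm of the zonal harmonic $Y_l^0 = \sqrt{(2l+1)/4\pi}\,P_l(\cos\theta)$ with $\lambda^{1/2}$, where $\lambda = \sqrt{l(l+1)}$; the function names are illustrative only.

```python
import numpy as np
from numpy.polynomial import legendre

def zonal_sup_norm(l, n_grid=4001):
    """Sup norm of the L2-normalized zonal harmonic Y_l^0 on the unit 2-sphere."""
    x = np.linspace(-1.0, 1.0, n_grid)       # x = cos(theta), poles at x = +-1
    coeffs = np.zeros(l + 1)
    coeffs[l] = 1.0                           # select the Legendre polynomial P_l
    p_l = legendre.legval(x, coeffs)
    return np.sqrt((2 * l + 1) / (4 * np.pi)) * np.max(np.abs(p_l))

def torus_sup_norm():
    """Sup norm of an L2-normalized exponential exp(i k.x) on (R/2 pi Z)^2."""
    return 1.0 / (2.0 * np.pi)                # independent of the frequency k

for l in (10, 50, 200):
    lam = np.sqrt(l * (l + 1))                # eigenvalue of sqrt(Laplacian)
    ratio = zonal_sup_norm(l) / np.sqrt(lam)
    print(f"l = {l:3d}: sup|Y_l^0| / lambda^(1/2) = {ratio:.4f}")
print("torus sup norm:", torus_sup_norm())
```

The ratio settles near $1/\sqrt{2\pi}$, consistent with growth like $\lambda^{1/2}$ on the sphere, while the torus value does not depend on the eigenvalue at all.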


Toth and Zelditch (2002) have proved that, up to 
coverings, the flat torus is the only example with 
uniformly bounded eigenfunctions. Their argument 
used the correspondence principle in Theorem 1 
combined with some deep results from symplectic 
geometry. To deal with the issue of multiplicities, it
is convenient to define

$$L^\infty(\lambda; g) = \sup_{\varphi \in V_\lambda} \|\varphi\|_{L^\infty}$$

where $V_\lambda = \{\varphi : P_1 \varphi = \lambda \varphi\}$ and it is assumed that
$\|\varphi\|_{L^2} = 1$.


Theorem 2 (Toth and Zelditch 2002). Suppose
that $P_1 = \sqrt{\Delta}$ is QCI on a compact, Riemannian
manifold $(M, g)$ and suppose that the corresponding
moment map satisfies the finite-complexity condition.
Then, if $L^\infty(\lambda, g) = O(1)$, $(M, g)$ is flat.


The proof of Theorem 2 follows by contradiction: 
that is, one assumes that all eigenfunctions are 
uniformly bounded. There are two main steps in the 
proof of Theorem 2: the first is entirely analytic and 
uses the correspondence principle in Theorem 1 and 
uniform boundedness to determine the topology of 
M. The second step uses two deep results from 
symplectic topology/geometry to determine the 
metric, g, up to coverings. 

Using a local Weyl law argument and the finite-multiplicity
assumption, it can be shown that for
each $b \in B_{reg}$, there exists a subsequence, $\varphi_\mu$, of joint
eigenfunctions such that Theorem 1 holds with

$$c(\hbar; b) \geq \frac{1}{C}$$

where $C > 0$ is a uniform constant not depending
on $b \in B_{reg}$. With this subsequence, one applies
Theorem 1 with $a(x,\xi) = V(x) \in C^\infty(M)$. It then easily
follows by the boundedness assumption that for $\hbar$
sufficiently small and appropriate constants $C_0, C_1 > 0$,

$$\left| \int_{\Lambda(b)} \pi_{\Lambda(b)}^*(V)\, d\mu_b \right| \leq C_0 \int_M |V(x)|\,|\varphi_\mu(x)|^2\, d\mathrm{Vol}(x) \leq C_1 \int_M |V(x)|\, d\mathrm{Vol}(x)$$  [4]

where $\pi_{\Lambda(b)}$ denotes the restriction of the canonical
projection $\pi : T^*M \to M$ to the Lagrangian $\Lambda(b)$. The
estimate in [4] is equivalent to the statement

$$(\pi_{\Lambda(b)})_*(d\mu_b) \ll d\mathrm{Vol}(x)$$

where given two Borel measures $d\mu$ and $d\nu$, one
writes $d\mu \ll d\nu$ if $d\mu$ is absolutely continuous with
respect to $d\nu$. Consequently, $\pi_{\Lambda(b)} : \Lambda(b) \to M$ has no
singularities and thus, up to coverings, $M$ is
topologically a torus (since $\Lambda(b)$ is).

Since there are many QCI systems on $n$-tori, it still
remains to determine how the uniform-boundedness
condition constrains the metric geometry of $(M, g)$.
First, by a classical result of Mañé, if $T^*M$ possesses
a $C^1$-foliation by Lagrangians, $(M, g)$ cannot have
conjugate points. By the first step in the proof, it
follows that under the uniform-boundedness
assumption, $M$ is a topological torus and $T^*M$
possesses a smooth foliation by Lagrangian tori.
Consequently, $(M, g)$ has no conjugate points.
Finally, the Burago-Ivanov proof of the Hopf
conjecture says that metric tori without conjugate
points are flat. Therefore, $(M, g)$ is flat.

Consistent with Theorem 2, one can show (Toth 
and Zelditch 2003, Lerman and Shirokova 2002) that 
if (M, g) is integrable and not a flat torus, then there 
must exist a compact $\Phi_t$-orbit (i.e., an orbit of the joint
flow of $X_{p_j}$, $j = 1, \ldots, n$) with dim $= k < n$. In the QCI
case, these “singular” orbits trap eigenfunction mass 
for appropriate subsequences. To understand this 
statement in detail, it is necessary to review QBNF 
constructions in the context of QCI systems. 


Birkhoff Normal Forms 


There are several excellent expositions on the topic 
of Birkhoff normal forms in the literature (see, e.g., 
Guillemin (1996), Iatchenko et al. (2002), and 
Zelditch (1998)), which discuss both the classical 
and quantum constructions. Here, we discuss the 
aspects which are most relevant for QCI systems. 

Consider the Schrödinger operator, $P(x; \hbar D_x) = -\hbar^2 (d^2/dx^2) + V(x)$
with $V(x + 2\pi) = V(x)$ acting
on $C^\infty(R/2\pi Z)$. Assume that the potential, $V(x)$, is
Morse and that $x = 0$ is a potential minimum with
$V(0) = V'(0) = 0$ and $\Omega \subset T^*(S^1)$ an open neighborhood
containing $(0,0)$. In its simplest incarnation,
the classical Birkhoff normal-form theorem says that
for small enough $\Omega$, there exists a symplectic
diffeomorphism, $\kappa^{-1} : (\Omega; (0,0)) \to (\Omega; (0,0))$,
$\kappa^{-1} : (x,\xi) \mapsto (y,\eta)$, and a (locally defined) function
$F_0 \in C^\infty(R)$ such that

$$(p \circ \kappa)(y,\eta) = F_0(\eta^2 + y^2)$$  [5]

provided $(y,\eta) \in \Omega$. At the quantum level, the
analogous QBNF expansion says that there exist
microlocally unitary $\hbar$-Fourier integral operators,
$U(\hbar) : C^\infty(\Omega) \to C^\infty(\Omega)$ and a classical symbol
$F(x; \hbar) \sim \sum_{j=0}^{\infty} F_j(x)\,\hbar^j$, such that

$$U(\hbar)^* \circ P(\hbar) \circ U(\hbar) =_\Omega F(I_e; \hbar)$$  [6]

with $I_e = \hbar^2 D_y^2 + y^2$. Given two $\hbar$-pseudodifferential
operators $P$ and $Q$, the notation $P =_\Omega Q$ means that
$\|\chi(P - Q)\|_{L^2 \to L^2} = O(\hbar^\infty)$ and
$\|(P - Q)\chi\|_{L^2 \to L^2} = O(\hbar^\infty)$, for any $\chi \in C_0^\infty(\Omega)$.
Since it can be easily shown that eigenfunctions
$\varphi_\mu$ with $\mu(\hbar) = O(\hbar^\delta)$, $0 < \delta < 1$, localize very
sharply near $x = 0$, from the microlocal unitary
equivalence in [6], the eigenfunction and eigenvalue
asymptotics (including trace formulas) can all be
determined by working with the model operator on
the right-hand side (RHS) of [6]. Moreover, on the
model side, the eigenfunctions and eigenvalues are
explicitly known.
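
For the harmonic oscillator action operator $I_e = \hbar^2 D_y^2 + y^2$, the explicitly known eigenvalues are $\hbar(2n+1)$, $n = 0, 1, 2, \ldots$. The sketch below checks this with a second-order finite-difference discretization; the grid size, the truncation of the real line to $[-8, 8]$ and the Dirichlet boundary condition are assumptions of the example.

```python
import numpy as np

def oscillator_levels(hbar, n_levels=4, half_width=8.0, n_grid=800):
    """Lowest eigenvalues of -hbar^2 d^2/dy^2 + y^2 on a truncated grid."""
    y = np.linspace(-half_width, half_width, n_grid)
    dy = y[1] - y[0]
    # Three-point stencil for the second derivative (Dirichlet at the ends)
    main = 2.0 * hbar**2 / dy**2 + y**2
    off = -hbar**2 / dy**2 * np.ones(n_grid - 1)
    matrix = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    return np.linalg.eigvalsh(matrix)[:n_levels]

for hbar in (1.0, 0.5):
    numeric = oscillator_levels(hbar)
    exact = hbar * (2 * np.arange(4) + 1)
    print(f"hbar = {hbar}: numeric = {np.round(numeric, 4)}, exact = {exact}")
```

The numerical levels agree with $\hbar(2n+1)$ up to the discretization error, and the spacing $2\hbar$ shrinks with $\hbar$ as expected for the model operator.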

At a potential maximum, there exist classical and
quantum normal forms analogous to [5] and [6] (see
Helffer and Sjoestrand (1990) and Colin de Verdiere
and Parisse (1994a)) except that the harmonic
oscillator action operator, $I_e$, is replaced by the
hyperbolic action operator,

$$I_h = \hbar\left(y D_y + \frac{1}{2i}\right)$$  [7]


The 1D Schrödinger operator is the simplest example of a QCI system where $(0,0) \in T^*S^1$ is a nondegenerate critical point of the classical Hamiltonian, $H(x,\xi) = \xi^2 + V(x)$. Under a mild nondegeneracy hypothesis (Vu Ngoc 2000), there is an analogous normal form for arbitrary QCI systems which is valid near nondegenerate rank $k<n$ orbits of the joint flow, $\Phi_t$. At the classical level, this result is due to Eliasson (1984) and the quantum analog is due to Vu Ngoc (2000). To state the result in general, one has to define the appropriate model operators: these are the elliptic and hyperbolic action operators above together with the loxodromic model operators $\hbar D_\theta$ and $\hbar\rho D_\rho + \hbar/i$, where $(\rho,\theta)$ denote polar coordinates in $\mathbb{R}^2$. The local model phase space for a rank $k < n$ orbit, $O_k$, is just $T^*(\mathbb{T}^k) \times T^*(\mathbb{R}^{n-k})$. In this case, the QBNF says that, for a sufficiently small neighborhood, $\Omega_k$ of $O_k$, there exists a family of $\hbar$-Fourier integral operators, $U_k: C^\infty(\Omega_k) \to C^\infty(T^*(\mathbb{T}^k) \times T^*(\mathbb{R}^{n-k}))$ and symbols $f_j(\hbar)$, $j = 1, \ldots, n$, with asymptotic expansions in powers of $\hbar$, such that


$$U_k^* P_j U_k =_{\Omega_k} M_j \cdot (O_1 - f_1(\hbar), \ldots, O_n - f_n(\hbar)), \qquad U_k^* \circ U_k =_{\Omega_k} \mathrm{Id} \qquad [8]$$


Here, $M_j$ is a microlocally invertible matrix of $\hbar$-pseudodifferential operators commuting with the $O_j$'s, and the $O_j$'s are to be chosen from the list of model operators (elliptic, hyperbolic, loxodromic, and regular), where $(\hbar D_{\theta_1}, \ldots, \hbar D_{\theta_k})$ denotes the regular model operator acting along the $k$-dimensional orbit, $O_k$. Moreover, if $(y_1, \ldots, y_{n-k}, \eta_1, \ldots, \eta_{n-k}) \in T^*(\mathbb{R}^{n-k})$ denote the symplectic model coordinates, then the $O_j$'s act in separate, complementary $(y_1, \ldots, y_{n-k})$-variables. The main point here is that [8] is actually a convergent normal form in $\hbar$ in the sense that error terms in [8] are $O(\hbar^\infty)$. In contrast (Guillemin 1996, Iatchenko et al. 2002, Zelditch 1998), the general Birkhoff normal form is only formal in the sense that error terms vanish to successively higher orders along the orbit, $O_k$, but are not necessarily small in terms of the spectral parameter, $\hbar$.

Using [8], it can be shown that the joint eigenfunctions, $\varphi_\mu$, are microlocally determined in terms of the $\hbar$-Fourier integral operators, $U_k$, and certain model eigenfunctions. More precisely,


Uo, (8, y; h) an g, c(h) em. [up Ue s Ue|(Y; b) [9] 


where m € Z--1/4,c(b) € C(h). The generalized 
eigenfunctions of the model operators, 1}, leh, le, acting 
transversely to the orbit ©, are uj(y;u, b) — 
c yy, 十 < 全 ^ a (ps Otk, 5) = 
pt giko and u(y; n, T H, (b ^y), where H,(y) is 
the P. Hermite function. 


Eigenfunction Lower Bounds: 
Quantitative Results 


Let $O_k$ be a singular rank $k<n$ orbit as in the previous section. From the qualitative results of the first section, it follows that there must exist joint eigenfunctions, $\varphi_\mu$, of the commuting operators, $P_j$, $j = 1, \ldots, n$, which blow up along the orbit, $O_k$. To obtain quantitative results, one could try to determine the $L^p$ mapping properties of the $\hbar$-Fourier integral operator, $U_k$. However, since the canonical transformation $\kappa$ to normal form can be complicated, this method is quite cumbersome. To avoid this complication (Toth and Zelditch 2003), it suffices to compute $L^2$-masses only, but on scales of order $\hbar^\delta$ where $0 < \delta < 1/2$. Let $\pi(\Omega_k(\hbar^\delta))$ be the configuration space projection of the $\hbar^\delta$-radius tube $\Omega_k(\hbar^\delta) \supset O_k$. Since

$$\|\varphi_\mu\|_{L^\infty}^2 \cdot \mathrm{Vol}(\pi(\Omega_k(\hbar^\delta))) \geq \int_{\pi(\Omega_k(\hbar^\delta))} |\varphi_\mu|^2 \, d\mathrm{Vol} \qquad [10]$$

one is reduced to estimating $\int_{\pi(\Omega_k(\hbar^\delta))} |\varphi_\mu|^2 \, d\mathrm{Vol}$ from below. To bound this integral from below, it suffices to


1. reduce the estimate to one involving only the 
model eigenfunctions in the Birkhoff normal 
form and 

2. estimate the normalizing $\hbar$-dependent constant $c(\hbar)$ in [9].
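To prove (1), one introduces a cutoff function $\chi(x,\xi;\hbar^\delta) \in C_0^\infty(\Omega_k(\hbar^\delta))$ which is identically equal to one near $O_k$. Then, since $\pi^{-1}(\pi(\Omega_k(\hbar^\delta))) \supset \Omega_k(\hbar^\delta)$, from the Garding inequality, it follows that

$$\int_{\pi(\Omega_k(\hbar^\delta))} |\varphi_\mu|^2 \, d\mathrm{Vol} \geq \langle Op_\hbar(\chi(x,\xi;\hbar^\delta))\varphi_\mu, \varphi_\mu\rangle \qquad [11]$$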


In light of the QBNF result in [8], the computation of the matrix element on the RHS of [11] is reduced to a corresponding computation for the $L^2$-normalized model eigenfunctions. Since the $U_k$'s are microlocally unitary, it follows that

$$\langle Op_\hbar(\chi(x,\xi;\hbar^\delta))\varphi_\mu, \varphi_\mu\rangle \sim_{\hbar \to 0^+} C(\delta) \cdot |c(\hbar)|^2 \qquad [12]$$


Here, the constant $C(\delta) > 0$ depends only on the scale of the cutoff function. It finally remains to deal with (2). Bounding the size of $|c(\hbar)|$ from below amounts to estimating the $L^2$-mass of the joint eigenfunction $\varphi_\mu$ which must be trapped near the orbit, $O_k$. Using a local (singular) Weyl law argument, it is shown in Toth and Zelditch (2003) that

$$|c(\hbar)|^2 \geq |\log\hbar|^{-\beta} \qquad [13]$$

where $\beta > 0$ indexes the number of hyperbolic and loxodromic model operators. The final result quantifies blow-up along a compact orbit:


Theorem 3 (Toth and Zelditch 2003). Let $O_k$ be a rank $k<n$ orbit of the joint flow $\Phi_t$. If this orbit is compact and nondegenerate, then there exists a subsequence of $L^2$-normalized joint eigenfunctions $\varphi_{\mu_m}$, $m = 1, 2, \ldots$, of the QCI system $P_j$, $j = 1, \ldots, n$, such that for any $\epsilon > 0$,

$$\|\varphi_{\mu_m}\|_{L^\infty} \gg_\epsilon \lambda_m^{(n-k)/4 - \epsilon}$$

By using the semiclassical scale $\hbar^{1/2}|\log\hbar|^{1/2}$, one can (slightly) improve the lower bound in Theorem 3 to $\|\varphi_\mu\|_{L^\infty} \gg \lambda^{(n-k)/4}|\log\lambda|^{-\alpha}$ for some $\alpha > 0$ (see Sogge et al. (2005)).

When $(M,g)$ is not flat, there must exist a singular, compact orbit of dimension $k$ with $1 \leq k \leq n - 1$ and so, as an immediate corollary of Theorem 3, it follows that for some $\alpha > 0$,

$$L^\infty(\lambda; g) \gg \lambda^{1/4}|\log\lambda|^{-\alpha} \qquad [14]$$


Since the bound in [14] is highly dependent on 
dimension, establishing the existence of high- 
codimension singular orbits would strengthen the 
estimate substantially. However, this appears to be a 
difficult and open problem. 


Maximal Blow-Up of Modes 
and Quasimodes 


We review here a number of converses to a recent 
result of Sogge and Zelditch (2002) on Riemannian 
manifolds (M,g) with maximal eigenfunction 
growth. These authors proved that if there exists a sequence of $L^2$-normalized eigenfunctions of the Laplacian $\Delta$ of $(M,g)$ whose $L^\infty$-norms are comparable to zonal spherical harmonics on $S^n$, then there must exist a point comparable to the north pole of $S^n$, that is, a recurrent point $z$ such that a positive measure of geodesics emanating from $z$ return to it at a fixed time $T$. The most extreme kind of
recurrent point is a “blow-down point" of period 
T, where by definition all geodesics leaving z return 
to z at time T, that is, form geodesic loops. Poles of 
surfaces of revolution are blow-down points where 
all geodesic loops at z are smoothly closed, while 
umbilic points of triaxial ellipsoids are examples of 
blow-down points where all but two geodesic loops 
are not smoothly closed. On real-analytic manifolds, 
all recurrent points are blow-down points. The 
converse question is the following: what kind of 
mode (eigenfunction) or quasimode growth must 
occur when a blow-down point exists? 
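The benchmark case of zonal spherical harmonics can be made concrete on $S^2$. The sketch below assumes the standard normalization $c_l P_l(\cos\theta)$ with $c_l = \sqrt{(2l+1)/4\pi}$ and frequency $\lambda_l = \sqrt{l(l+1)}$ (neither is spelled out in the text); it checks the $L^2$ normalization numerically and shows that the sup norm, attained at the pole, grows like $\lambda_l^{1/2}$. All function names are illustrative.

```python
import math

def legendre(l, x):
    """Legendre polynomial P_l(x) via the three-term recurrence."""
    p_prev, p = 1.0, x
    if l == 0:
        return p_prev
    for k in range(1, l):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def zonal_sup_norm(l):
    """Sup norm of c_l * P_l(cos theta); |P_l| <= 1 with equality at the pole."""
    return math.sqrt((2 * l + 1) / (4 * math.pi))

def zonal_l2_norm_squared(l, n_steps=20000):
    """Midpoint-rule value of the squared L^2 norm on S^2 of c_l * P_l(cos theta)."""
    c = zonal_sup_norm(l)
    h = 2.0 / n_steps
    total = 0.0
    for i in range(n_steps):
        x = -1.0 + (i + 0.5) * h
        total += (c * legendre(l, x)) ** 2 * h
    return 2 * math.pi * total

def growth_ratio(l):
    """Sup norm divided by lambda^{1/2}, with lambda = sqrt(l (l + 1))."""
    lam = math.sqrt(l * (l + 1))
    return zonal_sup_norm(l) / math.sqrt(lam)

print(zonal_l2_norm_squared(10))  # close to 1
print(growth_ratio(400))          # close to 1 / sqrt(2 * pi)
```

The ratio stabilizes near $1/\sqrt{2\pi}$, which is the $n = 2$ instance of growth of order $\lambda^{(n-1)/2}$.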

Sogge et al. (2005) proved that maximal quasimode growth (Colin de Verdiere 1977) implies the existence of a blow-down point. This generalizes the main result of Sogge and Zelditch (2002) from modes (which one rarely understands) to quasimodes (which one often understands better). Conversely, existence of a blow-down point ensures near-maximal quasimode growth, that is, here, maximal up to logarithmic factors. If one assumes that the geodesic flow $G^t: T^*M \to T^*M$ of $(M,g)$ is completely integrable and that $\dim M = 2$, then the results of Sogge et al. (2005) show that actual eigenfunctions have near-maximal blow-up. Examples show that, in general, blow-down points do not necessarily cause modes to have near-maximal blow-up.

An important geometric invariant of a blow-down point is the first-return map to the cotangent fiber over the blow-down point:

$$G_z^T: S_z^*M \to S_z^*M \qquad [15]$$

$G_z^T$ is also an important analytic invariant: the blow-up rate of modes or quasimodes, specifically the occurrence of the logarithmic factors, depends on the fixed-point structure of this map. When all geodesic loops at $z$ are smoothly closed, that is, when the first-return map is the identity, then there exist quasimodes of maximal growth. When the first-return map has fixed points, the maximal growth is modified by logarithmic factors.
To put these results in context, we first recall the local Weyl law of Avakumovich-Levitan (Duistermaat and Guillemin 1975), which states that

$$\sum_{\lambda_\nu \leq \lambda} |\varphi_\nu(x)|^2 = (2\pi)^{-n} \int_{p(x,\xi) \leq \lambda} d\xi + R(\lambda, x) \qquad [16]$$

with uniform remainder bounds

$$|R(\lambda, x)| \leq C\lambda^{n-1}, \quad x \in M$$

It follows that

$$L^\infty(\lambda, g) = O(\lambda^{(n-1)/2}) \qquad [17]$$

on any compact Riemannian manifold. Riemannian manifolds for which the equality

$$L^\infty(\lambda, g) = \Omega(\lambda^{(n-1)/2}) \qquad [18]$$


is achieved for some subsequence of eigenfunctions are said to be of maximal eigenfunction growth. In addition to modes, and almost inseparable from them, are the quasimodes of the Laplacian (Colin de Verdiere 1977). As the name suggests, quasimodes are approximate eigenfunctions. The crudest type of quasimode is a quasimode $\{\psi_k\}$ of order 0, namely a sequence of $L^2$-normalized functions which solve

$$\|(\Delta - \mu_k)\psi_k\|_{L^2} = O(1)$$

for a sequence of quasieigenvalues $\mu_k$. By the spectral theorem, it follows that there must exist true eigenvalues in the interval $[\mu_k - \delta, \mu_k + \delta]$ for some $\delta > 0$. $(M,g)$ is said to have maximal 0-order quasimode growth if there exists a sequence of quasimodes of order 0 for which $\|\psi_k\|_{L^\infty} = \Omega(\mu_k^{(n-1)/2})$. There are analogous definitions for more refined quasimodes, for example, quasimodes of higher order or (most refined) quasimodes defined by oscillatory integrals. It is natural to include quasimodes in this study because they often reflect the geometry and dynamics of the geodesic flow more strongly than actual modes. For quasimodes, there is the following result:


Theorem 4 (Sogge et al. 2005). Let $(M^n, g)$ be a compact Riemannian manifold with Laplacian $\Delta$. Then:

(i) If there exists a quasimode sequence $\{(\psi_k, \mu_k)\}$ of order 0 with $\|\psi_k\|_{L^\infty} = \Omega(\mu_k^{(n-1)/2})$, then there exists a recurrent point $z \in M$ for the geodesic flow. If $(M,g)$ is real analytic, then there exists a blow-down point.

(ii) Conversely, if there exists a blow-down point and if the map $G_z^T = \mathrm{id}$, then there exists a quasimode sequence $\{(\psi_k, \mu_k)\}$ of order 0 with $\|\psi_k\|_{L^\infty} = \Omega(\mu_k^{(n-1)/2})$.

(iii) Let $n = 2$ and $(M^n, g)$ be real analytic. Then, if $G_z^T$ has a finite number of nondegenerate fixed points, there exists a quasimode sequence $\{(\psi_k, \mu_k)\}$ of order 0 with $\|\psi_k\|_{L^\infty} = \Omega(\mu_k^{1/2}|\log\mu_k|^{-1/2})$.


The assumption that $G_z^T = \mathrm{id}$ is the same as saying that all geodesics leaving $z$ smoothly close up at $z$ again. As mentioned above, poles of surfaces of revolution have this property. On the contrary, the umbilic points of triaxial ellipsoids in $\mathbb{R}^3$ are blow-down points for which $G_z^T \neq \mathrm{id}$. That is, every geodesic leaving an umbilic point returns at the same time, but only two geodesics in this family are smoothly closed, and they give rise to fixed points of $G_z^T$. One can show (see Toth 1996) that there exists a sequence of eigenfunctions in this case for which $L^\infty(g, \lambda) \sim \lambda^{1/2}|\log\lambda|^{-1/2}$. Hence, the above result is sharp. Moreover, it is clear from the proof that the fixed points are responsible for the logarithmic correction to maximal eigenfunction growth: they cause a change in the normal form of the Laplacian near the blow-down point.

Theorem 4 illustrates the intimate connection 
between maximal blow-up of quasimodes and 
existence of blow-down points. It is natural to ask, 
however, when blow-down points cause blow-up in 
modes, that is, actual eigenfunctions. As mentioned 
above, this is not generally the case and some further 
mechanism is needed to ensure it. In the case of QCI 
surfaces, one can prove: 


Theorem 5 (Sogge et al. 2005). Let $(M,g)$ be a smooth, compact surface, $P_1 = \sqrt{\Delta}$, $P_2$ be an Eliasson nondegenerate QCI system on $M$ and $\varphi_\mu$ be an $L^2$-normalized joint eigenfunction of $P_1, P_2$ with $\sqrt{\Delta}\,\varphi_\mu = \lambda\varphi_\mu$. Suppose that there exists a blow-down point $z \in M$ for the geodesic flow $G_t = \exp tX_{p_1}$. Then, there exists a subsequence of (joint) Laplace eigenfunctions, $\varphi_{\mu_k}$, $k = 1, 2, \ldots$, such that for any $\epsilon > 0$,

$$\|\varphi_{\mu_k}\|_{L^\infty} \gg_\epsilon \lambda_k^{1/2 - \epsilon}$$


The role of complete integrability is to force joint eigenfunctions to localize on level sets of the moment map and thus to blow up at blow-down points. The proofs of Theorems 4 and 5 are similar. To prove the latter, by the same reasoning as in the orbit case (Theorem 3), one needs to bound from below the integral

$$\int_{B(z;\hbar^\delta)} |\varphi_\mu|^2 \, d\mathrm{Vol} \qquad [19]$$

for an appropriate subsequence of $\varphi_\mu$'s, where $B(z;\hbar^\delta)$ denotes a ball of radius $\hbar^\delta$ centered at the blow-down point, $z \in M$. The blow-down condition implies that $S_z^*M \subset \mathcal{P}^{-1}(b)$ for some $b \in \mathcal{B}$. The relevant subsequence of eigenfunctions, $\varphi_\mu$, are the ones with joint eigenvalues satisfying $|\mu(\hbar) - b| = O(\hbar)$. Since the eigenfunctions $\varphi_\mu$ are microlocally concentrated on the set $\mathcal{P}^{-1}(b)$, by Garding,

$$\int_{B(z;\hbar^\delta)} |\varphi_\mu|^2 \, d\mathrm{Vol} \geq \langle Op_\hbar(\chi(x,\xi;\hbar^\delta))\varphi_\mu, \varphi_\mu\rangle \qquad [20]$$


where $\chi(x,\xi;\hbar^\delta)$ is a cutoff localized on an $\hbar^\delta$ neighborhood of $\Omega = \pi^{-1}(z) \cap \mathcal{P}^{-1}(b)$. The matrix elements on the RHS of [20] are estimated by passing to QBNF. The subtlety here lies in the choice of scale, $\delta$. For $0 < \delta < 1/2$, the $\hbar$-pseudodifferential operators $Op_\hbar(\chi(x,\xi;\hbar^\delta))$ are contained in a standard calculus (Dimassi and Sjoestrand 1999) and so they automatically satisfy the $\hbar$-Egorov theorem. In particular, the passage to normal form by conjugating with the $U_k$'s is automatic. The crucial point here is that to obtain the (near)-maximal blow-up near a blow-down point $z \in M$, one needs to be able to choose $0 < \delta < 1$. Using second-microlocal methods similar to the ones in Sjoestrand and Zworski (1999), it is shown in Sogge et al. (2005) that the blow-down geometry implies that the microlocal cutoffs are contained in an $\hbar$-pseudodifferential operator calculus and, in particular, the relevant $\hbar$-Egorov theorem needed to pass to QBNF is satisfied for any $0 < \delta < 1$. Then, by explicit computation for the model eigenfunctions, one can show that

$$\langle Op_\hbar(\chi(x,\xi;\hbar^\delta))\varphi_\mu, \varphi_\mu\rangle \gg \hbar^\delta \qquad [21]$$

for any $\delta$ with $0 < \delta < 1$. The result in Theorem 5 then follows from the bound

$$\|\varphi_\mu\|_{L^\infty}^2 \cdot \mathrm{Vol}(B(z;\hbar^\delta)) \gg \hbar^\delta \qquad [22]$$


where one takes $\delta$ arbitrarily close to 1. By analyzing the $U_k$'s carefully (Sogge et al. 2005), the lower bound in Theorem 5 can be improved slightly by replacing the $\lambda^{-\epsilon}$ by $|\log\lambda|^{-\alpha}$ for some $\alpha > 0$, although the sharp constant, $\alpha > 0$, appears to be difficult to determine in general. In cases where the geometry of the first-return map, $G_z^T$, is particularly simple, one can sometimes get sharp $|\log\lambda|$-power improvements in Theorem 5 (see Theorem 4 (iii)).
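The exponent in Theorem 5 can be traced through the last step explicitly. The following is only a sketch of the arithmetic, under three assumptions that are not stated verbatim in the text: that $\dim M = 2$, so the ball $B(z;\hbar^\delta)$ has volume comparable to $\hbar^{2\delta}$; that the $L^2$-mass over this ball is bounded below by a quantity of size $\hbar^\delta$; and that $\hbar$ is identified with $\lambda^{-1}$.

```latex
\begin{aligned}
\|\varphi_\mu\|_{L^\infty}^2 \cdot \mathrm{Vol}(B(z;\hbar^\delta))
  &\geq \int_{B(z;\hbar^\delta)} |\varphi_\mu|^2 \, d\mathrm{Vol} \gg \hbar^{\delta}, \\
\mathrm{Vol}(B(z;\hbar^\delta)) \sim \hbar^{2\delta}
  &\;\Longrightarrow\; \|\varphi_\mu\|_{L^\infty}^2 \gg \hbar^{\delta - 2\delta} = \hbar^{-\delta}, \\
\hbar = \lambda^{-1}
  &\;\Longrightarrow\; \|\varphi_\mu\|_{L^\infty} \gg \lambda^{\delta/2}, \\
\delta = 1 - 2\epsilon
  &\;\Longrightarrow\; \|\varphi_\mu\|_{L^\infty} \gg \lambda^{1/2 - \epsilon}.
\end{aligned}
```

This makes visible why the range $0 < \delta < 1/2$ of the standard calculus is not enough: it would only give growth of order $\lambda^{1/4}$.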


Eigenfunction Upper Bounds: 
Quantitative Results 


In light of the $\Omega$-bounds in Theorem 5, it is natural to ask whether there are analogous upper bounds for $L^\infty(\lambda; g)$ in the QCI case. The following result holds in the case of real-analytic surfaces:


Theorem 6 (Sogge et al. 2005). Let $(M,g)$ be a real-analytic Riemannian 2-manifold and $P_1 = \sqrt{\Delta}$ and $P_2$ be a QCI system on $(M,g)$ where the principal symbol, $p_2$, of $P_2$ is a metric form on $T^*M$.

(i) If $M \cong T^2$,
L* (sg) = O(A) 


(ii) If $M \cong S^2$, let $M_{rec}$ be the set of completely recurrent points for the geodesic flow, $G_t: T^*M \to T^*M$ and let $\Omega_{rec} \subset M$ be an open neighborhood of $M_{rec}$. Then,


L"(A:g)u-o,, = O(A! $) 


An old result of Kozlov says that if the surface $(M,g)$ is analytic, then topologically either $M \cong S^2$ or $M \cong T^2$, so that the estimates in Theorem 6 cover all possible cases in two dimensions. The assumptions in Theorem 6 are satisfied in many examples including surfaces of revolution, Liouville surfaces, and ellipsoids with distinct axes in $\mathbb{R}^3$.

The proof of Theorem 6 follows from a pointwise 
(joint) trace formula argument (Duistermaat and 
Guillemin 1975). Namely, in Sogge et al. (2005), it 
is shown that if there are no blow-down points for 
G, then for appropriate p € S(R) with p > 0 and 
je CR(R), 


Y p(B [ui b) — b1]) - o(b7* [by — ba]) 


x leu; b) = op ^^) [23] 


where the estimate in [23] is uniform in $x \in M$ and locally uniform in $b = (b_1, b_2) \in \mathcal{B}$. Part (ii) follows from this. To prove part (i), one applies a simple homological argument to show that if $M \cong T^2$, there cannot exist blow-down points for the geodesic flow (see also Sogge and Zelditch (2002)).


Open Problems 


Most questions related to eigenfunction blow-up are 
completely open and general results are rare (Sogge 
and Zelditch 2002). Specific results/conjectures in 
the ergodic case can be found in Quantum Ergodi- 
city and Mixing of Eigenfunctions. We would like to 
point out here some specific questions related to the 
above results in the QCI case: 


1. All the known examples with blow-down points 
turn out to be integrable. Is this necessarily 
always the case? 

2. Does the maximal bound $L^\infty(\lambda; g) \sim \lambda^{(n-1)/2}$ necessarily imply that $(M,g)$ is QCI?


3. At the other extreme, does the minimal bound $L^\infty(\lambda; g) \sim 1$ necessarily imply that $(M,g)$ is flat, or do there exist nonflat manifolds (which are necessarily not QCI) satisfying $L^\infty(\lambda; g) \sim 1$?
Introduction


The goal of statistical mechanics is to calculate the macroscopic properties of matter from a knowledge of the fundamental interactions between the constituent microscopic components. For simplicity, let us assume discrete states. The mathematical problem, as formulated by Gibbs, is then to calculate the partition function

$$Z_N = \sum_{\text{states } \sigma} e^{-\beta H(\sigma)} \qquad [1]$$

where $\beta = 1/k_B T$ is the inverse temperature, $k_B$ is the Boltzmann constant, and the Hamiltonian $H$ describes the interaction energy of the state $\sigma$ of the $N$ constituent degrees of freedom. The formidable nature of the problem ensues from the fact that $Z_N$ is needed in the limit of an arbitrarily large system to obtain the bulk free energy $\psi(T)$ or partition function per site $\kappa$ in the thermodynamic limit

$$-\beta\psi(T) = \lim_{N \to \infty} \frac{1}{N}\log Z_N = \log\kappa \qquad [2]$$


This limit generally exists because the free energy of a finite system is extensive, that is, it grows proportionally with the system size. Once the bulk free energy is known, the other thermodynamic potentials are obtained, in principle, by taking derivatives with respect to the temperature $T$ and other thermodynamic fields such as the volume $V$ or the external magnetic field $h$. Phase transitions and the accompanying critical phenomena are associated with singularities of the bulk free energy as a function of the thermodynamic fields. Up until the beginning of the 1970s, there were only a handful of two-dimensional lattice models that had yielded exact solutions, most notably, the Ising model (free-fermion or dimer model), the spherical model, the square ice, and six-vertex models. This situation changed dramatically with Baxter's solution of the eight-vertex and hard-hexagon models. The methods developed by Baxter make it possible to solve an infinite plethora of two-dimensional lattice models. In this article, we compare and contrast the remarkable properties of these two prototypical models that played such a pivotal role in the emergence of the modern theory of Yang-Baxter integrability.


Definition of the Models 
Eight-Vertex Model 


The eight-vertex model emerged from the study of two-dimensional ferroelectrics. The local degrees of freedom are arrow states $\alpha, \beta, \gamma, \delta = \pm 1$ which live on the edges of the elementary faces of the square lattice and describe the local polarization within the ferroelectric material. Of the 16 possible configurations around a face, the local configurations of an elementary square face are restricted to the eight configurations shown in Figure 1. The partition function is

$$Z_N = \sum_{\text{arrow states}} \prod_{\text{faces}} W(\alpha, \beta, \gamma, \delta) \qquad [3]$$

where the Boltzmann face weights are given alternative graphical representations as a face or vertex

[Equation [4]: graphical representations of the weight $W(\alpha, \beta, \gamma, \delta)$ as a face and as a vertex]

Figure 1 The eight vertex configurations of the eight-vertex model showing one of the two corresponding configurations of the related Ising model. The model is solvable in the symmetric case, $w_1 = w_5$, $w_2 = w_6$, $w_3 = w_7$, $w_4 = w_8$, when the Boltzmann weights are equal in pairs under arrow reversal.


In the face representation, the arrow states are often called bond variables. Formally, the Hamiltonian is a sum over local energies $H = \sum_{\text{faces}} E(\alpha, \beta, \gamma, \delta)$, where $W(\alpha, \beta, \gamma, \delta) = \exp(-\beta E(\alpha, \beta, \gamma, \delta))$ but we use face weights since $E$ is infinite for excluded configurations. The general eight-vertex model includes many other ferroelectric models including the rectangular Ising model, Slater's model of potassium dihydrogen phosphate (KDP), the Rys F model of an antiferroelectric, the square ice model and the six-vertex model solved by Lieb. In the case of the six-vertex model, $w_4 = w_8 = 0$, so arrows are conserved with "two in" and "two out" at each vertex.

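The eight-vertex model can be formulated as an Ising model with spins $a, b, c, d = \pm 1$ at the corners of the elementary faces and Boltzmann face weights

$$W\begin{pmatrix} d & c \\ a & b \end{pmatrix} = R\exp(Kac + Lbd + Mabcd) \qquad [5]$$

which are drawn graphically as a face with the spins $a, b, c, d$ at its corners.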


The four independent vertex weights are related to 
R, K,L, M by 


wi = ws = ReK+L+M 
w = we = Re KEM é 
W3 三 Wy = Ret tM | | 
w4 = Wg = Re 50M 


This is not the usual rectangular Ising model since it involves four-spin interactions in addition to two-spin interactions. The spins and arrows are related by

$$\alpha = ab, \quad \beta = bc, \quad \gamma = cd, \quad \delta = da \qquad [7]$$

This mapping is one-to-two, since we can arbitrarily fix one spin somewhere on the lattice. It follows that $Z_{\text{Ising}} = 2Z_{\text{8-vertex}}$. The eight-vertex model obviously includes the six-vertex ($w_4 = w_8 = 0$) and the rectangular Ising models ($M = 0$). Although it is not at all obvious, the three-spin Ising model is also included as a special case ($K = M$, $L = 0$).
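The counting behind the arrow-spin correspondence can be checked on a single face. The sketch below assumes the relation $\alpha = ab$, $\beta = bc$, $\gamma = cd$, $\delta = da$ with all variables taking values $\pm 1$, and enumerates the 16 corner-spin assignments; the helper name `arrows_from_spins` is introduced here for illustration only.

```python
from collections import Counter
from itertools import product

def arrows_from_spins(a, b, c, d):
    """Arrow (bond) variables on the four edges of a face from its corner spins."""
    return (a * b, b * c, c * d, d * a)

# Tally how many corner-spin assignments produce each arrow configuration.
counts = Counter(arrows_from_spins(*spins) for spins in product((1, -1), repeat=4))

print(len(counts))           # number of distinct arrow configurations reached
print(set(counts.values()))  # multiplicity of each arrow configuration
```

Each reachable arrow configuration satisfies $\alpha\beta\gamma\delta = 1$, which is the local constraint that cuts the 16 arrow assignments down to 8, and each is produced by exactly two spin assignments related by a global spin flip.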

Notice that the eight-vertex face weights are invariant under spin reversal of the spins on either diagonal. This $Z_2 \times Z_2$ symmetry, which the eight-vertex model shares with the Ashkin-Teller model, is peculiar because it allows the model to exhibit continuously varying critical exponents. Because of symmetries and duality, it is sufficient to consider the regime $w_1 > w_2 + w_3 + w_4$ with $w_2, w_3, w_4 > 0$. In terms of spins, this corresponds to the ferromagnetically ordered phase; in terms of vertices or arrows, this corresponds to the ferroelectric phase. The eight-vertex model is critical on the four surfaces


$$w_1 = w_2 + w_3 + w_4, \quad w_2 = w_1 + w_3 + w_4, \quad w_3 = w_1 + w_2 + w_4, \quad w_4 = w_1 + w_2 + w_3 \qquad [8]$$

A convenient parameter to measure the departure from criticality $t = (T - T_c)/T_c$ is

$$t = -\frac{1}{16 w_1 w_2 w_3 w_4}\left[(w_1 - w_2 - w_3 - w_4)(w_1 - w_2 + w_3 + w_4)(w_1 + w_2 - w_3 + w_4)(w_1 + w_2 + w_3 - w_4)\right] \qquad [9]$$
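A short numerical sketch of this parameter follows. It assumes the overall minus sign in [9], chosen so that $t < 0$ in the ordered regime $w_1 > w_2 + w_3 + w_4$; the function name and the sample weights are illustrative and not taken from the text.

```python
def reduced_temperature(w1, w2, w3, w4):
    """Departure from criticality t built from the four independent vertex weights.

    Assumes the overall minus sign, so that t < 0 when w1 > w2 + w3 + w4.
    """
    product = ((w1 - w2 - w3 - w4) * (w1 - w2 + w3 + w4)
               * (w1 + w2 - w3 + w4) * (w1 + w2 + w3 - w4))
    return -product / (16.0 * w1 * w2 * w3 * w4)

print(reduced_temperature(3.0, 1.0, 1.0, 1.0))  # on the critical surface w1 = w2 + w3 + w4
print(reduced_temperature(4.0, 1.0, 1.0, 1.0))  # ordered regime
print(reduced_temperature(2.0, 1.0, 1.0, 1.0))  # disordered side
```

The first factor changes sign exactly when the weights cross the surface $w_1 = w_2 + w_3 + w_4$, and the other three factors play the same role for the remaining critical surfaces.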


Because of the unusual four-spin interaction, it is 
difficult to realize the eight-vertex model experi- 
mentally in the laboratory. 


Hard-Hexagon Model 


The hard-hexagon model is a two-dimensional 
lattice model of a gas of hard nonoverlapping 
particles. The particles are placed on the sites of a 
triangular lattice with nearest-neighbor exclusion 
so that no two particles are together or adjacent. 
Effectively, the triangular lattice is partially cov- 
ered with nonoverlapping hard tiles of hexagonal 
shape. Let us draw the triangular lattice as a 
square lattice with one set of diagonals as in 
Figure 2. The partition function for the hard- 
hexagon model is 


$$Z_N = \sum_{n} z^n g(n, N) \qquad [10]$$


where $z > 0$ is the activity and $g(n, N)$ is the number of ways of placing $n$ particles on the $N$ sites such that no two particles are together or adjacent. To each lattice site $j$, assign a spin or occupation number $\sigma_j$; if the site is empty, $\sigma_j = 0$; if the site is full, $\sigma_j = 1$. The partition function can then be written in terms of spins as

$$Z_N = \sum_{\text{spins } \sigma} \prod_{\langle ij \rangle} z^{(\sigma_i + \sigma_j)/6}(1 - \sigma_i\sigma_j) \qquad [11]$$

where the product is over all bonds $\langle ij \rangle$ of the triangular lattice and the sum is over all configurations of the $N$ spins or occupation numbers $\sigma_j = 0, 1$. The exponent of $z$ arises because the activity is shared out between the six bonds incident at each site. The remaining term, $(1 - \sigma_i\sigma_j) = 0, 1$, ensures that neighboring sites are not occupied simultaneously by excluding such terms from the sum.

Figure 2 The triangular lattice drawn as a square lattice with one set of diagonals. The close-packed arrangement of particles (solid circles) fills one of the three independent sublattices. One of the nonoverlapping hard hexagons is shown shaded. At low activities, the hard hexagons are sparsely scattered on the lattice with no preferential occupation of a particular sublattice.
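The equivalence of the counting form and the bond form of the hard-hexagon partition function can be verified by brute force on a very small lattice. The sketch below assumes a $3 \times 3$ periodic patch of the triangular lattice, drawn as a square lattice with one set of diagonals; the lattice size, the variable names, and the sample activity are choices made for this example, not part of the text.

```python
from itertools import product

L = 3  # linear size of the periodic patch (illustrative choice)
SITES = [(x, y) for x in range(L) for y in range(L)]
# Square-lattice steps plus one diagonal give the six neighbours of each site.
STEPS = [(1, 0), (0, 1), (1, 1)]
BONDS = [((x, y), ((x + dx) % L, (y + dy) % L))
         for (x, y) in SITES for (dx, dy) in STEPS]

def count_placements():
    """Return {n: g(n, N)}, the number of allowed placements of n particles."""
    g = {}
    for values in product((0, 1), repeat=len(SITES)):
        occ = dict(zip(SITES, values))
        if all(occ[i] * occ[j] == 0 for i, j in BONDS):
            n = sum(values)
            g[n] = g.get(n, 0) + 1
    return g

def z_from_counts(z):
    """Partition function as a sum over particle numbers, weighted by z^n."""
    return sum(z ** n * count for n, count in count_placements().items())

def z_from_bonds(z):
    """Partition function as a sum over occupation numbers of a product over bonds."""
    total = 0.0
    for values in product((0, 1), repeat=len(SITES)):
        occ = dict(zip(SITES, values))
        weight = 1.0
        for i, j in BONDS:
            weight *= z ** ((occ[i] + occ[j]) / 6.0) * (1 - occ[i] * occ[j])
        total += weight
    return total

print(count_placements())
print(z_from_counts(2.0), z_from_bonds(2.0))
```

On this patch at most $N/3 = 3$ sites can be occupied, in line with the close-packing bound, and the two expressions for $Z_N$ agree up to rounding.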

The activity $z$ gives the a priori probability of finding a particle at a given site and can be written as $z = e^{\beta\mu}$, where $\mu$ is the chemical potential. The density of particles increases monotonically as the activity increases but only a third of the total lattice sites can be occupied. At low activities, there are only a few particles scattered randomly so the $S_3$ sublattice symmetry of the triangular lattice is preserved. However, at higher activities approaching the close-packing limit, there is a sudden change and one of the three sublattices is preferentially occupied so the $S_3$ sublattice symmetry is spontaneously broken. This dramatic change signals an order-disorder phase transition at some critical value $z_c$ of the activity. The system is disordered below the critical activity but is ordered above it. The fundamental problem is to obtain the statistical properties of this model such as the bulk free energy and the sublattice densities

$$\rho_k = \langle\sigma_k\rangle = \{\text{fraction of spins sitting on sublattice } k = 1, 2, 3\} \qquad [12]$$

in the thermodynamic limit $N \to \infty$. The mean density is

$$\rho = (\rho_1 + \rho_2 + \rho_3)/3 \leq 1/3 \qquad [13]$$

Assuming that sublattice $k = 1$ is preferentially occupied, an order parameter is defined by

$$R = \rho_1 - \rho_2 \qquad [14]$$

The order parameter vanishes in the disordered regime but is nonzero in an ordered regime. Notice that the symmetry between sublattices $k = 2$ and 3 is not broken.

Unlike the eight-vertex model, the hard-hexagon model can be realized by a physical system in the laboratory, namely helium adsorbed on a graphite surface. The graphite substrate is composed of hexagonal cells formed by six carbon atoms with an interatom distance of 2.46 Å. Energetically, the adsorbed helium atoms prefer to sit in the potential well at the center of the hexagonal cells. The diameter of the helium atom, however, is 2.56 Å, which precludes the simultaneous occupation of neighboring cells by excluded volume effects. Some beautiful experiments carried out by Bretz indicate that this system undergoes a phase transition. Indeed, Bretz took precise measurements of the specific heat as the temperature or, equivalently, the activity $z$, is varied, and obtained a symmetric power-law divergence at the critical point

$$C \sim |z - z_c|^{-\alpha}, \quad \alpha \approx 0.36 \qquad [15]$$

with critical exponent $\alpha$ close to 1/3. Of course, one does not actually see divergences experimentally. Rather, it is the presence of dramatic peaks in the specific heat that are the hallmarks of a second-order transition.


Yang-Baxter Equations and Commuting 
Transfer Matrices 


Yang-Baxter Equations 


The eight-vertex and hard-hexagon models were solved by Rodney Baxter at the beginning of the 1970s and 1980s, respectively. Although the two models are quite different in nature, they are quintessential examples of exactly solvable lattice models. The seminal work of Baxter gives a precise criterion to decide if a two-dimensional lattice model is exactly solvable: it is exactly solvable if its local face weights satisfy the celebrated Yang-Baxter equation. We present a general formulation of the Yang-Baxter equations and commuting transfer matrices and then show how Baxter implemented these for the eight-vertex and hard-hexagon models. The first important step in the exact solution of a two-dimensional lattice model is the parametrization of the Boltzmann weights in terms of a distinguished variable $u$ called the spectral parameter. Typically, critical models involve trigonometric or hyperbolic functions and off-critical models involve elliptic functions of the spectral parameter. In terms of $u$, the local Boltzmann weights of a general two-dimensional lattice model take the form

$$W\left(\begin{matrix} d & \gamma & c \\ \delta & & \beta \\ a & \alpha & b \end{matrix}\,\middle|\, u\right) \qquad [16]$$

drawn graphically as a face carrying the spectral parameter $u$, where the allowed values of the spins $a, b, c, \ldots$ and arrows (or bond variables) $\alpha, \beta, \gamma, \ldots$ may be restricted by certain constraints. The spins $a, b, c, d$ are absent for the eight-vertex model and the arrows $\alpha, \beta, \gamma, \delta$ are absent for the hard-hexagon model.

The general Yang-Baxter equations take the following algebraic and graphical forms:

[Yang-Baxter equations: an equality between two sums, over the internal spin and arrows, of products of three face weights with spectral parameters $u$, $v$, and $v - u$, shown in algebraic and graphical form]

Graphically, this equation can be interpreted as saying that the diamond-shaped face with spectral parameter $v - u$ can be pushed through from the right to the left with the effect of interchanging the spectral parameters $u$ and $v$ in the remaining two faces.


Commuting Transfer Matrices 


A square lattice is built up row-by-row using the row transfer matrix $T(u)$ with matrix elements

$$\langle a, \alpha |\, T(u)\, | c, \gamma \rangle = \sum_{\beta_1, \ldots, \beta_N} \prod_{j=1}^{N} W\left(\begin{matrix} c_j & \gamma_j & c_{j+1} \\ \beta_j & & \beta_{j+1} \\ a_j & \alpha_j & a_{j+1} \end{matrix}\,\middle|\, u\right) \qquad [19]$$

Here there are $N$ columns, and periodic boundary conditions are applied so that $a_{N+1} = a_1$, $\alpha_{N+1} = \alpha_1$, and so on. The significance of the Yang-Baxter equations is that they imply a one-parameter family of commuting transfer matrices

$$T(u)T(v) = T(v)T(u) \qquad [20]$$


Pictorially, the product on the left is represented by two rows, one above the other, the lower row with spectral parameter $u$ and the upper row with spectral parameter $v$. The matrix product implies that the spins and arrows on the intervening row are summed out. Inserting a diamond-shaped face with spectral parameter $v - u$ and then using the local Yang-Baxter equation to progressively push it from right to left around the period interchanges all of the spectral parameters $u$ with the spectral parameter $v$. At the end, the diamond-shaped face is removed again. This heuristic argument was made rigorous by Baxter, who showed quite generally, and for the eight-vertex and hard-hexagon models in particular, that the diamond faces are in fact invertible:

[Equation [21]: the inversion relation in graphical form, in which two diamond faces with spectral parameters $u$ and $-u$ compose to a scalar multiple of the identity, with a scalar factor involving $\rho(u)$]

independent of $b$, $d$ where the scalar function $\rho(u)$ is model dependent. This equation is called the inversion relation.

Invariably, the existence of commuting transfer matrices leads to functional equations satisfied by the transfer matrices. Typically, the transfer matrices can be simultaneously diagonalized and so the functional equations can be solved for the eigenvalues of the transfer matrices. Mathematically, this is where Yang-Baxter techniques derive their power. For example, building up the lattice row-by-row, we see that the partition function of an $M \times N$ lattice is

$$Z_{MN} = \mathrm{tr}\, T(u)^M = \sum_n T_n(u)^M \qquad [22]$$

where $T_n(u)$ are the eigenvalues of $T(u)$. Typically, by the Perron-Frobenius theorem, the largest eigenvalue $T_0(u)$ is real, positive, and nondegenerate:

$$T_0(u) > |T_1(u)| \geq |T_2(u)| \geq \cdots \qquad [23]$$

Consequently,

$$-\beta\psi = \lim_{N \to \infty}\lim_{M \to \infty} \frac{1}{MN}\log Z_{MN} = \lim_{N \to \infty}\frac{1}{N}\log T_0(u) \qquad [24]$$

Thus the calculation of the bulk free energy is 
reduced to the problem of finding the largest 
eigenvalue of the transfer matrix. 
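The mechanism can be seen in the simplest possible setting. The sketch below is not the eight-vertex transfer matrix; it assumes a toy one-dimensional periodic Ising chain with Boltzmann weight $\exp(K s_i s_{i+1} + h s_i)$ per site, whose $2 \times 2$ transfer matrix has closed-form eigenvalues. It checks that the trace formula reproduces the brute-force partition function and that the free energy per site is governed by the largest eigenvalue. All names and parameter values are illustrative.

```python
import math
from itertools import product

def transfer_eigenvalues(K, h):
    """Eigenvalues (largest first) of T(s, s') = exp(K s s' + h (s + s') / 2)."""
    root = math.sqrt(math.exp(2 * K) * math.sinh(h) ** 2 + math.exp(-2 * K))
    centre = math.exp(K) * math.cosh(h)
    return centre + root, centre - root

def z_transfer(K, h, n_sites):
    """Partition function as tr T^n, the sum of n-th powers of the eigenvalues."""
    t0, t1 = transfer_eigenvalues(K, h)
    return t0 ** n_sites + t1 ** n_sites

def z_brute_force(K, h, n_sites):
    """Partition function by direct summation over all 2^n spin states."""
    total = 0.0
    for spins in product((1, -1), repeat=n_sites):
        exponent = sum(K * spins[i] * spins[(i + 1) % n_sites] + h * spins[i]
                       for i in range(n_sites))
        total += math.exp(exponent)
    return total

def log_largest_eigenvalue(K, h):
    """Limiting value of (1/n) log Z_n, determined by the largest eigenvalue."""
    return math.log(transfer_eigenvalues(K, h)[0])

print(z_transfer(0.4, 0.2, 8), z_brute_force(0.4, 0.2, 8))
print(math.log(z_transfer(0.4, 0.2, 200)) / 200, log_largest_eigenvalue(0.4, 0.2))
```

Because the largest eigenvalue is strictly dominant, the contribution of the other eigenvalue to $(1/n)\log Z_n$ dies off exponentially in $n$, which is the one-dimensional analog of the reduction described above.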


Parametrization of the Eight-Vertex Model 


Using the spin formulation of the eight-vertex 
model, Baxter showed that two transfer matrices 
T(K, L, M), T(K', L', M') commute whenever 


    Δ(K, L, M) = Δ(K', L', M'),   M = M'   [25]


where


    Δ(K, L, M) = sinh 2K sinh 2L + tanh 2M cosh 2K cosh 2L   [26]
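
As a hedged cross-check of the closed form for Δ, the sketch below compares it with the usual weight combination (a² + b² − c² − d²)/(2(ab + cd)). The weights a, b, c, d written here are the standard spin-formulation weights and are an assumption, since they are not stated in this section; the function names are illustrative.

```python
import math


def vertex_weights(K, L, M):
    """Assumed spin-formulation weights (not given in this section)."""
    a = math.exp(K + L + M)
    b = math.exp(-K - L + M)
    c = math.exp(K - L - M)
    d = math.exp(-K + L - M)
    return a, b, c, d


def delta_from_weights(K, L, M):
    """Delta built from the assumed weights."""
    a, b, c, d = vertex_weights(K, L, M)
    return (a * a + b * b - c * c - d * d) / (2.0 * (a * b + c * d))


def delta_closed_form(K, L, M):
    """Delta as written in the hyperbolic closed form."""
    return (math.sinh(2 * K) * math.sinh(2 * L)
            + math.tanh(2 * M) * math.cosh(2 * K) * math.cosh(2 * L))
```

With these weights ab/cd = e^{4M}, so holding M fixed is the same as holding (ab − cd)/(ab + cd) = tanh 2M fixed.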


If M and Δ are regarded as fixed, this is seen to be a
symmetric biquadratic relation between e^{2K} and e^{2L}
and is naturally parametrized in terms of elliptic
functions. Unfortunately, many different notations
and conventions for these elliptic functions appear
in the literature, which can be confusing to the
uninitiated. Let


E/O —*(Atu) | — 910) 
=h SA "= 320) 2d 
u Üa(u) = 04(A = u) m 04(0) 
ot) 7€ 3» ^ P^ XQ) 28) 


where ϑ_1(u) = ϑ_1(u, q) and ϑ_4(u) = ϑ_4(u, q) are standard
elliptic theta functions of nome q. Then the
vertex weights can be parametrized as


= Ru cc_， w = Rnp !ss. 


[29] 
= Ry 'cs_, wa = Ru !c.s 
In the ferromagnetic regime u, À, and 7 are all pure 
imaginary with O<g<1 and O0O<Imu<ImA< 
(1/2)Im v. The critical line occurs in the limit q — 1. 
In this sense, we are using a low-temperature elliptic 
parametrization. Another elliptic parametrization, 
which is useful to study the critical limit, is obtained 
by transforming to the conjugate nome q’: If q—e ™ 
then the conjugate nome is defined by q'— e^"/* so 
that g’ — 0asq — 1. 

We regard the crossing parameter λ as constant, u
as a variable, and write the transfer matrix as T(u).
It follows from this parametrization that M and Δ
are constants, independent of u. Furthermore, any
two transfer matrices T(u) and T(v) commute and
hence T(u) is a one-parameter family of commuting
transfer matrices. For interest, we point out that the
integrable XYZ quantum spin chain belongs to this
family. Specifically, the logarithmic derivative of the
eight-vertex transfer matrix yields


    (d/du) log T(u) |_{u=0} ∼ H_XYZ   [30]


where


    H_XYZ = −(1/2) Σ_{j=1}^{N} (J_x σ^x_j σ^x_{j+1} + J_y σ^y_j σ^y_{j+1} + J_z σ^z_j σ^z_{j+1})   [31]


and σ^x_j, σ^y_j, σ^z_j are the usual Pauli spin matrices.


Parametrization of the Hard-Hexagon Model


Actually, Baxter did not solve the hard-hexagon 
model directly. Instead, he solved a generalized 
hard-hexagon model, which is a model of hard 
squares with interactions along the diagonals of the 
elementary squares as shown in Figure 3. This in
turn corresponds to the A_4 case of the more general
solvable A_L restricted solid-on-solid (RSOS) models
of Andrews, Baxter, and Forrester.

The face weights of the generalized hard-hexagon 
model are 


    W(d c | a b) = m z^{(a+b+c+d)/4} t^{−a+b−c+d} (1 − ab)
                   × (1 − bc)(1 − cd)(1 − da)
                   × exp(Lac + Mbd)   [32]


Here the activity z has been shared out between
the four faces adjacent to a site, m is a trivial
normalization constant, and t is a gauge parameter
that cancels out of the partition function and
transfer matrix. The anisotropy between L and M
introduces an additional parameter which will
play the role of the spectral parameter u. In fact,
using the Yang-Baxter equation, Baxter showed
that this model is exactly solvable on the
manifold


    z = (1 − e^{−L})(1 − e^{−M}) / (e^{L+M} − e^L − e^M)   [33]


Specifically, two transfer matrices T(z, L, M) and
T(z', L', M') commute whenever


    Δ(z, L, M) = Δ(z', L', M'),   Δ(z, L, M) = z^{−1/2}(1 − z e^{L+M})   [34]


The hard-hexagon model is recovered in the limit
L → 0, M → −∞, which forbids simultaneous occupation
of sites joined by one set of diagonals. In this
special limit, the activity z is unconstrained. It is
curious to note that the pure hard-square model
with L = M = 0 is not solvable.


Figure 3 Interacting hard squares showing the diagonal
interactions L and M. The hard-hexagon model corresponds to
the limit L → 0, M → −∞.


Eliminating z between the above relations gives a
symmetric biquadratic relation between e^L and e^M,
which is naturally parametrized in terms of elliptic
functions. Choosing m and t appropriately, the
Boltzmann weights are


vo o) = ma 


wi 4 E wo gU 


wo 1)= wo 0) 侈 
v(i 4 =a 


1 0\ A@AA+u4) 
w (0 M 


Here the crossing parameter is λ = π/5, −λ < u <
2λ, and


    θ(u) = θ(u, q²) = sin u ∏_{n=1}^∞ (1 − 2q^{2n} cos 2u + q^{4n})(1 − q^{2n})   [36]


is a nonstandard elliptic theta function of nome q².
Despite the deceiving notation, the nome q² lies in the
range −1 < q² < 1 and is determined by the relation


90) ] +My? 
AT = — g(1 = ze^* M 37 
mecum or 
Regarding q² as fixed and u as a variable, it follows
that T(u) is a one-parameter family of commuting
transfer matrices.

The regimes relevant to the hard-hexagon model 
are: 


    Regime I (disordered):            −1 < q² < 0,   −λ < u < 0
    Regime II (triangular ordered):    0 < q² < 1,   −λ < u < 0     [38]


The borderline case q² = 0 corresponds to a line of
critical points. The original hard-hexagon model is
obtained in the limit u → −λ = −π/5, so it follows
that the critical point occurs at


    z_c = ((1 + √5)/2)^5 = (11 + 5√5)/2   [39]
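
The two forms of the critical activity can be compared numerically. This is a small illustrative check only; the variable names are not from the source.

```python
import math

# Fifth power of the golden ratio versus the closed form (11 + 5*sqrt(5))/2.
golden_ratio = (1 + math.sqrt(5)) / 2
z_c_from_power = golden_ratio ** 5
z_c_closed_form = (11 + 5 * math.sqrt(5)) / 2
```

Both expressions evaluate to approximately 11.09017, the activity at which the hard-hexagon lattice gas passes from the disordered fluid to the triangular ordered phase.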


Away from criticality the activity is related to the
nome q² by


    z = z_c ∏_{n=1}^∞ [ (1 − 2q^{2n} cos(4π/5) + q^{4n}) / (1 − 2q^{2n} cos(2π/5) + q^{4n}) ]^5   [40]


Functional Equations 
Baxter’s T—Q Relation 


In a tour de force Baxter showed that the transfer 
matrix of the eight-vertex model satisfies the 
functional equation 


    T(u)Q(u) = φ(u)Q(u − λ) + φ(u − λ)Q(u + λ)   [41]


where φ(u) = (cs)^N = [ϑ_1(u)ϑ_4(u)/ϑ_1(λ)ϑ_4(λ)]^N and
Q(u) is an auxiliary family of mutually commuting
transfer matrices satisfying [Q(u), Q(v)] = [Q(u), T(v)] = 0.
In principle, these equations, which are
intimately related to the Bethe ansatz, can be solved to
obtain all the eigenvalues of the transfer matrix.
Without entering into the intricacies of solving these
equations, we summarize the results for the partition
function per site κ, correlation length ξ, and interfacial
tension σ. As we have seen, the largest eigenvalue of
the transfer matrix yields κ. The interfacial tension σ
and correlation length ξ were obtained, respectively,
by Baxter and by Johnson, Krinsky, and McCoy by
integrating over (continuous) bands of eigenvalues. In
the ferromagnetic regime, their results are


«dong 
E 50 - -n(x in 一 q" 2 (x^! +7? tð og" [42] 
E n(1 — a?) + x) 
6 —-jlogko^) o=ksT/é [43] 


where x=e™/2, z=x71e™, 


and k is the elliptic modulus of nome x²:


    k(x²) = 4x ∏_{n=1}^∞ [(1 + x^{4n})/(1 + x^{4n−2})]^4   [44]


Detailed analysis shows that near T_c the free
energy ψ in general behaves as


    ψ ∼ cot(π²/2μ̄) |t|^{π/μ̄}   [45]


where t = (T − T_c)/T_c,


    tan(μ̄/2) = e^{2M}   [46]


and α = 2 − π/μ̄ with 0 < μ̄ < π. Exceptional cases
occur, however, if π/μ̄ is an integer. This occurs, for
example, in the case of the rectangular Ising model
(M = 0, μ̄ = π/2), which exhibits a logarithmic singularity
in the specific heat (α = 0_log). Similarly,
using log k(x²) ∼ (−t)^{π/2μ̄}, the other associated critical
exponents are

    ξ^{−1} ∼ (−t)^ν,   σ ∼ (−t)^μ,   ν = μ = π/2μ̄   [47]
Notice that, due to the special symmetries of the 
eight-vertex model, these critical exponents vary 
continuously as the four-spin interaction is varied. 
This violates the universality hypothesis, which 
asserts that the exponents should only depend on 
the dimensionality and symmetries and not on the 
details of the interactions. Suzuki has suggested that
it is more natural to use the inverse correlation length
ξ^{−1}, rather than the temperature difference T − T_c, to
measure the departure from criticality with the effect
that it is the renormalized critical exponents


    α̂ = (2 − α)/ν,   β̂ = β/ν,   μ̂ = μ/ν   [48]


that are independent of the details of the
interactions.


Hard-Hexagon Functional Equation 


Baxter and Pearce showed that the normalized row 
transfer matrix of the generalized hard-hexagon 
model, 


Olu + 2A)0() 


N 
Wu 39(«-23)| 2%) I 


t(u) — 


satisfies the simple functional equation 


    t(u)t(u + λ) = I + t(u − 2λ)   [50]


where λ = π/5. Since T(u) is a commuting family of
matrices, this equation can be solved for the
eigenvalues T(u) to obtain the partition function
per site κ, correlation length ξ, and interfacial
tension σ. Let p = |q²|, s = |q²|^{1/5}; then the results
are summarized as

x, TTL 3a "cosi S) tg | Sap (hp Np ye 

: 14 [1 — 2q?” cos(27/5) + q*^]3 (1 — p51/3)3(1 — p102n-1)) à p 
- 2 4n)2 2 5 51 

x 1 m 2 n 4 5 二 n {i Zn 1 = an 
«TI q^" cos(47/5) + a] (1 - a^^ (1 — p?) EM 


n=1 


[1 — 247^ cos(2a,/ 5) + q"^]5 (1 
1/2
27(25 + 11V) 
Kad ie NN [52] 
一 上 n=1 1 十 v/3s2n-1 + s4n-2 . IX &C 
n i 2 =<. — 2527-1 COS (72 - a) na ç4n-2 E N 
Geaen Sa 1 — 2s?"-1 cos(# +a) 4 s4n-2 Z >Re 
[53] 
| 0， 名 < Ze 54) 
'" = 
lbk&T/6, 2 > X 


It follows that κ(z), ξ(z), and σ(z) are analytic
functions of z, except at the critical point z = z_c.
The associated critical exponents


    ψ ∼ |z − z_c|^{2−α},   ξ^{−1} ∼ |z − z_c|^ν,   σ ∼ (z − z_c)^μ,
    α = 1/3,   ν = μ = 5/6   [55]


agree with experiments on helium adsorbed on
graphite.


Corner Transfer Matrices 


The one-point functions and order parameters of the 
eight-vertex and  hard-hexagon models were 
obtained by Baxter by using corner transfer matrices 
(CTMs). The idea is to build up the square lattice 
quadrant-by-quadrant as shown in Figure 4. The 
partition function and one-point function are then 


    Z = tr ABCD,   ⟨σ_0⟩ = tr SABCD / tr ABCD   [56]


where S is the diagonal matrix with entries S_{σ,σ} = σ_0
and the entries A_{σ,σ'} are labeled by half-rows of
spins σ = (σ_0, σ_1, σ_2, ...) and σ' = (σ_0, σ'_1, σ'_2, ...).


Figure 4 The square lattice divided into four quadrants
corresponding to the CTMs A, B, C, D. The spin at the center
is σ_0. The spins on the boundaries are fixed by the boundary
conditions.


The CTMs have some remarkable properties. If the 
Boltzmann weights are invariant under reflections 
about the diagonals, as is the case for the eight-vertex 
model, Baxter argued that, in the limit of a large lattice, 


    A(u) = C(u) = B(λ − u) = D(λ − u)   [57]


where A(u) is a commuting family of matrices. Since
these are block matrices in the center spin σ_0, they
also commute with S. Moreover, Baxter showed that
the eigenvalues of A(u) are exponentials of the form


    A(u)_σ = m_σ exp(uE_σ)   [58]


where the constants m_σ and E_σ can be evaluated in
the low-temperature limit. It follows that


    ⟨σ_0⟩ = Σ_σ σ_0 m_σ^4 e^{2λE_σ} / Σ_σ m_σ^4 e^{2λE_σ}   [59]


When the Boltzmann weights do not exhibit symmetry
about the diagonals, which is the case for hard
hexagons, the above arguments need to be modified.


One-Point Functions of the Eight-Vertex Model 


For the eight-vertex model, Baxter showed that 


L xx. 
iH, = 1, E, = 7i» ,iH(o1,0% ojs) 
$1 
H(aj-1,0j,0j41) = 1 — oj-19}41 [60] 


subject to the boundary condition σ_N = σ_{N+1} = +1.
Introducing a new set of spins


    μ_j = σ_{j−1}σ_{j+1}   [61]


we have o9 — pi papis .... Setting s — (xz)!/^ = eriu/2 


t = (x/z)! 7 — e? ^-9/? and taking the limit of large 
N, the diagonalized matrices are direct products of 
2 x 2 matrices: 


EDD 


A(u) = C(u) 
-G eG eG 2)e- e 
B(u) = D(u) 


6 9*6 3«Q 2e w 


It follows that the magnetization is 


| didi 1\1/4 2\1/8 
(00) = IL 0 —-(1-k)" [65] 
where k' = k'(x²) is the conjugate elliptic modulus of
nome x² and the associated critical exponent is


    ⟨σ_0⟩ ∼ (−t)^β,   β = π/16μ̄   [66]


The polarization of the eight-vertex model is 


oo 1—x?^"1- n\ 2 
(a) = (0901) = Him 一 T) [67] 


n=] 


This cannot be obtained by a direct application of 
CTMs but was conjectured by Baxter and Kelland 
and subsequently derived by Jimbo, Miwa, and 
Nakayashiki using difference equations. 


One-Point Functions of the Hard-Hexagon Model 


For hard hexagons, the working is more complicated
because one must keep track of the sublattice of the
central spin σ_0, but fascinating connections emerge
with the Rogers-Ramanujan functions:


    G(x) = ∏_{n=1}^∞ 1/[(1 − x^{5n−4})(1 − x^{5n−1})]
    H(x) = ∏_{n=1}^∞ 1/[(1 − x^{5n−3})(1 − x^{5n−2})]   [68]
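
The product forms of G(x) and H(x) can be compared with the sum sides of the classical Rogers-Ramanujan identities, which are not written out in this article and are quoted here as standard results. The sketch below is illustrative only; the truncation orders and function names are arbitrary choices.

```python
def rogers_ramanujan_products(x, terms=200):
    """Truncated infinite products defining G(x) and H(x), valid for |x| < 1."""
    G, H = 1.0, 1.0
    for n in range(1, terms + 1):
        G /= (1 - x ** (5 * n - 4)) * (1 - x ** (5 * n - 1))
        H /= (1 - x ** (5 * n - 3)) * (1 - x ** (5 * n - 2))
    return G, H


def rogers_ramanujan_sums(x, terms=60):
    """Sum sides: G = sum x^(n^2)/(x;x)_n and H = sum x^(n(n+1))/(x;x)_n."""
    G, H = 0.0, 0.0
    pochhammer = 1.0  # (x; x)_n = (1 - x)(1 - x^2)...(1 - x^n)
    for n in range(terms):
        if n > 0:
            pochhammer *= 1 - x ** n
        G += x ** (n * n) / pochhammer
        H += x ** (n * (n + 1)) / pochhammer
    return G, H
```

Negative arguments are allowed as long as |x| < 1, which matters here because x is negative in the disordered regime.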

For hard hexagons, Baxter showed that 
SAB) Eor w" gg 


rB =) re wee 


where k = 1, 2, 3 labels the sublattice of the triangular
lattice. Here the spin configurations
σ = (σ_0, σ_1, σ_2, ...) with σ_j = 0, 1 are subject to the
constraint σ_jσ_{j+1} = 0 for all j. If |q²| = e^{−ε} and
g(x) = H(x)/G(x) then


x--e7 D. =—x/g(x), wo=-x forza. 


x =e me re =x g(x), wo x for z 2. 


[70] 
and 
i eS 71
For large N, σ_j → s_j, where the ground-state values
s_j determined by the boundary conditions are


    z > z_c:   s_{3j+k} = 1,   s_{3j+k±1} = 0,   k = 1, 2, 3   [73]


After applying some Rogers-Ramanujan identities
and introducing the elliptic functions


    Q(x) = ∏_{n=1}^∞ (1 − x^n),   P(x) = ∏_{n=1}^∞ (1 − x^{2n−1})   [74]

the expressions for the sublattice densities simplify 
in the limit of large N giving 
_ xG(x)H(x®)P(x*) 


P(x?) 5 SSX [75] 


pp = f2 = p3 = 


in the disordered fluid phase and 
_ He99G9[GG)Q(x) + x Hlx )Q(x )] 


Qt 
RHH) si 
PACTI as re a wi. F 
in the triangular ordered phase. In principle, the
dependence on x can be eliminated by observing that


    z = −x [H(x)/G(x)]^5,        z < z_c
    z = x^{−1} [G(x)/H(x)]^5,    z > z_c   [78]

In practice, this is quite nontrivial. Although it is far
from obvious, because x → 1 is a subtle limit, the critical
exponent associated with the order parameter R is


    R ∼ (z − z_c)^β,   β = 1/9   [79]


Summary 


Baxter’s exact solutions of the eight-vertex and 
hard-hexagon models have been reviewed. These 
prototypical examples clearly illustrate the mathe- 
matical power and elegance of commuting transfer 
matrices and Yang-Baxter techniques. The results 
for the principal thermodynamic quantities, includ- 
ing free energies, correlation lengths, interfacial 
tensions, and one-point functions, have been sum- 
marized. For convenience in comparison, the asso- 
ciated critical exponents are collected in Table 1. All 
these exponents confirm the hyperscaling relation
2 − α = dν for lattice dimensionality d = 2.
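
The hyperscaling relation, and the statement that the renormalized exponents do not depend on the interaction, can be checked directly from the exponent values quoted in this article. The following is a small sketch with illustrative function names; the sample values of μ̄ used to exercise it are arbitrary.

```python
from fractions import Fraction
import math


def hyperscaling_gap(alpha, nu, d=2):
    """(2 - alpha) - d*nu, which vanishes when hyperscaling holds."""
    return (2 - alpha) - d * nu


def eight_vertex_exponents(mu_bar):
    """alpha, beta, nu of the eight-vertex model for 0 < mu_bar < pi."""
    alpha = 2 - math.pi / mu_bar
    beta = math.pi / (16 * mu_bar)
    nu = math.pi / (2 * mu_bar)
    return alpha, beta, nu


# Exact rational checks for the two models with fixed exponents.
ising_gap = hyperscaling_gap(Fraction(0), Fraction(1))
hard_hexagon_gap = hyperscaling_gap(Fraction(1, 3), Fraction(5, 6))
```

For the eight-vertex model the ratios (2 − α)/ν = 2 and β/ν = 1/8 come out the same for every μ̄, which is the sense in which the renormalized exponents are universal.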

Table 1 Comparison of the exactly calculated critical exponents
of the rectangular Ising, eight-vertex and hard-hexagon
models. The rectangular Ising model corresponds to the special
case μ̄ = π/2 of the eight-vertex model. The eight-vertex
exponents vary continuously with 0 < μ̄ < π. The critical
exponents of the hard-hexagon model, with its S_3 symmetry,
lie in the universality class of the three-state Potts model.

Model               α           β         ν        μ
Rectangular Ising   0_log       1/8       1        1
Eight vertex        2 − π/μ̄     π/16μ̄     π/2μ̄     π/2μ̄
Hard hexagons       1/3         1/9       5/6      5/6


More recently, Yang-Baxter techniques have been
applied to solve an infinite variety of lattice models in
two dimensions. Commuting transfer matrix methods have
also been adapted to study integrable boundaries and
associated boundary critical behavior. Lastly, it
should be mentioned that, in the continuum scaling
limit, there are deep connections with conformal field
theory and integrable quantum field theory. On the
one hand, the lattice can often provide a convenient
way to regularize the infinities that occur in these
continuous field theories. On the other hand, the field
theories can predict and explain the universal properties
of lattice models such as critical exponents.


See also: Bethe Ansatz; Boundary Conformal Field 
Theory; Hopf Algebras and q-Deformation Quantum
Groups; Integrability and Quantum Field Theory; 
q-Special Functions; Quantum Spin Systems; 
Two-Dimensional Ising Model; Yang-Baxter Equations. 


Introduction


Even in a linear theory like Maxwell’s electrody- 
namics, in which sufficiently general solutions of the 
field equations can be obtained, one needs a good 
sample, a useful kit, of explicit exact fields like the 
homogeneous field, the Coulomb monopole field, the 
dipole, and other simple solutions, in order to gain a 
physical intuition and understanding of the theory. In 
Einstein’s general relativity, with its nonlinear field 
equations, the discoveries and analyses of various 
specific explicit solutions revealed most of the 
unforeseen features of the theory. Studies of special 
solutions stimulated questions relevant to more 
general situations, and even after the formulation of 
a conjecture about a general situation, newly dis- 
covered solutions can play a significant role in 
verifying or modifying the conjecture. The cosmic 
censorship conjecture assuming that “singularities 
forming in a realistic gravitational collapse are hidden 
inside horizons” is a good illustration. 

Albert Einstein presented the final version of his
gravitational field equations (or Einstein's
equations, EEs) to the Prussian Academy in Berlin
on 25 November 1915:


    R_{μν} − (1/2) g_{μν} R = (8πG/c⁴) T_{μν}   [1]


Here, the spacetime metric tensor g_{μν}(x^ρ), μ, ν,
ρ, ... = 0, 1, 2, 3, determines the invariant line element
g = g_{μν} dx^μ dx^ν, and acts also as a dynamical variable
describing the gravitational field; the Ricci tensor
R_{μν} = g^{ρσ} R_{ρμσν}, where g^{μρ} g_{ρν} = δ^μ_ν, is formed from the
Riemann curvature tensor R_{μνρσ}; both depend nonlinearly
on g_{αβ} and ∂_γ g_{αβ}, and linearly on ∂_γ ∂_δ g_{αβ}; the
scalar curvature R = g^{μν} R_{μν}. T_{μν}(x^ρ) is the energy-momentum
tensor of matter (“sources”); and Newton’s
gravitational constant G and the velocity of light c are
fundamental constants. If not stated otherwise, we use
the geometrized units in which G = c = 1, and the same
conventions as in Misner et al. (1973) and Wald (1984).
For example, in the case of perfect fluid with density ρ,
pressure p, and 4-velocity U^μ, the energy-momentum
tensor reads T_{μν} = (ρ + p) U_μ U_ν + p g_{μν}. To obtain a
(local) solution of [1] in coordinate patch {x^μ}
means to find “physically plausible” (i.e., complying
with one of the positive-energy conditions) functions
ρ(x^μ), p(x^μ), U^μ(x^ν), and metric g_{μν}(x^ρ) satisfying [1].
In vacuum T_{μν} = 0 and [1] implies R_{μν} = 0.

In 1917, Einstein generalized [1] by adding a
cosmological term Λg_{μν} (Λ = const.):


    R_{μν} − (1/2) g_{μν} R + Λ g_{μν} = 8πT_{μν}   [2]


A homogeneous and isotropic static solution of [2]
(with metric [8], k = +1, a = const.), in which the
“repulsive effect” of Λ > 0 compensates the gravitational
attraction of incoherent dust (“uniformly
distributed galaxies”), the Einstein static universe,
marked the birth of modern cosmology. Although it is
unstable and lost its observational relevance after the 
discovery of the expansion of the universe in the late 
1920s, in 2004 a “fine-tuned” cosmological scenario 
was suggested according to which our universe starts 
asymptotically from an initial Einstein static state and 
later enters an inflationary era, followed by a 
standard expansion epoch (see Cosmology: Mathe- 
matical Aspects). There are many other examples of 
“old” solutions which turned out to act as asymptotic 
states of more general classes of models. 
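
The balance between the repulsion of Λ and the attraction of dust in the Einstein static universe, and its instability, can be made concrete with the Friedmann equations. These equations are not written out in this article; the sketch below assumes their standard form for metric [8] in units G = c = 1, and the function names are illustrative.

```python
import math


def hubble_squared(a, rho, lam, k):
    """(da/dt / a)^2 from the first Friedmann equation (assumed standard form)."""
    return 8 * math.pi * rho / 3 - k / a ** 2 + lam / 3


def acceleration_over_a(rho, p, lam):
    """(d2a/dt2) / a from the second Friedmann equation (assumed standard form)."""
    return -4 * math.pi * (rho + 3 * p) / 3 + lam / 3


def einstein_static(rho):
    """Cosmological constant and radius of the static dust solution with k = +1."""
    lam = 4 * math.pi * rho
    a = 1 / math.sqrt(lam)
    return lam, a


def perturbed_acceleration(a, rho0, a0, lam):
    """Acceleration when dust dilutes as rho = rho0 * (a0/a)^3 at fixed lam."""
    return acceleration_over_a(rho0 * (a0 / a) ** 3, 0.0, lam)
```

A slightly larger radius gives positive acceleration and a slightly smaller one gives negative acceleration, so the static state is not restored after a perturbation.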


Invariant Characterization 
and Classification of the Solutions 


Algebraic Classification 
The Riemann tensor can be decomposed as

    R_{αβγδ} = C_{αβγδ} + E_{αβγδ} + G_{αβγδ}   [3]


where E and G are constructed from R_{αβ}, R, and
g_{αβ} (see, e.g., Stephani et al. (2003)); the Weyl
conformal tensor C_{αβγδ} can be considered as the
“characteristic of the pure gravitational field” since,
at a given point, it cannot be determined in terms of
the matter energy-momentum tensor T_{αβ} (as E and
G can using EEs). Algebraic classification is based
on a classification of the Weyl tensor. This is best
formulated using two-component spinors
α_A (A = 1, 2), in terms of which any Weyl spinor
Ψ_{ABCD} determining C_{αβγδ} can be factorized:


    Ψ_{ABCD} = α_{(A} β_B γ_C δ_{D)}   [4]


brackets denote symmetrization; each of the spinors
determines a principal null direction, say,
k^α = α^A ᾱ^{A'} (see Spinors and Spin Coefficients). The
Petrov-Penrose classification is based on coincidences
among these directions. A solution is of
type I (general case), II, III, and N (“null”) if all null
directions are different, or two, three, and all four
coincide, respectively. It is of type D (“degenerate”)
if there are two double null directions. The
equivalent tensor equations are simplest for type N:


    C_{αβγδ} k^δ = 0,   C_{αβγδ} C^{αβγδ} = 0,   C_{αβγδ} C^{*αβγδ} = 0   [5]


where C^*_{αβγδ} = (1/2) ε_{αβρσ} C^{ρσ}_{γδ}, ε is the Levi-Civita
pseudotensor.


Classification According to Symmetries 


Most of the available solutions have some exact
continuous symmetries which preserve the metric.
The corresponding group of motions is characterized
by the number and properties of its Killing vectors ξ^α
satisfying the Killing equation (£_ξ g)_{αβ} = ξ_{α;β} + ξ_{β;α} = 0
(£ is the Lie derivative) and by the nature (spacelike,
timelike, or null) of the group orbits. For example,
axisymmetric, stationary fields possess two commuting 
Killing vectors, of which one is timelike. Orbits of the 
axial Killing vector are closed spacelike curves of finite 
length, which vanishes at the axis of symmetry. In 
cylindrical symmetry, there exist two spacelike com- 
muting Killing vectors. In both cases, the vectors 
generate a two-dimensional abelian group. The two- 
dimensional group orbits are timelike in the stationary 
case and spacelike in the cylindrical symmetry. 

If a timelike ξ^α is hypersurface orthogonal,
ξ_α = λΦ_{,α} for some scalar functions λ, Φ, the spacetime
is “static.” In coordinates with ξ = ∂_t, the metric is


    g = −e^{2U} dt² + e^{−2U} γ_{ij} dx^i dx^j   [6]


where U, γ_{ij} do not depend on t. In vacuum, U satisfies
the potential equation U_{:i}^{:i} = 0; the covariant derivatives
(denoted by :) are with respect to the three-dimensional
metric γ_{ij}. A classical result of Lichnerowicz states that
if the vacuum metric is smooth everywhere and U → 0
at infinity, the spacetime is flat (for refinements, see
Anderson (2000)).

In cosmology, we are interested in groups whose 
regions of transitivity (points can be carried into 
one another by symmetry operations) are three- 
dimensional spacelike hypersurfaces (homogeneous 
but anisotropic models of the universe). The three- 
dimensional simply transitive groups G3 were 
classified by Bianchi in 1897 according to the 
possible distinct sets of structure constants but 
their importance in cosmology was discovered only 
in the 1950s. There are nine types: Bianchi I to 
Bianchi IX models. The line element of the Bianchi 
universes can be expressed in the form 


    g = −dt² + g_{ab}(t) ω^a ω^b   [7]


where the time-independent 1-forms ω^a = E^a_α dx^α
satisfy the relations dω^a = −(1/2) C^a_{bc} ω^b ∧ ω^c, d is
the exterior derivative and C^a_{bc} are the structure
constants (see Cosmology: Mathematical Aspects for
more details).

The standard Friedmann-Lemaitre-Robertson- 
Walker (FLRW) models admit in addition an 
isotropy group SO(3) at each point. They can be 
represented by the metric 


    g = −dt² + a²(t) [ dr²/(1 − kr²) + r²(dθ² + sin²θ dφ²) ]   [8]


in which a(t), the “expansion factor,” is determined by
matter via EEs, the curvature index k = −1, 0, +1, the
three-dimensional spaces t = const. have a constant
curvature K = k/a²; r ∈ [0, 1] for the closed (k = +1) universe,
r ∈ [0, ∞) in open (k = 0, −1) universes (for
another description, see Cosmology: Mathematical
Aspects).

There are four-dimensional spacetimes of constant
curvature solving EEs [2] with T_{μν} = 0: the Minkowski,
de Sitter, and anti-de Sitter spacetimes. They admit the
same number (10) of independent Killing vectors, but
interpretations of the corresponding symmetries differ
for each spacetime.

If ξ^α satisfies £_ξ g_{αβ} = 2Φg_{αβ}, Φ = const., it is called a
homothetic (Killing) vector. Solutions with proper
homothetic motions, Φ ≠ 0, are “self-similar.” They
cannot in general be asymptotically flat or spatially 
compact but can represent asymptotic states of more 
general solutions. In Stephani et al. (2003), a summary 
of solutions with proper homotheties is given; their 
role in cosmology is analyzed by Wainwright and Ellis 
(eds.) (1997); for mathematical aspects of symmetries 
in general relativity, see Hall (2004). 

There are other schemes for invariant classification
of exact solutions (reviewed in Stephani et al.
(2003)): the algebraic classification of the Ricci
tensor and energy-momentum tensor of matter; the
existence and properties of preferred vector fields
and corresponding congruences; local isometric
embeddings into flat pseudo-Euclidean spaces, etc.


Minkowski (M), de Sitter (dS), 
and Anti-de Sitter (AdS) Spacetimes 


These metrics of constant (zero, positive, negative)
curvature are the simplest solutions of [2] with T_{μν} = 0
and Λ = 0, Λ > 0, Λ < 0, respectively. The standard
topology of M is R⁴. The dS has the topology R¹ × S³
and is best represented as a four-dimensional hyperboloid
−v² + w² + x² + y² + z² = 3/Λ in a five-dimensional
flat space with metric g = −dv² + dw² +
dx² + dy² + dz². The AdS has the topology S¹ × R³; it
is a four-dimensional hyperboloid −v² − w² + x² + y²
+ z² = −3/|Λ|, Λ < 0, in flat five-dimensional space
with signature (−, −, +, +, +). By unwrapping the
circle S¹ and considering the universal covering space,
one gets rid of closed timelike lines.

These spacetimes are all conformally flat and can 
be conformally mapped into portions of the Einstein 
universe (see Asymptotic Structure and Conformal 
Infinity). However, their conformal structure is 
globally different. In M, one can go to infinity
along timelike/null/spacelike geodesics and reach
five qualitatively different sets of points: future/past
timelike infinity i^±; future/past null infinity ℐ^±, and
spacelike infinity i^0. In dS, there are only past and
future conformal infinities ℐ^−, ℐ^+, both being spacelike
(on the Einstein cylinder, the dS spacetime is a
“horizontal strip” with ℐ^+/ℐ^− as the “upper/lower
circle”). The conformal infinity in AdS is timelike.

As a consequence of spacelike ℐ^± in dS, there
exist both particle (cosmological) and event horizons
for geodesic observers (Hawking and Ellis 1973). dS
plays a (doubly) fundamental role in present-day
cosmology: it is an approximate model for the inflationary
paradigm near the big bang and it is also the
asymptotic state (at t → ∞) of cosmological models
with a positive cosmological constant. Since recent
observations indicate that Λ > 0, it appears to
describe the future state of our universe. AdS has
come recently to the fore due to the “holographic”
conjecture (see AdS/CFT Correspondence).

Christodoulou and Klainerman, and Friedrich
proved that M, dS, and AdS are stable with respect
to general, nonlinear (though “weak”) vacuum
perturbations, a result not known for any other
solution of EEs (see Stability of Minkowski Space).


Schwarzschild and Reissner-Nordström
Metrics


These are spherically symmetric spacetimes: the
SO_3 rotation group acts on them as an isometry
group with spacelike, two-dimensional orbits. The
metric can be brought into the form


    g = −e^ν dt² + e^λ dr² + r²(dθ² + sin²θ dφ²)   [9]


where ν(t, r), λ(t, r) must be determined from EEs. In vacuum,
we are led uniquely to the Schwarzschild metric


    g = −(1 − 2M/r) dt² + (1 − 2M/r)^{−1} dr² + r²(dθ² + sin²θ dφ²)   [10]


where M = const. has to be interpreted as mass, as
test particle orbits show. The spacetime is static at
r > 2M, that is, outside the Schwarzschild radius at
r = 2M, and asymptotically (r → ∞) flat.

Metric [10] describes the exterior gravitational field
of an arbitrary (static, oscillating, collapsing, or 
expanding) spherically symmetric body (spherically 
symmetric gravitational waves do not exist). It is the 
most influential solution of EEs. The essential tests of 
general relativity — perihelion advance of Mercury, 
deflection of both optical and radio waves by the Sun, 
and signal retardation — are based on [10] or rather on 
its expansion in M/r. Space missions have been 
proposed that could lead to measurements of “post- 
post-Newtonian” effects (see General Relativity: 
Experimental Tests, and Misner et al. (1973)). The full 
Schwarzschild metric is of importance in astrophysical 
processes involving compact stars and black holes. 
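
As an illustration of a test based on the expansion of [10] in M/r, the sketch below evaluates the standard leading-order perihelion advance per orbit, 6πGM/(c²a(1 − e²)). This formula is not derived in this article and is quoted as a well-known result; the solar and orbital constants are rounded reference values supplied here as assumptions, and the function names are illustrative.

```python
import math

GM_SUN = 1.32712440018e20   # m^3 s^-2, heliocentric gravitational constant (assumed value)
C = 299792458.0             # m s^-1, speed of light
ARCSEC_PER_RAD = 180 * 3600 / math.pi


def perihelion_advance_per_orbit(semi_major_axis, eccentricity, gm=GM_SUN):
    """Leading-order relativistic perihelion advance in radians per orbit."""
    return 6 * math.pi * gm / (C ** 2 * semi_major_axis * (1 - eccentricity ** 2))


def mercury_advance_arcsec_per_century():
    """Advance for Mercury using rounded orbital elements (assumed values)."""
    semi_major_axis = 5.7909e10   # m
    eccentricity = 0.2056
    period_days = 87.969
    orbits_per_century = 36525.0 / period_days
    per_orbit = perihelion_advance_per_orbit(semi_major_axis, eccentricity)
    return per_orbit * orbits_per_century * ARCSEC_PER_RAD
```

With these inputs the result is close to the familiar 43 arcseconds per century.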

Metric [10] describes the spacetime outside
a spherical body collapsing through r = 2M into
a spherical black hole. In Figure 1, the formation
of an event horizon and trapped surfaces is indicated
in ingoing Eddington-Finkelstein coordinates
(v, r, θ, φ) where v = t + r + 2M log(r/2M − 1) so
that (v, θ, φ) = const. are ingoing radial null geodesics.
The interior of the star is described by another metric
(e.g., the Oppenheimer-Snyder collapsing dust solution;
see below). The Kruskal extension of the
Schwarzschild solution, its compactification, the concept
of the bifurcate Killing horizon, etc., are analyzed
in Stationary Black Holes and in Misner et al. (1973),
Hawking and Ellis (1973), and Bičák (2000).


Figure 1 Gravitational collapse of a spherical star (the interior of
the star is shaded). The light cones of three events, O, P, Q, at the
center of the star, and of three events outside the star are illustrated.
The event horizon, the trapped surfaces, and the singularity formed
during the collapse are also shown. Although the singularity appears
to lie along the direction of time, from the character of the light cone
outside the star but inside the event horizon we can see that it has a
spacelike character. (Labels in the figure: event horizon r = 2M,
trapped surfaces, outgoing photon, infalling photon with v = const.,
surface of star.) Reproduced from Bičák J (2000) Selected
solutions of Einstein’s field equations: their role in general relativity
and astrophysics. In: Schmidt BG (ed.) Einstein's Field Equations
and their Physical Implications, Lecture Notes in Physics, vol. 540, pp.
1-126. Heidelberg: Springer, with permission from Springer-Verlag.
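
That the Eddington-Finkelstein coordinate v is constant along ingoing radial null rays of metric [10] can be verified numerically. The sketch below is illustrative: it integrates dt/dr = −1/(1 − 2M/r), which follows from setting g = 0 for radial motion in [10], using Simpson's rule; the function names and the step count are arbitrary choices.

```python
import math


def advanced_time(t, r, M):
    """Ingoing Eddington-Finkelstein coordinate v = t + r + 2M log(r/2M - 1)."""
    return t + r + 2 * M * math.log(r / (2 * M) - 1)


def integrate_ingoing_ray(r_start, r_end, M, t_start=0.0, steps=2000):
    """Schwarzschild time t at r_end for an ingoing radial null ray (Simpson's rule)."""
    def dt_dr(r):
        return -1.0 / (1.0 - 2.0 * M / r)

    h = (r_end - r_start) / steps  # negative for an ingoing ray; steps must be even
    total = dt_dr(r_start) + dt_dr(r_end)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * dt_dr(r_start + i * h)
    return t_start + total * h / 3
```

The Schwarzschild time t grows without bound as the ray approaches r = 2M, while v stays finite, which is why these coordinates are suited to describing collapse through the horizon.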

The Reissner-Nordström solution describes the
exterior gravitational and electromagnetic fields of a
spherical body with mass M and charge Q. The
energy-momentum tensor on the right-hand side of
EE [2] is that of the electromagnetic field produced
by the charge; the field satisfies the curved-space
Maxwell equations. The metric reads


    g = −(1 − 2M/r + Q²/r²) dt² + (1 − 2M/r + Q²/r²)^{−1} dr² + r²(dθ² + sin²θ dφ²)   [11]

The analytic extension of the electrovacuum metric
[11] is qualitatively different from the Kruskal extension
of the Schwarzschild metric. In the case Q² > M²
there is a “naked singularity” (visible from r → ∞) at
r = 0 where curvature invariants diverge. If Q² < M²,
the metric describes a (generic) static charged black
hole with two event horizons at r = r_± = M ± (M² −
Q²)^{1/2}. The Killing vector ∂/∂t is null at the horizons,
timelike at r > r_+ and r < r_−, but spacelike between
the horizons. The character of the extended spacetime
is best seen in the compactified form, Figure 2, in
which world-lines of radial light rays are 45° lines.
Again, two infinities (right and left, in regions I and III)
arise (as in the Kruskal-Schwarzschild diagram, see
Stationary Black Holes); however, the maximally
extended geometry consists of an infinite chain of
asymptotically flat regions connected by “wormholes”
between the singularities at r = 0. In contrast to
the Schwarzschild singularity, the singularities are
timelike: they do not block the way to the future.
The inner horizon r = r_− represents a Cauchy horizon
for a typical initial hypersurface like Σ (Figure 2): what
is happening in regions V is in general influenced not
only by data on Σ but also at the singularities. The
Cauchy horizon is unstable (for references, see Bičák
(2000) and recent work by Dafermos (2005)).

Figure 2 The compactified Reissner-Nordström spacetime representing a non-extreme black hole consists of an infinite chain of
asymptotic regions (“universes”) connected by “wormholes” between timelike singularities. The world-line of a shell collapsing from
“universe” I and re-emerging in “universe” I′ is indicated. The inner horizon at r = r_− is the Cauchy horizon for a spacelike hypersurface
Σ. It is unstable and thus it will very likely prevent such a process. Reproduced from Bičák J (2000) Selected solutions of Einstein's
field equations: their role in general relativity and astrophysics. In: Schmidt BG (ed.) Einstein's Field Equations and their Physical
Implications, Lecture Notes in Physics, vol. 540, pp. 1-126. Heidelberg: Springer, with permission from Springer-Verlag.


For $M^2 = Q^2$ the two horizons coincide at $r_+ =
r_- = M$. Metric [11] then describes extreme Reissner-
Nordström black holes. The horizon becomes
degenerate and its surface gravity vanishes (see
Stationary Black Holes). Extreme black holes play
a significant role in string theory (Ortín 2004).


Stationary Axisymmetric Solutions 


Assume the existence of two commuting Killing
vectors, timelike $\xi^\alpha$ and axial $\eta^\alpha$ ($\xi^\alpha\xi_\alpha < 0$, $\eta^\alpha\eta_\alpha > 0$),
$\xi^\alpha$ normalized at (asymptotically flat) infinity, $\eta^\alpha$ at
the rotation axis. They generate two-dimensional orbits
of the group $G_2$. Assume there exist 2-spaces orthogonal
to these orbits. This is true in vacuum and also in
case of electromagnetic fields or perfect fluids whose
4-current or 4-velocity lies in the surfaces of transitivity
of $G_2$ (e.g., toroidal magnetic fields are excluded). The
metric can then be written in Weyl's coordinates
$(t, \rho, \varphi, z)$


$$g = -e^{2U}(dt + A\,d\varphi)^2 + e^{-2U}\left[e^{2k}(d\rho^2 + dz^2) + \rho^2 d\varphi^2\right] \qquad [12]$$


$U$, $k$, and $A$ are functions of $\rho$, $z$.

The most celebrated vacuum solution of the form
[12] is the Kerr metric for which $U$, $k$, $A$ are ratios
of simple polynomials in spheroidal coordinates
(simply related to $(\rho, z)$). The Kerr solution is
characterized by mass $M$ and specific angular
momentum $a$. For $a^2 > M^2$, it describes an asymptotically
flat spacetime with a naked singularity. For
$a^2 < M^2$, it represents a rotating black hole that has
two horizons which coalesce into a degenerate
horizon for $a^2 = M^2$, an extreme Kerr black hole.
The two horizons are located at $r_\pm = M \pm (M^2 -
a^2)^{1/2}$ ($r$ being the Boyer-Lindquist coordinate (see
Stationary Black Holes)). As with the Reissner-
Nordström black hole, the singularity inside is
timelike and the inner horizon is an (unstable)
Cauchy horizon. The analytic extension of the Kerr
metric resembles Figure 2 (see Frolov and Novikov
(1998), Hawking and Ellis (1973), Misner et al.
(1973), Ortín (2004), Semerák et al. (2002),
Stephani et al. (2003), and Wald (1984) for details).
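
The horizon formulas quoted for the Reissner-Nordström and Kerr black holes share the form $r_\pm = M \pm (M^2 - c^2)^{1/2}$, with $c = Q$ or $c = a$ respectively. The following minimal Python sketch evaluates that formula and sorts a parameter pair into the three cases described in the text. The function names `horizon_radii` and `classify` are illustrative choices, and geometric units ($G = c_{\rm light} = 1$) are assumed.

```python
import math


def horizon_radii(mass, param):
    """Return (r_minus, r_plus) for r_pm = M -/+ sqrt(M^2 - param^2).

    `param` stands for the charge Q (Reissner-Nordstrom) or the specific
    angular momentum a (Kerr), in geometric units. Returns None when
    param^2 > M^2, i.e. when no horizon exists (naked singularity).
    """
    disc = mass**2 - param**2
    if disc < 0:
        return None
    root = math.sqrt(disc)
    return (mass - root, mass + root)


def classify(mass, param):
    """Name the case distinguished in the text for the given parameters."""
    radii = horizon_radii(mass, param)
    if radii is None:
        return "naked singularity"
    if radii[0] == radii[1]:
        return "extreme black hole"
    return "black hole with two horizons"
```

For example, `horizon_radii(1.0, 0.6)` gives an inner horizon near 0.2 and an outer horizon near 1.8, while `classify(1.0, 1.0)` reports the extreme case in which the two horizons coincide at $r = M$.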

Thanks to the black hole uniqueness theorems (see
Stationary Black Holes), the Kerr metric is the unique
solution describing all rotating black holes in vacuum.
If the cosmic censorship conjecture holds, Kerr black
holes represent the end states of gravitational collapse
of astronomical objects with supercritical masses.
According to prevalent views, they reside in the nuclei
of most galaxies. Unlike in the case of spherical collapse,
no exact solutions are available that would
represent the formation of a Kerr black hole. However,
starting from metric [12] and identifying, for example,
$z = b = \text{const.}$ and $z = -b$ (with the region $-b < z < b$
being cut off), one can construct thin material disks
which are physically plausible and can be the sources
of the Kerr metric even for $a^2 > M^2$ (see Bičák (2000)
for details).

In a general case of metric [12], EEs in vacuum
imply the “Ernst equation” for a complex function $f$
of $\rho$ and $z$:


$$(\Re f)\left[f_{,\rho\rho} + f_{,zz} + \rho^{-1} f_{,\rho}\right] = f_{,\rho}^2 + f_{,z}^2 \qquad [13]$$


or, equivalently, $(\Re f)\Delta f = (\nabla f)^2$, where $f = e^{2U} + ib$,
$U$ enters [12], and $b(\rho, z)$ is a “potential” for $A(\rho, z)$:
$A_{,\rho} = \rho e^{-4U} b_{,z}$, $A_{,z} = -\rho e^{-4U} b_{,\rho}$; $k(\rho, z)$ in [12] can
be determined from $U$ and $b$ by quadratures.
Tomimatsu and Sato (TS) exploited symmetries of
[13] to construct metrics generalizing the Kerr metric.
Replacing $f$ by $\xi = (1 - f)/(1 + f)$, one finds that in
case of the Kerr metric $\xi^{-1}$ is a linear function in the
prolate spheroidal coordinates, whereas for TS
solutions $\xi$ is a quotient of higher-order polynomials.
A number of other solutions of eqn [13] were found
but they are of lower significance than the Kerr
solution (cf. Stephani et al. (2003), Chapter 20).

These solutions inspired “solution-generating meth- 
ods” in general relativity. The Ernst equation can be 
regarded as the integrability condition of a system of 
linear differential equations. The problem of solving 
such a system can be reformulated as the Riemann- 
Hilbert problem in complex function theory (see 
Riemann-Hilbert Problem and Integrable Systems: 
Overview). We refer to Stephani et al. (2003) and
Belinski and Verdaguer (2001) where these techniques,
using Bäcklund transformations, the inverse-scattering
method, etc., are also applied in the nonstationary
context of two spacelike Killing vectors (waves,
cosmology). In the stationary case, all asymptotically
flat, stationary, axisymmetric vacuum solutions can, in
principle, be generated. It is known how to generate
fields with given values of multipole moments, though
the required calculations are staggering. By solving the
Riemann-Hilbert problem with appropriate boundary
data, Neugebauer and Meinel constructed the exact
solution representing a rigidly rotating thin disk of
dust (cf. Stephani et al. (2003) and Bičák (2000)).

A subclass of metrics [12] is formed by static Weyl
solutions with $A = b = 0$. Equation [13] then
becomes the Laplace equation $\Delta U = 0$. The nonlinearity
of EEs enters only the equations for $k$: $k_{,\rho} =
\rho(U_{,\rho}^2 - U_{,z}^2)$, $k_{,z} = 2\rho U_{,\rho}U_{,z}$. The class contains some
explicit solutions of interest: the “linear superposition”
of collinear particles with string-like
singularities between them which keep the system
in static equilibrium; solutions representing external
fields of counter-rotating disks, for example, those
which are “inspired” by galactic Newtonian potentials;
disks around black holes and some other special
solutions (Stephani et al. 2003, Bonnor 1992, Bičák
2000, Semerák et al. 2002).
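
As a hedged illustration of the static Weyl equations, the sketch below checks numerically that one simple harmonic potential, $U = -m/\sqrt{\rho^2 + z^2}$, together with $k = -m^2\rho^2/[2(\rho^2 + z^2)^2]$, satisfies $\Delta U = 0$, $k_{,\rho} = \rho(U_{,\rho}^2 - U_{,z}^2)$ and $k_{,z} = 2\rho U_{,\rho}U_{,z}$. This particular pair (usually called the Curzon solution) is an example chosen here and is not discussed in the text above. The mass parameter `M_PARAM`, the step sizes and all function names are assumptions made for the illustration.

```python
import math

M_PARAM = 1.0  # hypothetical mass parameter m of the illustrative potential


def U(rho, z):
    """Monopole potential U = -m / sqrt(rho^2 + z^2), harmonic away from the origin."""
    return -M_PARAM / math.sqrt(rho**2 + z**2)


def k(rho, z):
    """Candidate k = -m^2 rho^2 / (2 (rho^2 + z^2)^2) for the potential above."""
    return -M_PARAM**2 * rho**2 / (2.0 * (rho**2 + z**2) ** 2)


def d(f, rho, z, axis, h=1e-4):
    """Central finite-difference first derivative of f along 'rho' or 'z'."""
    if axis == "rho":
        return (f(rho + h, z) - f(rho - h, z)) / (2 * h)
    return (f(rho, z + h) - f(rho, z - h)) / (2 * h)


def laplacian_U(rho, z, h=1e-3):
    """Axisymmetric flat-space Laplacian U_,rho rho + U_,rho / rho + U_,zz."""
    u0 = U(rho, z)
    u_rr = (U(rho + h, z) - 2 * u0 + U(rho - h, z)) / h**2
    u_zz = (U(rho, z + h) - 2 * u0 + U(rho, z - h)) / h**2
    u_r = (U(rho + h, z) - U(rho - h, z)) / (2 * h)
    return u_rr + u_r / rho + u_zz


def residuals(rho, z):
    """Residuals of the Laplace equation and of the two quadrature equations for k."""
    U_r, U_z = d(U, rho, z, "rho"), d(U, rho, z, "z")
    return (
        laplacian_U(rho, z),
        d(k, rho, z, "rho") - rho * (U_r**2 - U_z**2),
        d(k, rho, z, "z") - 2 * rho * U_r * U_z,
    )
```

Evaluating `residuals(1.0, 0.5)` returns three numbers that vanish up to finite-difference error, which shows how the only nonlinear step, the determination of $k$, reduces to quadratures once a harmonic $U$ is given.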

There are solutions of the Einstein-Maxwell equations
representing external fields of masses endowed
with electric charges, magnetic dipole moments, etc.
(Stephani et al. 2003). Best known is the Kerr-
Newman metric characterized by parameters $M$, $a$,
and charge $Q$. For $M^2 > a^2 + Q^2$ it describes a
charged, rotating black hole. Owing to the rotation,
the charged black hole also produces a magnetic field
of a dipole type. All the black hole solutions can be
generalized to include a nonvanishing $\Lambda$ (for various
applications, see Semerák et al. (2002)). Other generalizations
incorporate the so-called Newman-Unti-
Tamburino (NUT) parameter (corresponding to a
“gravomagnetic monopole”) or an “external” magnetic/electric
field or a parameter leading to “uniform”
acceleration (see Stephani et al. (2003) and Bičák
(2000)). Much interest has recently been paid to black
hole (and other) solutions with various types of gauge
fields and to multidimensional solutions. References
Frolov and Novikov (1998) and Ortín (2004) are two
examples of good reviews.


Radiative Solutions 
Plane Waves and Their Collisions 


The best-known class are “plane-fronted gravitational
waves with parallel rays” (pp-waves) which
are defined by the condition that the spacetime
admits a covariantly constant null vector field
$k^\alpha$: $k_{\alpha;\beta} = 0$. In suitable null coordinates $u$, $v$ such
that $k_\alpha = u_{,\alpha}$, $k^\alpha = (\partial/\partial v)^\alpha$, and complex coordinate $\zeta$
which spans the wave 2-surfaces $u = \text{const.}$, $v =
\text{const.}$ with Euclidean geometry, the metric reads


$$g = 2\,d\zeta\,d\bar\zeta - 2\,du\,dv - 2H(u, \zeta, \bar\zeta)\,du^2 \qquad [14]$$


$H(u, \zeta, \bar\zeta)$ is a real function. The vacuum EEs imply
$H_{,\zeta\bar\zeta} = 0$ so that $2H = f(u, \zeta) + \bar f(u, \bar\zeta)$, $f$ is an arbitrary
function of $u$, analytic in $\zeta$. The Weyl tensor satisfies
eqns [5]: the field is of type N, as is the field of plane
electromagnetic waves. In the null tetrad
$\{k^\alpha, l^\alpha, m^\alpha\ \text{(complex)}\}$ with $l^\alpha k_\alpha = -1$, $m^\alpha \bar m_\alpha = 1$, all
other products vanishing, the only nonzero projection
of the Weyl tensor, $\Psi = C_{\alpha\beta\gamma\delta}\,l^\alpha \bar m^\beta l^\gamma \bar m^\delta = H_{,\bar\zeta\bar\zeta}$,
describes the transverse component of a wave propagating
in the $k^\alpha$ direction. Writing $\Psi = \mathcal{A}e^{i\Theta}$, the real
$\mathcal{A} > 0$ is the amplitude of the wave, $\Theta$ describes
polarization. Waves with $\Theta = \text{const.}$ are called linearly
polarized. Considering their effect on test particles,
one finds that plane waves are transverse.


The simplest waves are homogeneous in the sense
that $\Psi$ is constant along the wave surfaces. One gets
$f(u, \zeta) = \tfrac{1}{2}\mathcal{A}(u)e^{i\Theta(u)}\zeta^2$. Instructive are “sandwich
waves," for example, waves with a "square 
profile": A=0 for u < 0 and u > a4^,.A— a? = const. 
for 0 < u < a°. This example demonstrates, within 
exact theory, that the waves travel with the speed of 
light, produce relative accelerations of test particles, 
focus astigmatically generally propagating parallel rays, 
etc. The focusing effects have a remarkable conse- 
quence: there exists no global spacelike hypersurface on 
which initial data could be specified — plane wave 
spacetimes contain no global Cauchy hypersurface. 

“Impulsive” plane waves can be generated by 
boosting a “particle” at rest to the velocity of light by 
an appropriate limiting procedure. The ultrarelativistic 
limit of, for example, the Schwarzschild metric (the so- 
called Aichelburg-Sexl solution) can be employed as a 
“limiting incoming state” in black hole encounters (cf. 
monograph by d’Eath (1996)). Plane-fronted waves 
have been used in quantum field theory. For a review 
of exact impulsive waves, see Semerák et al. (2002). 

A collision of plane waves represents an exceptional 
situation of nonlinear wave interactions which can be 
analyzed exactly. Figure 3 illustrates a typical case in 
which the collision produces a spacelike singularity. The 
initial-value problem with data given at v = 0 and u = 0 
can be formulated in terms of the equivalent matrix 
Riemann-Hilbert problem (see Riemann-Hilbert 
Problem); it is related to the hyperbolic counterpart of 
the Ernst equation [13]. For reviews, see Griffiths
(1991), Stephani et al. (2003), and Bičák (2000).


Cylindrical Waves 


Discovered by G Beck in 1925 and known today as the 
Einstein-Rosen waves (1937), these vacuum solutions 
helped to clarify a number of issues, such as energy loss 
due to the waves, asymptotic structure of radiative 


Figure 3 A spacetime diagram indicating a collision of two plane-
fronted gravitational waves which come from regions II and III, collide
in region I, and produce a spacelike singularity. Region IV is flat.
Reproduced from Bičák J (2000) Selected solutions of Einstein's
field equations: their role in general relativity and astrophysics. In:
Schmidt BG (ed.) Einstein's Field Equations and their Physical
Implications, Lecture Notes in Physics, vol. 540, pp. 1-126.
Heidelberg: Springer, with permission from Springer-Verlag.


spacetimes, dispersion of waves, quasilocal mass-energy,
cosmic censorship conjecture, or quantum
gravity in the context of midisuperspaces (see Bičák
(2000) and Belinski and Verdaguer (2001)).

In the metric


$$g = e^{2(\gamma - \psi)}(-dt^2 + d\rho^2) + e^{2\psi}dz^2 + \rho^2 e^{-2\psi}d\varphi^2 \qquad [15]$$


$\psi(t, \rho)$ satisfies the flat-space wave equation and $\gamma(\rho, t)$
is given in terms of $\psi$ by quadratures. Admitting a
“cross term” $\sim \omega(t, \rho)\,dz\,d\varphi$, one acquires a second
degree of freedom (a second polarization) which
makes all field equations nonlinear.


Boost-Rotation Symmetric Spacetimes 


These are the only explicit solutions available which
are radiative and represent the fields of finite sources.
Figure 4 shows two particles uniformly accelerated in
opposite directions. In the space diagram (left), the
“string” connecting the particles is the “cause” of the
acceleration. In “Cartesian-type” coordinates, with the
$z$-axis chosen as the symmetry axis, the boost Killing
vector has a flat-space form, $\zeta = z(\partial/\partial t) + t(\partial/\partial z)$; the
same is true for the axial Killing vector. The metric
contains two functions of variables $\rho^2 = x^2 + y^2$ and
$\beta^2 = z^2 - t^2$. One satisfies the flat-space wave equation,
the other is determined by quadratures.

The unique role of these solutions is exhibited by the 
theorem which states that in axially symmetric, locally 
asymptotically flat spacetimes, in the sense that a null 
infinity (see Asymptotic Structure and Conformal 
Infinity) exists but not necessarily globally, the only 
additional symmetry that does not exclude gravitational 




Figure 4 Two particles uniformly accelerated in opposite
directions. Orbits of the boost Killing vector (thinner hyperbolas)
are spacelike in the region $t^2 > z^2$. Reproduced from Bičák J
(2000) Selected solutions of Einstein's field equations: their role
in general relativity and astrophysics. In: Schmidt BG (ed.)
Einstein's Field Equations and their Physical Implications,
Lecture Notes in Physics, vol. 540, pp. 1-126. Heidelberg:
Springer, with permission from Springer-Verlag.
radiation is the boost symmetry. Various radiation
characteristics can be expressed explicitly in these
spacetimes. They have been used as tests in numerical
relativity and approximation methods. The best-known
example is the C-metric (representing accelerating black
holes, in general charged and rotating, and admitting $\Lambda$),
see Bonnor et al. (1994), Bičák (2000), Stephani et al.
(2003), and Semerák et al. (2002).


Robinson-Trautman Solutions 


These solutions are algebraically special but in general
they do not possess any symmetry. They are governed
by a function $P(u, \zeta, \bar\zeta)$ ($u$ is the retarded time, $\zeta$ a
complex spatial coordinate) which satisfies a fourth-
order nonlinear parabolic differential equation. Studies
by Chruściel and others have shown that RT
solutions of Petrov type II exist globally for all positive
“times” $u$ and converge asymptotically to a Schwarzschild
metric, though the extension across the
“Schwarzschild-like” horizon can only be made with
a finite degree of smoothness. Generalization to the
cases with $\Lambda > 0$ gives explicit models supporting the
cosmic no-hair conjecture (an exponentially fast
approach to the dS spacetime) under the presence of
gravitational waves. See Bonnor et al. (1994), Bičák
(2000), and Stephani et al. (2003).


Material Sources 


Finding physically sound material sources in an
analytic form even for some simple vacuum metrics
remains an open problem. Nevertheless, there are
solutions representing regions of spacetimes filled
with matter which are of considerable interest.
One of the simplest solutions, the spherically
symmetric Schwarzschild interior solution with
incompressible fluid as its source, represents “a
star” of uniform density, $\rho = \rho_0 = \text{const.}$:


$$g = -\left[\tfrac{3}{2}\sqrt{1 - AR^2} - \tfrac{1}{2}\sqrt{1 - Ar^2}\right]^2 dt^2 + \frac{dr^2}{1 - Ar^2} + r^2(d\theta^2 + \sin^2\theta\,d\varphi^2) \qquad [16]$$


$A = 8\pi\rho_0/3 = \text{const.}$, $R$ is the radius of the star.
The equation of hydrostatic equilibrium yields
pressure inside the star:


$$8\pi p = 3A\,\frac{\sqrt{1 - Ar^2} - \sqrt{1 - AR^2}}{3\sqrt{1 - AR^2} - \sqrt{1 - Ar^2}} \qquad [17]$$


Solution [16] can be matched at $r = R$, where $p = 0$,
to the exterior vacuum Schwarzschild solution [10]
if the Schwarzschild mass $M = (1/2)AR^3$. Although
“incompressible fluid” implies an infinite speed of
sound, the above solution provides an instructive
model of relativistic hydrostatics. A Newtonian star of
uniform density can have an arbitrarily large radius
$R = \sqrt{3p_c/2\pi\rho_0^2}$ and mass $M = (p_c/\rho_0^2)\sqrt{6p_c/\pi}$, $p_c$ is
the central pressure. However, [17] implies that (1) $M$
and $R$ satisfy the inequality $2M/R \le 8/9$, (2) equality
is reached as $p_c$ becomes infinite and $R$ and $M$ attain
their limiting values $R_{\lim} = (3\pi\rho_0)^{-1/2} = (9/4)M_{\lim}$. For
a density typical in neutron stars, $\rho_0 = 10^{15}\,\mathrm{g\,cm^{-3}}$, we
get $M_{\lim} = 3.96\,M_\odot$ ($M_\odot$ solar mass): even this simple
model shows that in Einstein's theory neutron stars can
only be a few solar masses. In addition, one can prove
that “Buchdahl's inequality” $2M/R \le 8/9$ is valid
for an arbitrary equation of state $p = p(\rho)$. Only a
limited mass can thus be contained within a given
radius in general relativity. The gravitational redshift
$z = (1 - 2M/R)^{-1/2} - 1$ from the surface of a static
star cannot be higher than 2.
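
The two bounds just stated can be explored numerically. The sketch below is a minimal illustration under two assumptions: geometric units, and the relation $8\pi\rho_0 = 3A$ that follows from the definition of $A$, so that $p/\rho_0$ equals the ratio of square roots in [17]. It evaluates that ratio at the centre $r = 0$ as a function of the compactness $2M/R = AR^2$, alongside the surface redshift. The function names are illustrative.

```python
import math


def central_pressure_over_density(compactness):
    """p_c / rho_0 at r = 0 for a uniform-density star with compactness = 2M/R.

    Uses the ratio of square roots in [17] with A R^2 = 2M/R; the static
    solution exists only for 0 < 2M/R < 8/9.
    """
    if not 0.0 < compactness < 8.0 / 9.0:
        raise ValueError("static uniform-density star requires 0 < 2M/R < 8/9")
    s = math.sqrt(1.0 - compactness)
    return (1.0 - s) / (3.0 * s - 1.0)


def surface_redshift(compactness):
    """Gravitational redshift z = (1 - 2M/R)^(-1/2) - 1 from the stellar surface."""
    return 1.0 / math.sqrt(1.0 - compactness) - 1.0
```

For small compactness the central pressure approaches the Newtonian value $p_c/\rho_0 \approx (2M/R)/4$, equivalent to the radius formula quoted above; it grows without bound as $2M/R \to 8/9$, where the surface redshift reaches its maximum value of 2.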

Many other explicit static perfect fluid solutions
are known (we refer to Stephani et al. (2003) for a
list); however, none of them can be considered as
really “physical.” Recently, the dynamical systems
approach to relativistic spherically symmetric static
perfect fluid models was developed by Uggla and
others, which gives qualitative characteristics of
masses and radii.

The most significant nonstatic spacetime describing 
a bounded region of matter and its external field is 
undoubtedly the Oppenheimer—Snyder model of 
“gravitational collapse of a spherical star” of uniform 
density and zero pressure (a “ball of dust"). The model 
does not represent any new (local) solution: the interior 
of the star is described by a part of a dust-filled FLRW 
universe (cf. [8]), the external region by the Schwarzs- 
child vacuum metric (cf. eqn [10], Figure 1). 

Since Vaidya's discovery of a “radiating Schwarzschild
metric,” null dust (“pure radiation field”) has
been widely used as a simple matter source. Its
energy-momentum tensor, $T_{\alpha\beta} = \varrho k_\alpha k_\beta$, where
$k_\alpha k^\alpha = 0$, may be interpreted as an incoherent
superposition of waves with random phases and
polarizations moving in a single direction, or as
“lightlike particles” (photons, neutrinos, gravitons)
that move along $k^\alpha$. The “Vaidya metric” describing
spherical implosion of null dust implies that in case
of a “gentle” inflow of the dust, a naked singularity
forms. This is relevant in the context of the cosmic
censorship conjecture (cf., e.g., Joshi (1993)).


Cosmological Models 


There exist important generalizations of the standard
FLRW models other than the above-mentioned
Bianchi models, particularly those that maintain
spherical symmetry but do not require homogeneity.
The best known are the Lemaître-Tolman-Bondi
models of inhomogeneous universes of pure dust,
the density of which may vary (Krasinski 1997).
Other explicit cosmological models of principal
interest involve, for example, the Gödel universe, a
homogeneous, stationary spacetime with $\Lambda < 0$ and
incoherent rotating matter in which there exist
closed timelike curves through every point; the
Kantowski-Sachs solutions, possessing homogeneous
spacelike hypersurfaces but (in contrast to
the Bianchi models) admitting no simply transitive
$G_3$; and vacuum Gowdy models (“generalized
Einstein-Rosen waves”) admitting $G_2$ with compact
2-tori as its group orbits and representing cosmological
models closed by gravitational waves. See
Cosmology: Mathematical Aspects and references
Stephani et al. (2003), Belinski and Verdaguer
(2001), Bičák (2000), Hawking and Ellis (1973),
Krasinski (1997) and Wainwright and Ellis (1997).


See also: AdS/CFT Correspondence; Asymptotic 
Structure and Conformal Infinity; Cosmology: 
Mathematical Aspects; Dirac Fields in Gravitation and 
Nonabelian Gauge Theory; Einstein Manifolds; Einstein’s 
Equations with Matter; General Relativity: Experimental 
Tests; General Relativity: Overview; Hamiltonian 
Reduction of Einstein’s Equations; Integrable Systems: 
Overview; Newtonian Limit of General Relativity; 
Pseudo-Riemannian Nilpotent Lie Groups; 
Riemann-Hilbert Problem; Spacetime Topology, Causal
Structure and Singularities; Spinors and Spin 
Coefficients; Stability of Minkowski Space; Stationary 
Black Holes; Twistor Theory: Some Applications. 


Further Reading 


Anderson MT (2000) On the structure of solutions to the static 
vacuum Einstein equations. Annales Henri Poincaré 1: 995-1042. 

Belinski V and Verdaguer E (2001) Gravitational Solitons. 
Cambridge: Cambridge University Press. 

Bičák J (2000) Selected solutions of Einstein’s field equations:
their role in general relativity and astrophysics. In: Schmidt
BG (ed.) Einstein's Field Equations and Their Physical
Implications, Lecture Notes in Physics, vol. 540, pp. 1-126,
(see also gr-qc/0004016). Heidelberg: Springer.

Bonnor WB (1992) Physical interpretation of vacuum solutions of 
Einstein's equations. Part I. Time-independent solutions. 
General Relativity and Gravitation 24: 551—574. 

Bonnor WB, Griffiths JB, and MacCallum MAH (1994) Physical
interpretation of vacuum solutions of Einstein's equations.
Part II. Time-dependent solutions. General Relativity and
Gravitation 26: 687-729.

Dafermos M (2005) The interior of charged black holes and the 
problem of uniqueness in general relativity. Communications 
on Pure and Applied Mathematics LVIII: 445—504. 

D’Eath PD (1996) Black Holes: Gravitational Interactions. 
Oxford: Clarendon Press. 

Frolov VP and Novikov ID (1998) Black Hole Physics. 
Dordrecht: Kluwer Academic. 

Griffiths JB (1991) Colliding Plane Waves in General Relativity. 
Oxford: Oxford University Press. 


Hall GS (2004) Symmetries and Curvature Structure in General 
Relativity. Singapore: World Scientific. 

Hawking SW and Ellis GFR (1973) The Large Scale Structure of 
Space-Time. Cambridge: Cambridge University Press. 

Joshi PS (1993) Global Aspects in Gravitation and Cosmology. 
Oxford: Clarendon. 

Krasinski A (1997) Inhomogeneous Cosmological Models. 
Cambridge: Cambridge University Press. 

Misner C, Thorne KS, and Wheeler JA (1973) Gravitation. 
San Francisco: WH Freeman. 

Ortín T (2004) Gravity and Strings. Cambridge: Cambridge
University Press.




Semerák O, Podolský J, and Žofka M (eds.) (2002)
Gravitation: Following the Prague Inspiration. Singapore:
World Scientific.

Stephani H, Kramer D, MacCallum MAH, Hoenselaers C, and 
Herlt E (2003) Exact Solutions of Einstein’s Field Equa- 
tions — Second Edition. Cambridge: Cambridge University 
Press. 

Wainwright J and Ellis GFR (eds.) (1997) Dynamical Systems in 
Cosmology. Cambridge: Cambridge University Press. 

Wald RM (1984) General Relativity. Chicago: The University of 
Chicago Press. 


Einstein Equations: Initial Value Formulation


J Isenberg, University of Oregon, Eugene, OR, USA
© 2006 Elsevier Ltd. All rights reserved.


Introduction 


Einstein’s theory of gravity models a gravitating
physical system $S$ using a spacetime $(M^4, g, \psi)$ which
satisfies the Einstein field equations


$$G_{\mu\nu}(g) = \kappa T_{\mu\nu}(g, \psi) \qquad [1]$$
$$\mathcal{F}(g, \psi) = 0 \qquad [2]$$


Here, $M^4$ is a four-dimensional spacetime manifold, $g$
is a Lorentz signature metric on $M^4$, $\psi$ represents the
nongravitational (“matter”) fields of interest, $G_{\mu\nu} :=
R_{\mu\nu} - (1/2)g_{\mu\nu}R$ is the Einstein curvature tensor, $\kappa$ is a
constant, $T_{\mu\nu}$ is the stress-energy tensor for the field $\psi$,
and $\mathcal{F} = 0$ represents the nongravitational field equations
(e.g., $\nabla_\mu F^{\mu\nu} = 0$ for the Einstein-Maxwell theory).

By far the most widely used way to obtain and to
study spacetime solutions $(M^4, g, \psi)$ of equations
[1]-[2] is via the initial-value (or Cauchy) formulation.
The idea is as follows:


1. One chooses a set of initial data $D$ which consists of
geometric as well as matter information on a
spacelike slice of $M^4$. This data must satisfy a system
of constraint equations, which comprise a portion of
the field equations [1]-[2], and are analogous to the
Maxwell constraint equation $\nabla \cdot E = 0$.

2. One fixes a time and coordinate choice to be used in
evolving the fields into the spacetime (e.g., maximal
time slicing and zero shift). This choice should result
in a fixed set of evolution equations for the data.

3. Using the evolution equations, one evolves the data
into the future and the past. From the evolved data,
one constructs the spacetime solution $(M^4, g, \psi)$.


Why is this procedure so popular? First, because
we have known for over 50 years that at least for
a short time, it works. That is, as shown by
Choquet-Bruhat (Fourès-Bruhat 1952), the Cauchy
formulation is well posed. Second, because it fits with
the way we like to model physical systems. That is, 
we first specify what the system is like now, and we 
then use the equations to determine the behavior of 
the system as it evolves into the future (or the past). 
Third, because the formulation is eminently amenable 
to numerical treatment. Indeed, virtually all numer- 
ical simulations of colliding black hole systems as 
well as of most other relativistic astrophysical systems 
are done using some version of the initial-value 
formulation. Finally, because the initial-value formu- 
lation casts the Einstein equations into a form which 
is readily accessible to many of the tools of geometric 
analysis. Questions such as cosmic censorship are 
turned into conjectures which can be analyzed and 
proved mathematically, and the proofs of both the 
positivity of mass and the Penrose mass inequality 
rely on an initial-value interpretation. 

There are of course drawbacks to the Cauchy 
formulation. Foremost, Einstein’s theory of general 
relativity is inherently a spacetime-covariant theory; 
why break spacetime apart into space plus time when 
covariance has played such a key role in the theory's 
success? As well, we have learned over and over again 
that null cones and null hypersurfaces play a major 
role in general relativity; the initial-value formulation 
is not especially good at handling them. These draw- 
backs show that there are analyses in general relativity 
for which the initial-value formulation may not be well 
suited. However, there is a preponderance of applica- 
tions for which this formulation is an invaluable tool, 
as evidenced by its ubiquitous use. 

A complete treatment of the initial-value formula- 
tion for Einstein’s equations would include discus- 
sion of each of the following topics: 


1. A statement and proof of well-posedness theo- 
rems, including a discussion of the regularity of 
the data needed for such results. 

2. A space + time decomposition of the fields, and a 
formal derivation of the Einstein constraint 
equations and the Einstein evolution equations. 




3. An outline of the Hamiltonian version of the 
initial-value formulation. 

4. A listing of those choices of field variables and 
gauge choices for which the system is manifestly 
hyperbolic. 

5. A description of the known methods for finding 
and parametrizing solutions of the Einstein 
constraint equations. 

6. A comparison of the virtues and drawbacks of 
various choices of time foliation and coordinate 
threading. 

7. A compendium of results concerning long-time 
behavior of solutions. 

8. An account of the difficulties which arise in 
attempts to construct solutions numerically 
using the Cauchy formulation. 

9. A recounting of cases in which the initial-value 
formulation has been used to model physically 
interesting systems. 

10. A note regarding the extent to which the initial- 
value formulation (and the various aspects of it 
just enumerated) generalize to dimensions other 
than 3 + 1 (three space and one time). 

11. A determination of which nongravitational 
fields may be coupled to Einstein’s theory in 
such a way that the resulting coupled theory 
admits an initial-value formulation. 


We do not have the space here for such a 
complete treatment. So we choose to focus on 
those topics directly related to the Einstein con- 
straint equations. Generalizing a bit to the Einstein- 
Maxwell theory (thereby including representative 
nongravitational fields), we first carry out the space 
plus time “3 + 1” decomposition of the gravitational 
and electromagnetic fields. Then, applying the 
Gauss-Codazzi-Mainardi equations to the space- 
time curvature, we turn the spacetime-covariant 
Einstein-Maxwell equations into a set of constraint 
equations restricting the choice of initial data 
together with a set of evolution equations develop- 
ing the data in time. Next, we discuss the most 
widely used approach for obtaining sets of initial 
data which satisfy the constraint equations: the 
conformal method. We include in this discussion 
an account of some of what is known about the 
extent to which the equations which are produced 
by the conformal method admit solutions in various 
situations (e.g., working on a closed manifold, or 
working with asymptotically Euclidean data). We 
then discuss alternate procedures which have been 
used to obtain and analyze solutions of the 
constraints, including the conformal thin sandwich 
approach, the quasispherical method, and various 
gluing procedures. Finally, we make concluding
remarks. For more details on some of the topics
discussed here, and for treatment of some of the
other topics listed above, see the recent review paper
of Bartnik and Isenberg (2004).


Space + Time Field Decomposition
and Derivation of the Constraint
Equations


To understand what sort of initial data one needs to
choose in order to construct a spacetime via the initial-
value formulation, it is useful to consider a spacetime
$(M^4, g)$ which satisfies the Einstein (-Maxwell) field
equations and contains a Cauchy surface $i_0 : \Sigma^3 \to M^4$.
We note that the existence of a Cauchy surface in
$(M^4, g, A)$ is not automatic; if one exists, the spacetime
is said to be (by definition) “globally hyperbolic.”¹

Among its other properties, a Cauchy surface is a
spacelike embedded submanifold of a Lorentz
geometry. It immediately follows that the spacetime
$(M^4, g, A)$ induces on $\Sigma^3$ a Riemannian metric $\gamma$,
a timelike normal vector field $e_\perp$, an intrinsic
($\gamma$-compatible) covariant derivative $\nabla$, and a symmetric
“extrinsic curvature” tensor field $K$ (second
fundamental form). It also follows that certain
components of the spacetime curvature tensor can
be written in terms of these Cauchy surface
quantities $(\gamma, e_\perp, \nabla, K)$ along with other geometric
quantities related to them, such as the spatial
curvature $R$ corresponding to the induced covariant
derivative $\nabla$ (Gauss-Codazzi equations).

To complete the curvature 3 + 1 decomposition (i.e.,
to carry it out for all components of the spacetime
curvature), we need not just one Cauchy surface, but
rather a full local foliation $i_t : \Sigma^3 \to M^4$ of the spacetime
by such submanifolds. This foliation allows one to
define $e_\perp$ as a smooth vector field on an open
neighborhood of the Cauchy surface $i_0(\Sigma^3)$ in $M^4$. It
also results in a threading of spacetime by a congruence
of timelike paths (see Figure 1). This threading may be
viewed as a spacetime-filling family of observers. It also
defines for the spacetime a set of coordinates relative to
which one can measure and calculate the dynamics of
the spacetime geometry.

It is useful for later purposes to note that at each
spacetime point $p \in \Sigma_t \subset M^4$ (here $\Sigma_t := i_t(\Sigma^3)$) the
vector $\partial/\partial t$ tangent to the threading path through $p$
may be decomposed as


$$\frac{\partial}{\partial t} = N e_\perp + X \qquad [3]$$


¹The Taub-NUT spacetime is an example of a spacetime which is
not globally hyperbolic.




Figure 1 3+1 Foliation and threading of spacetime. 


with the “shift vector” $X$ tangent to the surface ($X \in
T_p\Sigma_t$), and with the “lapse” $N$ a scalar (see Figure 2).
Using these quantities, we can write the spacetime
metric in the form


$$g = \gamma - \theta \otimes \theta = \gamma_{ab}(dx^a + X^a dt)(dx^b + X^b dt) - N^2 dt^2 \qquad [4]$$


where $\theta$ is the unit length timelike 1-form which
annihilates all vectors tangent to the hypersurfaces
of the foliation.
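
To make the decomposition concrete, the following Python sketch assembles the components $g_{\mu\nu}$ implied by [4] from a lapse, a shift and a spatial metric. It then checks that the vector $e_\perp = (\partial/\partial t - X)/N$, obtained by solving [3], has unit timelike norm and is orthogonal to the coordinate vectors $\partial_a$ tangent to the slice. The function names and the sample numbers are assumptions made for illustration.

```python
def spacetime_metric(lapse, shift, gamma):
    """Components g_{mu nu} (index 0 = t) of [4] from lapse N, shift X^a and gamma_ab."""
    n = len(shift)
    # Lower the shift index with the spatial metric: X_a = gamma_ab X^b.
    shift_low = [sum(gamma[a][b] * shift[b] for b in range(n)) for a in range(n)]
    g = [[0.0] * (n + 1) for _ in range(n + 1)]
    g[0][0] = -lapse**2 + sum(shift_low[a] * shift[a] for a in range(n))
    for a in range(n):
        g[0][a + 1] = g[a + 1][0] = shift_low[a]
        for b in range(n):
            g[a + 1][b + 1] = gamma[a][b]
    return g


def unit_normal(lapse, shift):
    """Components of e_perp = (d/dt - X) / N in the coordinate basis (t, x^a)."""
    return [1.0 / lapse] + [-x / lapse for x in shift]


def inner(g, u, v):
    """Inner product g(u, v) of two vectors given by their components."""
    return sum(g[i][j] * u[i] * v[j] for i in range(len(u)) for j in range(len(v)))
```

With, say, `N = 2.0`, `X = [0.3, -0.1, 0.5]` and any positive-definite symmetric `gamma`, one finds `inner(g, e, e)` equal to $-1$ and `inner(g, e, basis_a)` equal to zero for each spatial basis vector, up to rounding. This is the statement that $N$ and $X$ carry only the choice of time threading, while $\gamma$ carries the intrinsic geometry of the slice.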

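Relying on the following 3 + 1 decomposition of
the spacetime-covariant derivative ${}^4\nabla$ (here $\{\partial_a\}$ is a
coordinate basis for the vectors tangent to the
hypersurfaces of the foliation; $\{\partial_a, e_\perp\}$ constitutes a
basis for the full set of spacetime vectors at $p$):


$${}^4\nabla_{\partial_a}\partial_b = \nabla_{\partial_a}\partial_b - K_{ab}\,e_\perp \qquad [5]$$
$${}^4\nabla_{\partial_a}e_\perp = -K^m{}_a\,\partial_m \qquad [6]$$
$${}^4\nabla_{e_\perp}\partial_a = -K^m{}_a\,\partial_m + [e_\perp, \partial_a] + \frac{\partial_a N}{N}\,e_\perp \qquad [7]$$
$${}^4\nabla_{e_\perp}e_\perp = \gamma^{mn}\,\frac{\partial_n N}{N}\,\partial_m \qquad [8]$$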


Figure 2 Decomposition of the time evolution vector field $\partial/\partial t$.
one readily derives the (Gauss-Codazzi-Mainardi)
3 + 1 decomposition of the curvature:²


$${}^4R^a{}_{bcd} = R^a{}_{bcd} + K^a{}_c K_{bd} - K^a{}_d K_{bc} \qquad [9]$$
$${}^4R^\perp{}_{abc} = \nabla_{\partial_c}K_{ab} - \nabla_{\partial_b}K_{ac} \qquad [10]$$
$${}^4R^\perp{}_{a\perp b} = -\mathcal{L}_{e_\perp}K_{ab} - K_{am}K^m{}_b - \frac{\nabla_a\nabla_b N}{N} \qquad [11]$$


where $\mathcal{L}$ denotes the surface-projected Lie derivative.

Since we are interested here in the 3 + 1 formula- 
tion of the Einstein-Maxwell system, we need a 3 + 1 
decomposition for the electromagnetic as well as the 
gravitational field. The spacetime 1-form “vector
potential” ${}^4A$ pulls back on each Cauchy surface $\Sigma_t$
to a spatial 1-form $A$. One may then write


$${}^4A = A + \mu\,\theta^\perp = A_m dx^m + (N\mu + A_m X^m)\,dt \qquad [12]$$


for a scalar $\mu$. Based on this decomposition, one has
the following 3+1 decomposition for the electro- 
magnetic 2-form F: 


$${}^4F_{\perp a} = \gamma_{ab} E^b \qquad [13]$$
$${}^4F_{ab} = \nabla_{\partial_a} A_b - \nabla_{\partial_b} A_a \qquad [14]$$


where $E^a$ is the electric vector field.

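We may now use all of these decomposition
formulas to write out the 14 field equations for the
Einstein–Maxwell theory


$${}^4G_{\alpha\beta} = {}^4F_{\alpha\mu}\,{}^4F_\beta{}^{\mu} - \tfrac14\, g_{\alpha\beta}\,{}^4F_{\mu\nu}\,{}^4F^{\mu\nu} \qquad [15]$$
$${}^4\nabla_\mu\,{}^4F^{\mu\nu} = 0 \qquad [16]$$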


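in terms of the spatial fields $(\gamma, K, N, X, A, E, \mu)$ and
their derivatives. We obtain


$$R - K^{mn}K_{mn} + (\mathrm{tr}\,K)^2 = \tfrac12 E^m E_m + \tfrac12 B^m B_m \qquad [17]$$
$$\nabla_{\partial_m} K^m{}_a - \nabla_{\partial_a}(\mathrm{tr}\,K) = \epsilon_{amn} E^m B^n \qquad [18]$$
$$\mathcal{L}_{e_\perp} K_{ab} = R_{ab} - 2K_a{}^m K_{mb} + (\mathrm{tr}\,K) K_{ab} + E_a E_b + B_a B_b - \frac{1}{N}\nabla_{\partial_a}\nabla_{\partial_b} N \qquad [19]$$
$$\nabla_{\partial_m} E^m = 0 \qquad [20]$$
$$\mathcal{L}_{e_\perp} E^a = \epsilon^{amn}\nabla_{\partial_m} B_n \qquad [21]$$


where $\epsilon_{abc}$ is the alternating Levi-Civita symbol
(component representation of the Hodge dual), and
where we have used $B_a := \epsilon_a{}^{mn}(\nabla_{\partial_m} A_n - \nabla_{\partial_n} A_m)$ as a
convenient shorthand.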


²Here and throughout this article, we use the Misner–Thorne–Wheeler
(MTW) (Misner et al. 1973) conventions for the
definition of the Riemann curvature, for the signature $(-+++)$
of the metric, for the index labels (Greek indices run over
$\{0, 1, 2, 3\}$ while Latin indices run over $\{1, 2, 3\}$), etc.


It is immediately evident that nine of these equations
([19] and [21]) involve time derivatives of the spatial
fields, while five of them ([17], [18], and [20]) do not.
Thus, we may split the field equations of the Einstein–Maxwell
theory into two sets: (1) the constraint
equations [17], [18], and [20], which restrict our
choice of the Einstein–Maxwell initial data
$(\gamma, K, A, E)$; and (2) the evolution equations, which
describe how to evolve the data $(\gamma, K, A, E)$ in time,
presuming that one has also prescribed (freely!) the
“atlas fields” $(N, X, \mu)$.³ We note that the complete
system of evolution equations for the Einstein–Maxwell
field equations includes equations which are
based on the definitions of $K$ and $E$. Written in terms
of (surface-projected) Lie derivatives along $\partial/\partial t$, the
full system takes the form


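$$\mathcal{L}_{\partial_t}\gamma_{ab} = -2N K_{ab} + \mathcal{L}_X \gamma_{ab} \qquad [22]$$
$$\mathcal{L}_{\partial_t} K_{ab} = N\left(R_{ab} - 2K_a{}^m K_{mb} + (\mathrm{tr}\,K) K_{ab} + E_a E_b + B_a B_b\right) - \nabla_{\partial_a}\nabla_{\partial_b} N + \mathcal{L}_X K_{ab} \qquad [23]$$
$$\mathcal{L}_{\partial_t} A_a = N\left(E_a + \frac{1}{N}\nabla_{\partial_a}(N\mu)\right) + \mathcal{L}_X A_a \qquad [24]$$
$$\mathcal{L}_{\partial_t} E^a = N\,\epsilon^{amn}\nabla_{\partial_m} B_n + \mathcal{L}_X E^a \qquad [25]$$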


As noted earlier, well-posedness theorems⁴ guarantee
that initial data satisfying the constraint
equations [17], [18], and [20] on a manifold $\Sigma^3$
can always at least locally be evolved into a
spacetime solution $(\Sigma^3 \times I, g, {}^4A)$ (for $I$ some interval
in $\mathbb{R}^1$) of the Einstein–Maxwell equations. We
now turn our attention to the issue of finding sets of
data which do satisfy the constraints.


The Conformal Method 


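We seek to find sets of data $(\gamma, K, A, E)$ on a
manifold $\Sigma^3$ which satisfy the constraint equations


$$R - K^{mn}K_{mn} + (\mathrm{tr}\,K)^2 = \tfrac12 E^m E_m + \tfrac12 B^m B_m \qquad [26]$$
$$\nabla_m K^m{}_a - \nabla_a(\mathrm{tr}\,K) = \epsilon_{amn} E^m B^n \qquad [27]$$
$$\nabla_m E^m = 0 \qquad [28]$$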


>The collective name “atlas field" for the lapse N, the shift X, the 
electric potential $\mu$, and other such fields which are neither
constrained by the constraint equations nor evolved by the
evolution equations, derives from their role in controlling the
evolution of coordinate charts and bundle atlases in the course of
the construction of spacetime solutions of relativistic field
equations like the Einstein–Maxwell system.

⁴While the work cited earlier (Fourès-Bruhat 1952) proves
well-posedness for the vacuum Einstein equations only, the extension
to the Einstein–Maxwell system is straightforward.


(Here and below, for convenience, we replace $\nabla_{\partial_a}$
by $\nabla_a$.) This is an underdetermined problem, with
five equations to be solved for 18 functions.
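A quick bookkeeping sketch of this count may help; the grouping by independent components is ours (a symmetric $3 \times 3$ tensor carries six functions, a vector or 1-form three).

```latex
\underbrace{6}_{\gamma_{ab}} + \underbrace{6}_{K_{ab}}
  + \underbrace{3}_{A_a} + \underbrace{3}_{E^a} = 18
\quad\text{unknown functions},
\qquad
\underbrace{1}_{[26]} + \underbrace{3}_{[27]} + \underbrace{1}_{[28]} = 5
\quad\text{equations}.
```

On this count, 13 functions' worth of freedom remains once the five constraints are imposed.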

The idea of the conformal method is to divide the
initial data on $\Sigma^3$ into two sets, the “free (conformal)
data” and the “determined data,” in such a way that,
for a given choice of the free data, the constraint
equations become a determined elliptic partial differential
equation (PDE) system, to be solved for the
determined data. There are a number of ways to do this;
we focus here on one of them, the “semidecoupling
split” or “method A.” After describing this version of
the conformal method, and discussing what one can do 
with it, we note some of its drawbacks and then later (in 
the next section) consider some alternatives. (See 
Choquet-Bruhat and York (1980) and Bartnik and 
Isenberg (2004) for a more complete discussion of these 
alternatives.) 

For the Einstein-Maxwell theory, the split of the 
initial data is as follows: 


Free (“conformal”) data

$\lambda_{ij}$ – a Riemannian metric, specified up to
conformal factor;

$\sigma_{ij}$ – a divergence-free⁵ ($\nabla^i\sigma_{ij} = 0$), tracefree
($\lambda^{ij}\sigma_{ij} = 0$) symmetric tensor;

$\tau$ – a scalar field;

$\alpha_a$ – a 1-form;

$\mathcal{E}^a$ – a divergence-free vector field;

Determined data

$\phi$ – a positive-definite scalar field;

$W^a$ – a vector field;

$\xi$ – a scalar field.


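For a given choice of the free data, the five
equations to be solved for the five functions of the
determined data take the form


$$\Delta\xi = 0 \qquad [29]$$
$$\nabla_m (LW)^m{}_a = \tfrac23\,\phi^6\,\nabla_a\tau + \epsilon_{amn}\,\mathcal{E}^m \beta^n \qquad [30]$$
$$\Delta\phi = \tfrac18 R\,\phi - \tfrac18\,(\sigma^{mn} + LW^{mn})(\sigma_{mn} + LW_{mn})\,\phi^{-7} - \tfrac1{16}\,(\mathcal{E}^m\mathcal{E}_m + \beta^m\beta_m)\,\phi^{-3} + \tfrac1{12}\,\tau^2\phi^5 \qquad [31]$$


where the Laplacian $\Delta$ and the scalar curvature $R$
are based on the $\lambda_{ab}$-compatible covariant derivative
$\nabla_a$, where $L$ is the corresponding conformal Killing
operator, defined by


$$(LW)_{ab} := \nabla_a W_b + \nabla_b W_a - \tfrac23\,\lambda_{ab}\,\nabla_m W^m \qquad [32]$$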


⁵In the free data, the divergence-free condition is defined using the
Levi-Civita covariant derivative compatible with the conformal
metric $\lambda_{ij}$.


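and where $\beta_a := \epsilon_a{}^{mn}(\nabla_m\alpha_n - \nabla_n\alpha_m)$. Presuming
that for the chosen free data one can indeed solve
equations [29]-[31] for $\xi$, $\phi$, and $W$, then the initial
data $(\gamma, K, A, E)$ constructed via the formulas


$$\gamma_{ab} = \phi^4\lambda_{ab} \qquad [33]$$
$$K_{ab} = \phi^{-2}(\sigma_{ab} + LW_{ab}) + \tfrac13\,\phi^4\lambda_{ab}\,\tau \qquad [34]$$
$$A_a = \alpha_a \qquad [35]$$
$$E^a = \phi^{-6}(\mathcal{E}^a + \nabla^a\xi) \qquad [36]$$


satisfy the Einstein–Maxwell constraint equations
[26]-[28].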
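The reconstruction formulas for the metric and the second fundamental form can be checked pointwise with a few lines of code. The sketch below assumes, as in the text, that the physical metric is $\phi^4$ times the conformal metric and that the physical $K$ is $\phi^{-2}$ times a tracefree tensor plus $\tfrac13\phi^4\lambda\,\tau$; it then verifies that the physical mean curvature $\gamma^{ab}K_{ab}$ equals $\tau$. All names and sample numbers are hypothetical.

```python
def inverse3(m):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    determinant = a * adj[0][0] + b * adj[1][0] + c * adj[2][0]
    return [[adj[r][s] / determinant for s in range(3)] for r in range(3)]


def trace(metric, tensor):
    """Trace of a covariant 2-tensor with respect to a metric."""
    inv = inverse3(metric)
    return sum(inv[a][b] * tensor[a][b] for a in range(3) for b in range(3))


def physical_data(phi, tau, lam, tracefree):
    """gamma_ab = phi^4 lam_ab and K_ab = phi^-2 S_ab + (1/3) phi^4 lam_ab tau.

    `tracefree` plays the role of S_ab = sigma_ab + (LW)_ab and is assumed
    to be tracefree with respect to lam.
    """
    gamma = [[phi ** 4 * lam[a][b] for b in range(3)] for a in range(3)]
    K = [[phi ** -2 * tracefree[a][b] + phi ** 4 * lam[a][b] * tau / 3.0
          for b in range(3)] for a in range(3)]
    return gamma, K


# Hypothetical values at a single point.
lam = [[1.0, 0.2, 0.0], [0.2, 2.0, 0.1], [0.0, 0.1, 1.5]]
raw = [[0.3, -0.1, 0.4], [-0.1, 0.7, 0.0], [0.4, 0.0, -0.2]]
# Remove the lam-trace so that S_ab is tracefree with respect to lam.
raw_trace = trace(lam, raw)
S = [[raw[a][b] - raw_trace * lam[a][b] / 3.0 for b in range(3)] for a in range(3)]

gamma, K = physical_data(phi=1.7, tau=0.9, lam=lam, tracefree=S)
mean_curvature = trace(gamma, K)
```

The tracefree part drops out of the trace for any positive conformal factor, which is why the scalar $\tau$ in the free data ends up being the mean curvature of the physical data.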

Before discussing the extent to which one can 
solve equations [29]-[31] and consequently use the 
conformal method to generate solutions, we wish to 
comment on how these equations are derived. Three 
formulas are key to this derivation. The first is the 
formula for the scalar curvature of the metric
$\gamma_{ab} = \phi^4\lambda_{ab}$, expressed in terms of the scalar curvature
for $\lambda_{ab}$ and derivatives of $\phi$:


$$R(\gamma) = \phi^{-4}R(\lambda) - 8\,\phi^{-5}\Delta\phi \qquad [37]$$


We note that if we were to use a different power of $\phi$ as
the conformal factor multiplying $\lambda_{ab}$, then this formula
would involve squares of first derivatives of $\phi$ as well.
The second key formula relates the divergence of a
traceless symmetric tensor $\rho_{ab}$ with respect to the
covariant derivatives $\nabla_{(\gamma)}$ and $\nabla_{(\lambda)}$ compatible with
conformally related metrics. One obtains


$$\nabla^m_{(\gamma)}\rho_{mb} = \phi^{-6}\,\nabla^m_{(\lambda)}(\phi^2\rho_{mb}) \qquad [38]$$


The third key formula does the same thing for a
vector field $\zeta^a$:


$$\nabla_{(\gamma)m}\,\zeta^m = \phi^{-6}\,\nabla_{(\lambda)m}(\phi^6\zeta^m) \qquad [39]$$


In addition to helping us derive equations [29]-[31] 
from the substitution of formulas [33]-[36] into 
[26]-[28], these key formulas indicate to some 
extent how the choice of the explicit decomposition 
of the initial data into free and determined data is 
made (see Isenberg, Maxwell, and Pollack for 
further elaboration). 
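To see where the exponent 4 in the first key formula [37] comes from, one can start from the standard conformal transformation law for scalar curvature in three dimensions. The sketch below writes the conformal factor first as $e^{2f}$ and then as a general power $\phi^p$; the symbols $f$ and $p$ are auxiliary and do not appear in the text above.

```latex
% Three dimensions, conformal factor e^{2f}:
R(e^{2f}\lambda) = e^{-2f}\left( R(\lambda) - 4\,\Delta f - 2\,|\nabla f|^2 \right).
% Put e^{2f} = \phi^{p}, i.e. f = (p/2)\ln\phi:
\Delta f = \frac{p}{2}\left(\frac{\Delta\phi}{\phi} - \frac{|\nabla\phi|^2}{\phi^2}\right),
\qquad
|\nabla f|^2 = \frac{p^2}{4}\,\frac{|\nabla\phi|^2}{\phi^2},
% so that
R(\phi^{p}\lambda) = \phi^{-p}\left( R(\lambda) - 2p\,\frac{\Delta\phi}{\phi}
   + \frac{p\,(4-p)}{2}\,\frac{|\nabla\phi|^2}{\phi^2} \right).
```

The squared-gradient term vanishes only for $p = 4$ (apart from the trivial $p = 0$), and in that case the expression reduces to $\phi^{-4}R(\lambda) - 8\phi^{-5}\Delta\phi$.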

It is easy to see that there are some choices of the
free data for which [29]-[31] do not admit any
solutions. Let us choose, for example, $\Sigma^3$ to be the
3-sphere, and let us set $\lambda$ to be the round sphere
metric, $\sigma$ to be zero everywhere, $\tau$ to be unity
everywhere, and both $\alpha$ and $\mathcal{E}$ to vanish everywhere.
We then readily determine that eqn [29] requires
that $\xi$ be constant and that eqn [30] requires that
$LW_{ab}$ be zero. The remaining equation [31] now
takes the form $\Delta\phi = (1/8)R\phi + (1/12)\phi^5$. Since the
right-hand side of this equation is positive definite
(recall the requirement that $\phi > 0$), it follows from
the maximum principle on closed (compact without
boundary) manifolds that there is no solution.
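The nonexistence claim can also be seen by a short integration argument, which is a common way of applying the maximum principle on closed manifolds. The sketch assumes the unit round 3-sphere, for which $R = 6$; the volume element $dv$ is that of the round metric.

```latex
\int_{S^3} \Delta\phi \, dv = 0
\quad\text{(divergence theorem, no boundary)},
\qquad\text{whereas}\qquad
\int_{S^3} \left( \tfrac18 R\,\phi + \tfrac1{12}\,\phi^5 \right) dv
 = \int_{S^3} \left( \tfrac34\,\phi + \tfrac1{12}\,\phi^5 \right) dv > 0
\quad\text{for } \phi > 0 .
```

The two sides of the reduced equation therefore cannot agree for any positive function on the sphere.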

In light of this example, one would like to know
exactly for which sets of free data eqns [29]-[31] can
be solved, and for which sets they cannot. Since one
readily determines that every set of initial data which
satisfies the Einstein–Maxwell constraints [26]-[28]
can be obtained via the conformal method, such a
classification effectively provides a parametrization
of the space of solutions of the constraints.⁶

What we know and do not know about classifying
free data for the solubility of eqns [29]-[31] is
largely determined by whether or not the function $\tau$
is chosen to be constant on $\Sigma^3$. If $\tau$ is chosen to be
constant, then eqns [29]-[31] effectively decouple,
and the classification is essentially completely
known. Sets of initial data generated from free
data with constant $\tau$ are called “constant mean
curvature” (CMC) sets, since the mean curvature of
the initial slice embedded in its spacetime development
is given by $\tau$. We also know a considerable
amount about the classification if $|\nabla\tau|$ is sufficiently
small (“near CMC”), while virtually nothing is
known for the general non-CMC case.

A full account of the classification results known 
to date is beyond the scope of this article. Indeed, 
such an account must separately deal with a number 
of alternatives regarding manifold and asymptotic 
conditions (data on a closed manifold; asymptoti- 
cally Euclidean data; asymptotically hyperbolic 
data; data on an incomplete manifold with boundaries)
and regularity (analytic data, smooth data,
$C^k$ data, or data contained in various Hölder or
Sobolev spaces), among other things. We will,
however, now summarize some of the results; see, 
for example, Bartnik and Isenberg (2004) or 
Choquet-Bruhat for more complete surveys. 


CMC Data on Closed Manifolds 


Generalizing the $S^3$ example given above, we note
that for any set of free data $(\Sigma^3, \lambda_{ab}, \sigma_{ab}, \tau, \alpha_a, \mathcal{E}^a)$
with constant $\tau$ and with no conformal Killing
fields, eqn [29] is easily solved for $\xi$, and then eqn
[30] takes the form


$$\nabla_m (LW)^m{}_a = \epsilon_{amn}\,\mathcal{E}^m\beta^n \qquad [40]$$


⁶Of course, in claiming that appropriate sets of the free data
parametrize the space of solutions of the constraints, one needs to
determine if inequivalent sets of free data are mapped to the same
set of solutions. We discuss this below.


which is a linear elliptic PDE for $W^a$, with
invertible operator.⁷ This equation admits a unique
solution, and then the problem of solving the
constraints reduces to the analysis of the “Lichnerowicz
equation” [31].

To determine if this equation admits a solution for
the given set of free data, we use the following
classification criteria: (1) The metric is labeled
positive $Y^+(\Sigma^3)$, zero $Y^0(\Sigma^3)$, or negative $Y^-(\Sigma^3)$
Yamabe class depending upon whether the metric
$\lambda_{ab}$ on $\Sigma^3$ can be conformally deformed so that its
scalar curvature is everywhere positive, everywhere
zero, or everywhere negative.⁸ (2) The $(\sigma_{ab}, \alpha_a, \mathcal{E}^a)$
portion of the data is labeled either $\equiv$ or $\not\equiv$,
depending upon whether the quantity
$\sigma_{mn}\sigma^{mn} + \mathcal{E}^m\mathcal{E}_m + \beta^m\beta_m$ is identically zero, or not. (3) The
mean curvature $\tau$ is labeled “max” or “nonmax”
depending upon whether the constant $\tau$ is zero or
not. In terms of these criteria, we have 12 classes of
free data, and one can prove (Choquet-Bruhat and
York 1980, Isenberg 1995) the following:


• Solutions exist for the classes $(Y^+, \not\equiv, \text{max})$, $(Y^+, \not\equiv, \text{nonmax})$,
$(Y^0, \equiv, \text{max})$, $(Y^0, \not\equiv, \text{nonmax})$, $(Y^-, \equiv, \text{nonmax})$,
$(Y^-, \not\equiv, \text{nonmax})$, and

• Solutions do not exist for the classes $(Y^+, \equiv, \text{max})$, $(Y^+, \equiv, \text{nonmax})$,
$(Y^0, \equiv, \text{nonmax})$, $(Y^0, \not\equiv, \text{max})$, $(Y^-, \equiv, \text{max})$,
$(Y^-, \not\equiv, \text{max})$.


This classification is exhaustive, in the sense that 
every set of CMC data on a closed manifold fits 
neatly into exactly one of the classes. We note that 
the proofs of existence of solutions can generally be 
done using the sub-super solution technique, while 
the nonexistence results follow from application of 
the maximum principle. 
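The twelve-way classification is easy to mechanize as a lookup table. The sketch below encodes the existence results as we read them from the classification theorem cited above (Isenberg 1995); the table name, the function name, and the boolean encoding of the labels are our own illustrative choices.

```python
# Existence table for CMC free data on a closed manifold.
# Key: (Yamabe class, fields vanish identically?, tau == 0, i.e. "max"?)
# "fields vanish" means sigma^2 + E^2 + beta^2 is identically zero.
CMC_CLOSED_TABLE = {
    ("Y+", True,  True):  False,
    ("Y+", True,  False): False,
    ("Y+", False, True):  True,
    ("Y+", False, False): True,
    ("Y0", True,  True):  True,
    ("Y0", True,  False): False,
    ("Y0", False, True):  False,
    ("Y0", False, False): True,
    ("Y-", True,  True):  False,
    ("Y-", True,  False): True,
    ("Y-", False, True):  False,
    ("Y-", False, False): True,
}


def cmc_solution_exists(yamabe_class, fields_vanish, maximal):
    """Look up whether the Lichnerowicz equation is solvable for this class."""
    return CMC_CLOSED_TABLE[(yamabe_class, fields_vanish, maximal)]


# The 3-sphere example from the text: round metric (Y+), vanishing
# sigma, alpha and E, and tau = 1 (nonmaximal).
sphere_example = cmc_solution_exists("Y+", fields_vanish=True, maximal=False)
```

For the 3-sphere example the lookup returns `False`, in agreement with the maximum-principle argument given earlier.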


Maximal Asymptotically Euclidean Data 


Just as is the case for data on a closed manifold,
the constraint equations [29] and [30] decouple
from the Lichnerowicz equation [31] for asymptotically
Euclidean data with constant $\tau$. We note
that $\tau \neq 0$ is inconsistent with the data being


⁷A metric $\lambda$ has a conformal Killing field if the equation $LY = 0$
has a nontrivial solution $Y$. Geometrically, the existence of a
conformal Killing field $Y$ indicates that the flow of $(\Sigma^3, \lambda_{ab})$ along
$Y$ is a conformal isometry. While free data with nonvanishing
conformal Killing fields can be handled, for convenience we shall
stick to data without them here.

⁸Work on the Yamabe problem (Aubin 1998) shows that every
Riemannian metric on a closed manifold is contained in one and
only one of these classes. In fact, the Yamabe theorem (Schoen
1984) shows that every metric can be conformally deformed so
that its scalar curvature is $+1$, $0$, or $-1$, but this result is not
needed for the analysis of the constraint equations.


asymptotically Euclidean, so we restrict to the
maximal case, $\tau = 0$.

The criterion for solubility of the constraints in
conformal form for maximal asymptotically Euclidean
free data is quite a bit simpler to state than
that for CMC data on a closed manifold. It
involves the metric $\lambda$ only; the rest of the free
data is irrelevant. Specifically, as shown by Brill
and Cantor (with a correction by Maxwell (2005)),
a solution exists if and only if for every nonvanishing,
compactly supported, smooth function $f$ on $\Sigma^3$,
we have


$$\inf_{\{f \not\equiv 0\}} \frac{\int_{\Sigma^3}\left(|\nabla f|^2 + \tfrac18 R f^2\right)\sqrt{\det\lambda}}{\|f\|^2_{L^{2^*}}} > 0 \qquad [41]$$


Alternative Methods for Finding Solutions 
to the Constraint Equations 


While the conformal method has proved to be a
very useful tool for generating and analyzing
solutions of the Einstein constraint equations, it
does have some minor drawbacks: (1) The free data
is remote from the physical data, since the
conformal factor can vastly change the physical
scale on different regions of space. (2) While
casting the constraints into a determined PDE
form has the advantage of producing PDEs of a
relatively familiar (elliptic) form, one does give up
certain flexibilities inherent in an underdetermined
set of PDEs. (We expand upon this point below in
the course of discussing gluing.) (3) In choosing a
set of free data, one does have to first project out a
divergence-free vector field ($\mathcal{E}$) and a divergence-free
tracefree tensor field ($\sigma$). (4) While the choice
of CMC free data for the conformal method is
conformally covariant in the sense that conformally
related sets of CMC free data $(\Sigma^3, \lambda_{ab}, \sigma_{ab}, \tau, \alpha_a, \mathcal{E}^a)$
and $(\Sigma^3, \theta^4\lambda_{ab}, \theta^{-2}\sigma_{ab}, \tau, \alpha_a, \theta^{-6}\mathcal{E}^a)$ produce the
same physical solution to the constraints, this is
not the case for non-CMC free data.


Conformal Thin Sandwich 


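The last two of these problems can be removed by
modifying the conformal method in a way which
York (1999) has called the “conformal thin sandwich”
(CTS) approach. The basic idea of the CTS
approach is the same as that of the conformal
method. However, CTS free data sets are larger
(the divergence-free tracefree symmetric tensor field
$\sigma$ is replaced by a tracefree symmetric tensor field $U$,
and an extra scalar field $\eta$ is added), and after
solving the CTS constraint equations


$$\Delta\xi = 0 \qquad [42]$$
$$\nabla^m\left((2\eta)^{-1}(LY)_{ma}\right) = \tfrac23\,\phi^6\,\nabla_a\tau + \epsilon_{amn}\,\mathcal{E}^m\beta^n + \nabla^m\left((2\eta)^{-1}U_{ma}\right) \qquad [43]$$
$$\Delta\phi = \tfrac18 R\,\phi - \tfrac18\,(2\eta)^{-2}(-U^{mn} + LY^{mn})(-U_{mn} + LY_{mn})\,\phi^{-7} - \tfrac1{16}\,(\mathcal{E}^m\mathcal{E}_m + \beta^m\beta_m)\,\phi^{-3} + \tfrac1{12}\,\tau^2\phi^5 \qquad [44]$$


for the vector field $Y$ and the conformal factor $\phi$,
one obtains not just the full set of physical initial
data satisfying the constraint equations [26]-[28]


$$\gamma_{ab} = \phi^4\lambda_{ab} \qquad [45]$$
$$K_{ab} = \phi^{-2}(2\eta)^{-1}(-U_{ab} + LY_{ab}) + \tfrac13\,\phi^4\lambda_{ab}\,\tau \qquad [46]$$
$$A_a = \alpha_a \qquad [47]$$
$$E^a = \phi^{-6}(\mathcal{E}^a + \nabla^a\xi) \qquad [48]$$

but also the lapse $N$ and shift $X$

$$N = \phi^6\eta \qquad [49]$$
$$X^a = Y^a \qquad [50]$$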


Clearly, in using the CTS approach, one need not
project out a divergence-free part of a symmetric
tracefree tensor. One also readily checks that the
CTS method is conformally covariant in the sense
discussed above: the physical data generated from
CTS free data $(\lambda_{ab}, U_{ab}, \tau, \eta, \alpha_a, \mathcal{E}^a)$ and from data
$(\theta^4\lambda_{ab}, \theta^4 U_{ab}, \tau, \theta^6\eta, \alpha_a, \theta^{-6}\mathcal{E}^a)$ are the same.
Furthermore, since the mathematical form of eqns
[42]-[44] is very similar to that of [29]-[31], the
solvability results for the conformal method can be
essentially carried over to the CTS approach.

There is, however, one troubling feature of the
CTS approach. The problem arises if we seek CMC
initial data with the lapse function chosen so that
the evolving data continue to have CMC (such a
gauge choice is often used in numerical relativity). In
the case of the conformal method, after solving
[29]-[31] to obtain initial data $(\gamma_{ab}, K_{ab}, A_a, E^a)$
which satisfies the constraints, one achieves this by
proceeding to solve a linear homogeneous elliptic
PDE for the lapse function. One easily verifies that
solutions to this extra equation always exist. By
contrast, in the CTS approach, the extra equation
takes the form


A(’n) =} nR +4 (n) (U—LX) 
+4 (n) (& 8, 
+Y” Va 7-8 [51]


which is coupled to the system [42]-[44]. The
coupling is fairly intricate; hence little is known
about the existence of solutions to the system, and it
has been seen that there are problems with uniqueness.
Such problems of course do not arise if one
makes no attempt to preserve CMC.


The Quasispherical Ansatz and Parabolic Methods 


Applying either the conformal method or the CTS
approach to the constraint equations results in
systems of elliptic equations. Another approach,
pioneered by Bartnik (1993), produces instead
parabolic equations. In the simplest version of this
approach, known as the “quasispherical ansatz,”
one works on a manifold $\Sigma^3 = \mathbb{R}^3 \setminus B^3$, where $B^3$ is a
3-ball; one presumes that there exist coordinates
$(r, \theta, \phi)$ on $\Sigma^3$ in terms of which the metric takes the
“quasispherical” form


$$\gamma_{QS} = u^2 dr^2 + (r\,d\theta + \beta^\theta dr)^2 + (r\sin\theta\,d\phi + \beta^\phi dr)^2 \qquad [52]$$


for functions $u(r,\theta,\phi)$, $\beta^\theta(r,\theta,\phi)$, $\beta^\phi(r,\theta,\phi)$, and
then one attempts to satisfy the time-symmetric
constraint $R(\gamma_{QS}) = 0$ on $\Sigma^3$.⁹ Calculating the scalar
curvature for the metric in this form, one finds that
the equation $R(\gamma_{QS}) = 0$ can be written as


$$(r\partial_r - \beta^\theta\partial_\theta - \beta^\phi\partial_\phi)\,u - \tfrac12 u^2 \Delta u = Q(u, \beta^\theta, \beta^\phi, r, \theta, \phi) \qquad [53]$$


where $Q$ is a polynomial in the positive function $u$.
One can now show that if one specifies $\beta^\theta$ and $\beta^\phi$
everywhere on $\Sigma^3$ (subject to an upper bound on the
divergence of the vector field $(\beta^\theta, \beta^\phi)$), and if one
specifies regular initial data for $u$ on the inner
boundary of $\Sigma^3$, then one has a well-posed initial-value
problem (in terms of the “evolution” coordinate
$r$) for the parabolic PDE [53]. Ideally, one can
use this approach to extend solutions of the time-symmetric
constraints from an isolated region
(corresponding to $B^3$) out to spatial infinity.
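As a consistency check on the structure of the parabolic equation [53], consider the special case in which both shift-like functions vanish, $\beta^\theta = \beta^\phi = 0$. This simplification is ours and is not treated separately in the text. Taking the Laplacian to be that of the unit round sphere (an assumption about normalization), a direct computation of the scalar curvature gives the following, and the Schwarzschild time-symmetric slice appears as the angle-independent solution.

```latex
% Scalar curvature of \gamma = u^2 dr^2 + r^2 (d\theta^2 + \sin^2\theta\, d\phi^2):
R(\gamma) = \frac{2}{r^2}\left(1 - \frac{1}{u^2}\right) + \frac{4\,\partial_r u}{r\,u^3}
            - \frac{2}{r^2 u}\,\Delta_{S^2} u .
% Setting R(\gamma) = 0 and multiplying by r^2 u^3 / 4:
r\,\partial_r u - \tfrac12\,u^2\,\Delta_{S^2} u = \tfrac12\,(u - u^3).
% Spherically symmetric check: u = (1 - 2m/r)^{-1/2} gives
r\,\partial_r u = -\frac{m}{r}\,u^3 = \tfrac12\,(u - u^3).
```

The right-hand side is a polynomial in $u$, and the equation is parabolic in $r$ because the coefficient $\tfrac12 u^2$ of the angular Laplacian is positive.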

The basic quasispherical ansatz approach just 
outlined can be generalized significantly (Sharples 
2001, Bartnik and Isenberg 2004) to allow for more 
general spatial metrics, and to allow nonzero 
$K_{ab}$, $A_a$, and $E^a$. It has been an especially valuable
tool for the study of mass in asymptotically 
Euclidean data sets. It does not, however, purport 
to construct general solutions of the constraint 
equations. 


"This version of the constraints is called *time symmetric" since 
one is solving the full set of constraints with $K_{ab}$ assumed to be
zero. Data with $K_{ab} = 0$ is time symmetric.


Gluing Solutions of the Constraint Equations


Starting around the year 2000, a number of new 
“gluing” procedures have been developed for con- 
structing and studying solutions of the constraint 
equations. Unlike the conformal method, the CTS 
method, and the quasispherical ansatz, all of which 
construct solutions from scratch, the gluing proce- 
dures construct new solutions from given ones. This 
feature, and the considerable flexibility of the 
procedures, has resulted in a wealth of applications 
already in the short five-year history of gluing in 
general relativity. 

One of the gluing approaches, developed by
Corvino (2000) and Corvino and Schoen (preprint)
(see also Chruściel and Delay (2002)), allows one to
choose a compact region $\Omega$ in almost any smooth,
asymptotically Euclidean vacuum solution of the
constraints, and from this produce a new smooth
solution which is completely unchanged in the
region $\Omega$ and is identical to Schwarzschild or Kerr
outside some larger region. In proving this result,
one exploits the underdetermined character of the
constraint equations: such a construction could not
be carried out if the constraints were a determined
PDE system.¹⁰

The other main gluing approach, developed first by
Isenberg et al. (2001), and then further developed with
Chruściel (Chruściel et al. 2005) and with Maxwell
(Isenberg et al. 2005), starts with a pair of solutions of
the (vacuum) constraints $(\Sigma^3_1, \gamma_1, K_1)$ and $(\Sigma^3_2, \gamma_2, K_2)$
together with a choice of a pair of points $p_1 \in \Sigma^3_1$, $p_2 \in \Sigma^3_2$,
one from each solution. From these solutions, this
gluing procedure produces a new set of initial data
$(\Sigma^3_{(1-2)}, \gamma_{(1-2)}, K_{(1-2)})$ with the following properties:
(1) $\Sigma^3_{(1-2)}$ is diffeomorphic to the connected sum
$\Sigma^3_1 \# \Sigma^3_2$;¹¹ (2) $(\Sigma^3_{(1-2)}, \gamma_{(1-2)}, K_{(1-2)})$ is a solution of the
constraints everywhere on $\Sigma^3_{(1-2)}$; (3) On that portion
of $\Sigma^3_{(1-2)}$ which corresponds to $\Sigma^3_1 \setminus \{\text{ball around } p_1\}$,
the data $(\gamma_{(1-2)}, K_{(1-2)})$ is isomorphic to $(\gamma_1, K_1)$, with
a corresponding property holding on that portion of
$\Sigma^3_{(1-2)}$ which corresponds to $\Sigma^3_2 \setminus \{\text{ball around } p_2\}$ (see
Figure 3).

This connected sum gluing can be carried out for 
very general sets of initial data. The sets can be 
asymptotically Euclidean, asymptotically hyperbolic, 
specified on a closed manifold, or indeed anything 


"Hence if one tries to do Corvino Schoen-type gluing using a 
fixed conformal geometry, the gluing fails because the determined 
elliptic system satisfies the unique continuation property. 

¹¹The connected sum of the two manifolds (see property (1)) is
constructed as follows: first we remove a ball from each of the
manifolds $\Sigma^3_1$ and $\Sigma^3_2$. We then use a cylindrical bridge $S^2 \times I$
(where $I$ is an interval in $\mathbb{R}^1$) to connect the resulting $S^2$
boundaries on each manifold.


Figure 3 Connected sum gluing.


else. The only condition that the data sets must
satisfy is that, in sufficiently small neighborhoods of
each of the points at which the gluing is to be done,
there do not exist nontrivial solutions $\xi$ to the
equation $D\Phi^*_{(\gamma,K)}\xi = 0$, where $D\Phi^*_{(\gamma,K)}$ is the operator
obtained by taking the adjoint of the linearized
constraint operator.¹² In work by Beig, Chruściel,
and Schoen, it is shown that this condition (sometimes
referred to as “No KIDs,” meaning “no
(localized) Killing initial data”) is indeed generically
satisfied.

While a discussion of the proof that connected 
sum gluing can be carried out to this degree of 
generality is beyond the scope of this paper (see 
Chruściel et al. (2005), along with references cited
therein for details of the proof), we note three 
features of it: first, the proof is constructive in the 
sense that it outlines a systematic, step-by-step 
mathematical procedure for doing the gluing. In 
principle, one should be able to carry out the gluing 
procedure numerically. Second, connected sum glu- 
ing relies primarily on the conformal method, but it 
also uses a nonconformal deformation at the end 
(dependent on the techniques of Corvino and 
Schoen, and of Chruściel and Delay), so as to
guarantee that the glued data is not just very close to 
the given data on regions away from the bridge, but 
is indeed identical to it. Third, while Corvino- 
Schoen gluing has not yet been proved to work for 
solutions of the constraints with source fields, 
connected sum gluing (up to the last step, which 
relies on Corvino-Schoen) has been shown to work 
for most matter source fields of interest (Isenberg 
et al.). It has also been shown to work for general 
dimensions greater than or equal to three. 


¹²When a solution to this equation does exist on some region
$\Lambda \subset \Sigma^3$, it follows from the work of Moncrief that the spacetime
development of the data on $\Lambda$ admits a nontrivial isometry.


While gluing is not an efficient tool for studying 
the complete set of solutions to the constraints, it 
has proved to be very valuable for a number of 
applications. We note a few here. 


1. Spacetimes with regular asymptotic structure. 
Until recently, it was not known whether there 
is a large class of solutions which admit the 
conformal compactification and consequent 
asymptotically simple structure at null and space- 
like infinity characteristic of the Minkowski and 
Schwarzschild spacetimes. Using Corvino-Schoen 
gluing, together with Friedrich’s analyses of 
spacetime asymptotic structures and an argument 
of Chruściel and Delay (2002), one produces
such a class of solutions. 

2. Multi-black hole data sets. Given an asymptoti- 
cally Euclidean solution of the constraints, con- 
nected sum gluing allows a sequence of (almost) flat 
space initial data sets to be glued to it. The bridges 
that result from this gluing each contain a minimal 
surface, and consequently an apparent horizon. 
With a bit of care, one can do this in such a way 
that indeed the event horizons which appear in the 
development of this glued data are disjoint, and 
therefore indicative of independent black holes. 

3. Adding a black hole to a cosmological spacetime.
Although there is no clear established
definition for a black hole in a spatially compact 
solution of Einstein's equations, one can glue an 
asymptotically Euclidean solution of the constraints 
to a solution on a compact manifold, in such a way 
that there is an apparent horizon on the bridge. 
Studying the nature of these solutions of the 
constraints, and their evolution, could be useful in 
trying to understand what one might mean by a 
black hole in a cosmological spacetime. 

4. Adding a wormhole to your spacetime. While 
we have discussed connected sum gluing as a 
procedure which builds solutions of the con- 
straints with a bridge connecting two points on 
different manifolds, it can also be used to build a 
solution with a bridge connecting a pair of points 
on the same manifold. This allows one to do the 
following: if one has a globally hyperbolic 
spacetime solution of Einstein's equations, one 
can choose a Cauchy surface for that solution, 
choose a pair of points on that Cauchy surface, 
and glue the solution to itself via a bridge from 
one of these points to the other. If one now 
evolves this glued-together initial data into a 
spacetime, it will likely become singular very 
quickly because of the collapse of the bridge. 
Until the singularity develops, however, the 
solution is essentially as it was before the gluing,
with the addition of an effective wormhole.
Hence, this procedure can be used to glue a 
wormhole onto a generic spacetime solution. 

5. Removing topological obstructions for constraint
solutions. We know that every closed three-dimensional
manifold $M^3$ admits a solution of
the vacuum constraint equations. To show this,
we use the fact that $M^3$ always admits a metric $\Gamma$
of constant negative scalar curvature (normalized so that
$R(\Gamma) = -6$). One easily
verifies that the data $(\gamma = \Gamma, K = \Gamma)$ is a CMC
solution. Combining this result with connected
sum gluing, one can show that for every closed
$\Sigma^3$, the manifold $\Sigma^3 \setminus \{p\}$ admits both an asymptotically
Euclidean and an asymptotically hyperbolic
solution of the vacuum constraint equations.

6. Proving the existence of vacuum solutions
on closed manifolds with no CMC Cauchy
surface. Based on the work of Bartnik (1988)
one can show that if one has a set of initial data
on the manifold $T^3 \# T^3$ with the metric components
symmetric across a central sphere and the
components of $K$ skew symmetric across that
same central sphere, then the spacetime development
of that data does not admit a CMC Cauchy
surface. Using connected sum gluing, one can
show that indeed initial data sets of this sort exist
(Chruściel et al. 2005).


Conclusion 


Much is known about the Einstein constraint 
equations and those sets of initial data which satisfy 
them. We know how to use the conformal method 
or the CTS approach to construct (and parametrize 
in terms of free data) the CMC and near CMC sets 
of data which solve the constraints, with or without 
matter fields present. We know how to use the 
quasispherical approach to explore extensions of 
solutions of the constraint equations from compact 
regions. We know how to use gluing techniques to 
produce new solutions of both physical and math- 
ematical interest from old ones, and we know how 
to use gluing as a tool for proving such results as the 
existence of vacuum spacetimes with no CMC 
Cauchy surfaces. 

There is much that is not yet known as well. Very 
little is known about solutions of the constraint 
equations which have neither CMC nor near CMC. It 
is not known how to systematically extend solutions of 
the constraints from a compact region to all of $\mathbb{R}^3$ in
such a way that the extension is asymptotically 
Euclidean (unless we know a priori that such an 
extension exists). Very little is known regarding how 
to control the constraints during the course of
numerical evolution of solutions.¹³ Most importantly,
we do not yet know how to systematically find 
solutions of the constraint equations which serve as 
physically realistic model initial data sets for studying 
astrophysical and cosmological systems of interest. 

Many of these questions concerning the Einstein 
constraints and their solutions are fairly daunting. 
However, in view of the rapid progress in our 
understanding during the last few years, and in view 
of the pressing need to further develop the initial- 
value formulation as a tool for studying general 
relativity and gravitational physics, we are optimis- 
tic that this progress will continue, and we will soon 
have answers to a number of these questions.


originally appeared in relativity, but it is of
tremendous interest from the point of view of pure 
mathematics. Demanding a metric of constant 
sectional curvature is a very strong condition, 
while metrics of constant scalar curvature always 
occur. The Einstein property, which is essentially a 
constant-Ricci-curvature condition, occupies an 
intermediate position between these conditions, and 
it is still not clear exactly how strong it is. In 


dimensions higher than four, it is still unknown 
whether there are obstructions to a manifold 
admitting an Einstein metric. 

The study of Einstein manifolds is a vast and 
rapidly expanding area, and this article can merely 
touch on some points of particular interest. The 
focus of the article is very much on the Riemannian 
rather than Lorentzian case (see, e.g., Hawking and 
Ellis (1973) or the articles by Christodoulou and 
Tod in LeBrun and Wang (1999) for a discussion of 
the Lorentzian case in general relativity). For further 
reading, the books of Besse (1987) and LeBrun and 
Wang (1999) are strongly recommended. 


Basic Properties 


Let (M,g) be a (pseudo)-Riemannian manifold. 
There is a unique connection $\nabla$, the Levi-Civita
connection of $g$, with the following properties:


1. the torsion $T(X,Y) = \nabla_X Y - \nabla_Y X - [X,Y]$
vanishes and
2. $\nabla g = 0$


We can now form the Riemann curvature tensor 
of g: 


$$R(X,Y)Z = \nabla_X\nabla_Y Z - \nabla_Y\nabla_X Z - \nabla_{[X,Y]}Z$$


This is a type (3,1) tensor. There is one nontrivial 
contraction we can perform to obtain a (2, 0) tensor, 
that is, the Ricci curvature 


$$\mathrm{Ric}(X,Y) = \mathrm{tr}\,(Z \mapsto R(X,Z)Y)$$


We may perform a further contraction and obtain
the scalar curvature $s = \mathrm{tr}_g\,\mathrm{Ric}$.

The Ricci curvature is a symmetric tensor of the 
same type as the metric, so we can make the 
following definition: 


Definition 1 A metric g is Einstein if
$$\mathrm{Ric} = \lambda g \qquad [1]$$
for some constant $\lambda$.


In this article, we shall take g to be a Riemannian 
(positive-definite) metric. 


Remark 1 In dimension higher than 2, we do not
have to put in the assumption that $\lambda$ is constant by
hand. For, taking the divergence of [1] gives
$\frac{1}{2} ds = d\lambda$, while taking instead the trace gives
$s = n\lambda$, so if $n \neq 2$, we see $d\lambda = 0$.
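The computation behind Remark 1 can be spelled out as follows. This is a sketch, writing $\lambda$ for the Einstein constant in [1] (allowed a priori to be a function on M), n for the dimension of M, and assuming the contracted second Bianchi identity in the form shown on the first line.

```latex
\begin{align*}
\operatorname{div}\mathrm{Ric} &= \tfrac{1}{2}\,ds
  && \text{(contracted second Bianchi identity)} \\
\operatorname{div}(\lambda g) &= d\lambda
  && \text{(since } \nabla g = 0\text{)} \\
\Rightarrow\quad \tfrac{1}{2}\,ds &= d\lambda
  && \text{(divergence of [1])} \\
s = n\lambda \ &\Rightarrow\ ds = n\,d\lambda
  && \text{(trace of [1])} \\
\Rightarrow\quad \bigl(\tfrac{n}{2} - 1\bigr)\,d\lambda &= 0 .
\end{align*}
```

So for $n \neq 2$ the function $\lambda$ is locally constant, hence constant on a connected manifold.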


Remark 2 In dimensions 2 and 3, the Einstein
condition is equivalent to constant curvature. The
only complete Einstein manifolds in these dimensions
are therefore the model spaces $S^n$, $\mathbb{R}^n$ and
hyperbolic space, and quotients of these by discrete
groups of isometries.


Remark 3 As noticed by Hilbert, the Einstein 
equations admit a variational interpretation. They 
are the variational equations for the total scalar 
curvature functional 


$$g \mapsto \int_M s_g \, d\mu_g$$


restricted to the space of volume 1 metrics (here $d\mu_g$
denotes the volume form defined by g).


Obstructions 


The most fundamental question we can ask is:
Given a smooth manifold M, does it support an
Einstein metric?

One is also interested in the question of unique- 
ness of such a metric, or more generally of 
describing the moduli space of such metrics. 

In this section we discuss obstructions to existence. 
In dimension 2, Remark 2 shows that any compact
manifold admits an Einstein metric, while in dimension
3 the only possibilities are space forms. In
particular, there is no Einstein metric on $S^1 \times S^2$.

The picture is much less clear in higher dimensions.
If $\lambda \geq 0$, one obtains some elementary
obstructions just by considering the sign of the 
Ricci curvature: 


1. If M supports a complete Einstein metric with
$\lambda > 0$, then by Myers's theorem M is compact
and $\pi_1(M)$ is finite. Also there are obstructions
coming from the positivity of the scalar curvature
(e.g., if M is spin and 4m-dimensional, then the $\hat{A}$
genus vanishes).

2. If M supports a complete Ricci-flat metric, then
every finitely generated subgroup of $\pi_1(M)$ has
polynomial growth.


However, if $\dim M \geq 5$, there is, at the time of
writing, no known obstruction to M supporting an
Einstein metric of negative Einstein constant.

In the borderline dimension 4, Hitchin and Thorpe
observed that the Einstein condition puts topological
constraints on the manifold. For we have the
following expressions for the Euler characteristic $\chi$
and signature $\tau$ in terms of the curvature tensor:


$$\tau = \frac{1}{12\pi^2} \int_M \left( |W_+|^2 - |W_-|^2 \right) d\mu$$


$$\chi = \frac{1}{8\pi^2} \int_M \left( |W_+|^2 + |W_-|^2 - \frac{1}{2}|\mathrm{Ric}_0|^2 + \frac{s^2}{24} \right) d\mu$$

where $W_+$ and $W_-$ are the self-dual and anti-self-dual
parts of the Weyl tensor, s is the scalar
curvature, and $\mathrm{Ric}_0$ is the trace-free part of the Ricci
tensor.

The Einstein condition is just $\mathrm{Ric}_0 = 0$, so we
immediately obtain the following inequality.


Theorem (Hitchin 1974). A compact four-
dimensional Einstein manifold satisfies the inequality


$$|\tau| \leq \frac{2}{3}\chi$$
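The step from the two integral formulas to the inequality can be sketched as follows. The block restates the standard four-dimensional expressions for the Euler characteristic $\chi$ and signature $\tau$ in the normalization assumed here, with $d\mu$ the volume form, and combines them.

```latex
\begin{align*}
2\chi \pm 3\tau
 &= \frac{1}{4\pi^2}\int_M \Bigl( |W_+|^2 + |W_-|^2
      - \tfrac{1}{2}|\mathrm{Ric}_0|^2 + \tfrac{s^2}{24} \Bigr)\, d\mu
    \;\pm\; \frac{1}{4\pi^2}\int_M \bigl( |W_+|^2 - |W_-|^2 \bigr)\, d\mu \\
 &= \frac{1}{4\pi^2}\int_M \Bigl( 2|W_\pm|^2
      - \tfrac{1}{2}|\mathrm{Ric}_0|^2 + \tfrac{s^2}{24} \Bigr)\, d\mu .
\end{align*}
% With Ric_0 = 0 the integrand is non-negative, so 2\chi \pm 3\tau \ge 0,
% that is, |\tau| \le \tfrac{2}{3}\chi.
```

Equality for one choice of sign forces both $s = 0$ and the vanishing of the corresponding half of the Weyl tensor, which is the equality case described in the text.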


Note that equality is obtained if and only if g is 
Ricci-flat and (anti)-self-dual, which is equivalent to 
locally hyper-Kähler for some orientation. The only
examples are the flat torus, the K3 surface with the
Yau metric (now $\tau = 16$ and $\chi = 24$), and two
quotients of K3. 

Since the mid-1990s, LeBrun (2003) has obtained 
a series of results which sharpen the Hitchin-Thorpe
inequality by obtaining estimates on the Weyl and
scalar curvature terms. These estimates are obtained
by using Seiberg-Witten theory, the general theme
being that nonemptiness of the Seiberg-Witten
moduli space gives lower bounds on the curvature 
terms. LeBrun shows there are infinitely many 
compact smooth simply connected 4-manifolds 
that satisfy the Hitchin-Thorpe inequality but 
nonetheless do not admit Einstein metrics. 


Uniqueness and Moduli 


In Yang-Mills theory, there is a highly developed 
theory of moduli spaces of instantons, including 
formulas for the dimension. The situation for 
Einstein metrics is far less well understood. The 
relevant moduli space here is the set of Einstein 
metrics modulo the action of the diffeomorphism 
group, but there are very few manifolds for which 
the moduli space has been determined. In dimension 
2, of course, this is essentially the subject of
Teichmüller theory.

One example where the moduli space is under- 
stood is the K3 surface. As explained above, the 
Hitchin-Thorpe argument shows that any Einstein 
metric is hyper-Kähler, and the moduli space of such
structures on K3 is understood as an open set in a 
certain noncompact symmetric space. 

Some uniqueness results have been obtained in 
four dimensions. LeBrun used Seiberg-Witten techniques
to show that the Einstein metric on a
compact quotient of the complex hyperbolic plane
$\mathbb{C}H^2$ is unique up to homotheties and diffeomorphisms.
The analogous result for compact quotients of
real hyperbolic 4-space was obtained using entropy
methods by Besson, Courtois, and Gallot. It is still
unknown, however, whether nonstandard Einstein
metrics can exist on $S^4$.

In higher dimensions, very little is known. One can, 
by analogy with the theory of instantons, consider the 
linearization of the Einstein equations together with a 
further linear equation expressing orthogonality to 
the orbits of the diffeomorphism group. This gives a 
notion of formal tangent space to the Einstein moduli 
space. However, Koiso has shown that formal 
tangent vectors need not integrate to a curve of 
Einstein metrics. The structure of the moduli space 
(dimension, possible singularities) remains quite 
mysterious in general. It is known from the Wang- 
Ziller torus bundle examples that the moduli space 
can have infinitely many components. 


Special Holonomy 


Berger classified the possible holonomy groups of 
simply connected, irreducible, nonsymmetric n- 
dimensional Riemannian manifolds. The generic 
case is that of holonomy SO(n), and there are six
other possibilities, each of which corresponds to
some special geometry. Interestingly, four of these
are automatically Ricci-flat, while a fifth is Einstein
with $\lambda \neq 0$. The remaining example, that of Kähler
geometry, is not automatically Einstein, but the
Einstein equations with the additional Kähler
assumption reduce to a scalar Monge-Ampère
equation and are therefore simpler than the general 
Einstein system. 

For further reading in this section, see the articles 
by Boyer-Galicki, Joyce, Salamon, Tian, Yau and 
the author in part I of LeBrun and Wang (1999), 
and also the book of Joyce (2000). For the Kähler
case, see also Tian (2000). 


Kähler Manifolds (Holonomy U(n/2), SU(n/2))


A Kähler manifold (M,g) admits a covariant
constant complex structure I, and associated
Kähler 2-form $\omega$ defined by $\omega(X, Y) = g(IX, Y)$.
The Ricci form $\rho$ is defined by $\rho(X, Y) =
\mathrm{Ric}(IX, Y)$, so the Einstein condition for a Kähler
manifold becomes


$$\rho = \lambda\omega$$


On a Kähler manifold, $\rho$ is the curvature of the
canonical bundle, so $\rho/2\pi$ is a representative for
the cohomology class $c_1(M)$.
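In cohomology, the Kähler-Einstein equation reads as follows. This is a short sketch, writing $\rho$ for the Ricci form, $\omega$ for the Kähler form and $\lambda$ for the Einstein constant.

```latex
\begin{align*}
\rho = \lambda\,\omega
\quad\Longrightarrow\quad
2\pi\, c_1(M) = [\rho] = \lambda\,[\omega] .
\end{align*}
% [\omega] is a Kahler class, hence positive, so c_1(M) is
% negative, zero, or positive according to the sign of \lambda.
```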

We see that a necessary condition for a complex
manifold (M, I) to admit a Kähler-Einstein metric is
that $c_1$ has a definite sign. We consider, in turn, the
three cases:


$c_1 < 0$
In this case, we have:


Theorem (Aubin, Yau). Let (M,I) be a compact
complex manifold with $c_1 < 0$. Then (M, I) admits
a Kähler-Einstein metric with $\lambda < 0$. The metric is
unique up to homothety.


$c_1 = 0$
This is a special case of the Calabi conjecture, 
proved by Yau. 


Theorem (Yau). Let M be a compact Kähler
manifold with Kähler form $\omega$. For any closed real
form $\rho$ of type (1,1) with $[\rho/2\pi] = c_1(M)$, there
exists a unique Kähler metric with Kähler form
cohomologous to $\omega$ and Ricci form equal to $\rho$.

In particular, if M is a compact Kähler manifold
with $c_1 = 0$, there exists a Ricci-flat Kähler metric
on M. 


Ricci-flat Kähler metrics are called Calabi-Yau
metrics, and are exactly the metrics with holonomy 
in SU(n/2). They admit two parallel spinors and are 
of great interest to string theorists, because in some 
string theories spacetime is expected to be a product 
of the four-dimensional macroscopic factor with a 
compact Calabi-Yau manifold of complex dimen- 
sion 3. 

Yau's theorem provides many examples of 
Calabi-Yau spaces. For example, we can take a 
nonsingular complex submanifold defined as a 
complete intersection by the vanishing of r polynomials
of degree $d_1, \ldots, d_r$ in $\mathbb{CP}^n$. Now, M has
complex dimension n - r and $c_1 = 0$ if and only if
$n + 1 = \sum_{i=1}^{r} d_i$. We obtain examples of complex
dimension 2 by considering a quartic in $\mathbb{CP}^3$, the
intersection of a quadric and a cubic in $\mathbb{CP}^4$, or the
intersection of three quadrics in $\mathbb{CP}^5$; these all give
examples of K3 surfaces. A famous example of a
Calabi-Yau manifold of complex dimension 3 is
given by the quintic in $\mathbb{CP}^4$. This technique can be
extended, for example, by considering complete 
intersections in weighted projective space or con- 
structing Calabi-Yau desingularizations of singular 
spaces. 
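The degree criterion for complete intersections comes from the adjunction formula. As a sketch, writing h for the hyperplane class of $\mathbb{CP}^n$ (a symbol introduced here for illustration), the examples above can be checked one by one:

```latex
\begin{align*}
c_1(M) &= \Bigl( n + 1 - \sum_{i=1}^{r} d_i \Bigr)\, h|_M , \\[4pt]
\text{quartic in } \mathbb{CP}^3 &: \quad 4 = 4, \quad \dim_{\mathbb{C}} M = 3 - 1 = 2, \\
\text{quadric} \cap \text{cubic in } \mathbb{CP}^4 &: \quad 5 = 2 + 3, \quad \dim_{\mathbb{C}} M = 4 - 2 = 2, \\
\text{three quadrics in } \mathbb{CP}^5 &: \quad 6 = 2 + 2 + 2, \quad \dim_{\mathbb{C}} M = 5 - 3 = 2, \\
\text{quintic in } \mathbb{CP}^4 &: \quad 5 = 5, \quad \dim_{\mathbb{C}} M = 4 - 1 = 3 .
\end{align*}
```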


$c_1 > 0$

This case is the most complicated and, at the time of 
writing, is not yet fully understood. It is known that 
not every compact manifold with $c_1 > 0$ supports a
Kähler-Einstein metric.

An early result of Matsushima was that the identity
component of the automorphism group of a Kähler-
Einstein space with $c_1 > 0$ must be reductive.
This shows, for example, that the blow-up of $\mathbb{CP}^2$ at
one or two points does not admit a Kähler-Einstein
metric, despite having $c_1 > 0$. (The one-point blow-up
does admit a Hermitian-Einstein metric due to
Page.) A second obstruction is the Futaki invariant, a
character of the Lie algebra of the automorphism
group. This character vanishes if there is a Kähler-
Einstein metric.

Both the above obstructions depend on having a 
nontrivial algebra of holomorphic automorphisms of 
M. More recently, Tian has discovered further 
obstructions (in complex dimension 3 or higher) 
which can be present even if the automorphism 
algebra is trivial. 

However, for compact complex surfaces with
$c_1 > 0$, Tian has proved that vanishing of the
Futaki invariant is sufficient. In particular, the
blow-up of $\mathbb{CP}^2$ at k points in general position,
where $3 \leq k \leq 8$, admits a Kähler-Einstein metric
(note that $c_1^2 = 9 - k$, so if k > 8 then $c_1$ is no longer
definite).

LeBrun-Catanese and Kotschick used these results
to give an example of a topological 4-manifold
carrying Einstein metrics of different signs. A
deformation of the Barlow surface (a surface of
general type) has $c_1 < 0$ and hence carries an
Einstein metric with $\lambda < 0$. But this space is
homeomorphic (though not diffeomorphic) to the blow-up
of $\mathbb{CP}^2$ at eight points, which carries an Einstein
metric with $\lambda > 0$. One may use this example
to construct higher-dimensional examples of
diffeomorphic manifolds carrying Einstein metrics of
opposite sign.


Hyper-Kähler Manifolds (Holonomy Sp(n/4))


These are always Ricci-flat. They have a triple
(I, J, K) of covariant constant complex structures,
satisfying the quaternionic multiplication relations
IJ = K = -JI, etc., and defining Kähler forms
$\omega_I, \omega_J, \omega_K$. Hyper-Kähler manifolds of dimension
n = 4N have N + 1 parallel spinors.

The most effective way of producing complete
hyper-Kähler metrics has been the hyper-Kähler
quotient construction (Hitchin et al. 1987), which
was motivated by the Marsden-Weinstein quotient
in symplectic geometry. Let G be a group acting
freely on a hyper-Kähler manifold (M, g, I, J, K)
preserving the hyper-Kähler structure. Subject to
mild assumptions, we obtain a G-equivariant
moment map $\mu : M \to \mathfrak{g}^* \otimes \mathbb{R}^3$, satisfying


$$d\mu_X(Y) = (\omega_I(X, Y), \omega_J(X, Y), \omega_K(X, Y))$$


Now the quotient $\mu^{-1}(0)/G$ is a hyper-Kähler
manifold of dimension dim M - 4 dim G.
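The dimension of the quotient can be counted as follows. This is a sketch that assumes 0 is a regular value of the moment map $\mu$ (taken here to be part of the mild assumptions) and that G is a finite-dimensional Lie group acting freely; the final line is an illustrative instance with flat quaternionic space $\mathbb{H}^N$ and a circle group.

```latex
\begin{align*}
\dim \mu^{-1}(0) &= \dim M - 3\dim G
  && \text{($\mu$ takes values in } \mathfrak{g}^* \otimes \mathbb{R}^3\text{)} \\
\dim \mu^{-1}(0)/G &= \dim M - 3\dim G - \dim G = \dim M - 4\dim G
  && \text{($G$ acts freely)} \\
M = \mathbb{H}^N,\ G = U(1) &: \quad \dim \mu^{-1}(0)/G = 4N - 4 .
\end{align*}
```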
The power of this construction comes from the
fact that even if M is just flat quaternionic space, 
one can obtain highly nontrivial quotients by 
suitable choice of group G (e.g., the asymptotically 
locally Euclidean four-dimensional examples of 
Kronheimer, which include as a subcase the multi- 
instanton metrics of Gibbons and Hawking). 

Many examples of interest in mathematical 
physics may be obtained by taking hyper-Kähler
quotients of an infinite-dimensional space of connections
and Higgs fields (Hitchin 1987). Examples
include moduli spaces of instantons over a hyper-Kähler
base, moduli spaces of monopoles on $\mathbb{R}^3$,
and moduli spaces of Higgs pairs over a Riemann 
surface. 

The hyper-Kähler manifolds produced so far by
the quotient construction have all been noncompact.
Examples of compact hyper-Kähler manifolds are
rarer but some are known. Beauville has produced 
examples in all dimensions as desingularizations of 
symmetric products of the basic four-dimensional 
compact examples (K3 and the 4-torus). 

Further material for this section may be found, for 
example, in Hitchin (1992) and in the chapter by the 
author on hyper-Kähler manifolds in LeBrun and
Wang (1999). 


Quaternionic Kähler Manifolds (Holonomy Sp(n/4) Sp(1))


These are always Einstein with nonzero Einstein
constant. Instead of globally defined parallel complex
structures as in the hyper-Kähler case, we have
a sub-bundle G of End(TM) with fiber isomorphic to
the imaginary quaternions, parallel with respect to
the Levi-Civita connection. Thus, we have locally
defined almost-complex structures I, J, K, satisfying
the quaternionic multiplication relations, such that
covariant differentiation of one of I, J, K gives a
linear combination of the other two. In particular,
note that quaternionic Kähler manifolds are not
Kähler.

If the Einstein constant $\lambda$ is positive, the only
known complete examples are symmetric, the so-called
compact Wolf spaces, which are in one-to-one
correspondence with the compact simple Lie groups.
It is conjectured that these are the only examples
with $\lambda > 0$, and some results in this direction have
been established (e.g., it is known if $\dim M \leq 12$). It
is also known that for fixed dimension, there are
only finitely many types of compact quaternionic
Kähler manifold with $\lambda > 0$.

Many orbifold examples, however, are known to 
exist, for example, via the Galicki-Lawson quaternionic
Kähler quotient construction.


If $\lambda < 0$, more complete examples are known. In
addition to the noncompact duals of the Wolf 
spaces, there are homogeneous, nonsymmetric 
examples due to Alekseevski, and infinite-dimen- 
sional families of inhomogeneous examples con- 
structed via twistor methods by LeBrun (see also 
Biquard (2000)). 


Exceptional Holonomy ($G_2$ or Spin(7))


Such metrics exist in dimension 7 or 8, respectively. 
They are always Ricci-flat and admit a parallel 
spinor. Local examples were constructed by Bryant 
using Cartan-Kähler theory, and some explicit
complete noncompact examples were produced by
Salamon and Bryant using a cohomogeneity-1
construction. More complicated explicit noncompact
examples have recently been produced by
several authors (see Cvetič et al. (2003) for a
survey). Compact examples were produced using 
analytical methods by Joyce, and later by Kovalev. 
Joyce starts with a flat singular metric on quotients 
of the seven- or eight-dimensional torus and con- 
structs an approximate solution to the special 
holonomy condition on a resolution of this singular 
space. Then an analytic argument is used to show 
that an exact nearby solution exists. 

For further reading, consult Joyce (2000) as well 
as the article by Joyce in LeBrun and Wang 
(1999). 

There are also some interesting examples of 
Einstein metrics which, although not of special 
holonomy themselves, are closely related to special 
holonomy geometries. In recent years, these have 
yielded many new examples of compact Einstein 
manifolds in the work of Boyer, Mann, Galicki, 
Kollar, Rees, Piccinni, and Nakamaye. 


Einstein-Sasaki Structures 


There are several different ways of defining these, 
but the simplest is to say that (M,g) is Einstein-
Sasaki if the cone $(\mathbb{R}_+ \times M, dt^2 + t^2 g)$ is Ricci-flat
Kähler. Also, an Einstein-Sasaki manifold has a
circle action with quotient a Kähler-Einstein orbifold.
Existence theorems for such orbifold metrics
have led to many examples of Einstein-Sasaki 
metrics, including families on odd-dimensional 
spheres. 


3-Sasakian Structures 


Again, we can define these in terms of cones; (M, g) 
has a 3-Sasakian structure if the cone over it is
hyper-Kähler. The basic example is $S^{4n+3}$ with
associated cone $\mathbb{H}^{n+1} \setminus \{0\}$. A 3-Sasakian manifold is
always Einstein with positive Einstein constant. 


The hyper-Kähler quotient construction induces
a 3-Sasakian quotient, and many examples of
compact 3-Sasakian manifolds have been produced
as 3-Sasakian quotients of $S^{4n+3}$. In particular, there
are examples in dimension 7 with arbitrarily large 
second Betti number, showing that one cannot, in 
general, expect compactness/finiteness results for 
Einstein moduli spaces without further assumptions. 


Homogeneous Examples 


Another strategy to study the Einstein equations is 
to reduce the difficulty of the problem by imposing 
symmetries. More precisely, we consider Einstein 
manifolds (M, g) with an isometric action of a Lie
group G. In general, the Einstein equations with this
symmetry will now involve r independent variables
where r is the dimension of the stratified space
M/G. We call r the cohomogeneity of the manifold.

In this section, we consider the situation where 
(M, g) is homogeneous, that is, when the action of G
is transitive so r=0. The Einstein equations now 
reduce to a system of algebraic equations. 

We may now write M = G/K, where K is the
stabilizer of a point of M. We choose an $\mathrm{Ad}_K$-
invariant vector space complement $\mathfrak{p}$ to $\mathfrak{k}$ in $\mathfrak{g}$, and
identify $\mathfrak{p}$ with the tangent space to G/K at the
identity coset. The key point is that G-invariant
metrics on M = G/K may now be identified with
$\mathrm{Ad}_K$-invariant inner products on $\mathfrak{p}$, which may, in
turn, be studied by looking at the decomposition of
$\mathfrak{p}$ into irreducible representations of K.

In the special case when G/K is isotropy irreducible
(i.e., $\mathfrak{p}$ is an irreducible representation of K),
the metric g and its Ricci tensor are proportional
by Schur's lemma, and hence g is automatically
Einstein. Isotropy-irreducible homogeneous
spaces have been classified by Kramer, Manturov,
Wolf, and Wang-Ziller.
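The Schur's lemma step can be sketched as follows, assuming the isotropy representation of K on $\mathfrak{p}$ (the chosen complement to the Lie algebra of K) is irreducible over the reals. The endomorphism r is a symbol introduced here for illustration.

```latex
\begin{align*}
&\mathrm{Ric}(x, y) = g(r\,x, y), \qquad x, y \in \mathfrak{p},
  && \text{defines a $g$-symmetric endomorphism $r$ of } \mathfrak{p}; \\
&r \circ \mathrm{Ad}_k = \mathrm{Ad}_k \circ r \quad (k \in K)
  && \text{since $g$ and Ric are both $\mathrm{Ad}_K$-invariant;} \\
&\ker(r - \lambda\,\mathrm{id}) \neq 0 \text{ for some real } \lambda
  && \text{since $r$ is symmetric;} \\
&\ker(r - \lambda\,\mathrm{id}) = \mathfrak{p}
  && \text{since it is $K$-invariant and $\mathfrak{p}$ is irreducible.}
\end{align*}
```

Hence $r = \lambda\,\mathrm{id}$, which is the Einstein condition for the invariant metric.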

In the general case, the Einstein equations become 
a system of polynomial equations. Determining 
whether this system has a real positive solution is, 
in general, a highly nontrivial problem. However, 
the situation of homogeneous metrics is one area in 
which the variational formulation of the Einstein 
equations has proved highly successful. 

We are now considering the scalar curvature 
functional on the finite-dimensional space of unit 
G-invariant metrics on G/K. The behavior of the 
scalar curvature functional is related to the structure 
of the lattice of intermediate subalgebras between 
the Lie algebras of K and G. 

An early result along these lines (Wang and Ziller 
1986) is that if K is maximal in G (compact), then 
G/K admits a G-invariant Einstein metric. The idea 
of the proof is to show that maximality of K forces
the scalar curvature functional on the space of 
volume-1 homogeneous metrics to be both bounded 
above and proper, and therefore to have a 
maximum. 

These ideas have been greatly extended by Böhm,
Wang, and Ziller. Given a compact connected
homogeneous space G/K, they define a graph
whose vertices are Ad(K)-invariant subalgebras
strictly intermediate between $\mathfrak{g}$ and $\mathfrak{k}$. The edges
correspond to inclusions between subalgebras.
A component of the graph is called toral if all
subalgebras $\mathfrak{h}$ in this component are such that the
identity component of H/K is abelian. They now
show that if the graph has at least two nontoral 
components, then G/K admits a G-invariant Einstein 
metric. The Einstein metrics in the theorem are 
produced by a mountain pass argument and may 
have co-index 1, contrasting with the maxima of the 
earlier theorem. 

Further advances in this direction have recently 
been made by Böhm. He associates to G/K a
simplicial complex, and shows that nonzero homo- 
logy groups of the complex imply the existence of 
higher co-index Einstein metrics. 

One can also study homogeneous noncompact 
Einstein spaces with $\lambda < 0$. It is conjectured by
Alekseevski that for all such examples K is a 
maximal compact subgroup of G. The reader is 
referred to Heber (1998) for further information on 
the noncompact case. 

The above results give some powerful existence 
results for Einstein metrics. However, there are 
examples known of homogeneous spaces G/K 
which admit no G-invariant Einstein metric (Wang 
and Ziller 1986). One such example is SU(4)/SU(2), 
where SU(2) is a maximal subgroup of
$Sp(2) \subset SU(4)$.

Techniques similar to those in the homogeneous 
case have been used to construct Einstein metrics on 
total spaces of certain bundles, via Riemannian 
submersions. Some highlights are Jensen’s exotic 
Einstein metrics on (4n + 3)-dimensional spheres, 
and the Wang-Ziller metrics on total spaces of torus 
bundles over products of Kähler-Einstein manifolds.
The latter construction gives examples of spaces
admitting volume-1 Einstein metrics with infinitely
many Einstein constants $\lambda$.


Examples of Higher Cohomogeneity 


One can also look for Einstein metrics of higher 
cohomogeneity. Most progress has been made in the 
cohomogeneity-1 case, that is, where the principal 
orbit G/K of the action has real codimension one in 
M (see Eschenburg and Wang (2000) for background
on such metrics). On the open dense set in M
which is the union of the principal orbits, we may
write the metric as


$$dt^2 + g_t,$$


where $g_t$ is a t-dependent homogeneous metric on
G/K. The Einstein equations are now a system of
ordinary differential equations in t.

One may also add a special orbit G/H at one or 
both ends of the interval over which t ranges. This
will impose boundary conditions on the ODEs. For
the manifold structure to extend smoothly over the
special orbit, H/K must be a sphere. Notice that if
$\lambda > 0$, then to obtain a complete metric M must be
compact, so we must add two special orbits. If $\lambda < 0$
and the metric is irreducible, then a Bochner
argument tells us that M is noncompact. In the Ricci-
flat case, the Cheeger-Gromoll theorem tells us that to
obtain a complete irreducible metric, we must have
exactly one special orbit, so M is topologically the
total space of a vector bundle over the special orbit.
In fact, most of the known examples even with $\lambda < 0$
have a special orbit too.

The system of ODEs we obtain is still highly 
nonlinear and difficult to analyze in general. How- 
ever, there are certain situations in which the 
equations, or a subsystem, can be solved in closed 
form. If we take G/K to be a principal circle bundle 
over a Hermitian symmetric space, Bérard Bergery 
(1982) showed that the resulting Einstein equations 
are solvable. (His work was inspired by the earlier 
example of Page, which corresponds to the case 
when G/K = U(2)/U(1), a circle bundle over $\mathbb{CP}^1$.)
In fact, Bérard Bergery's construction works in
greater generality as we obtain the same equations
if G/K is replaced by any Riemannian submersion
with circle fibers over a positive Kähler-Einstein
space. This illustrates a general principle that
systems arising as cohomogeneity-1 Einstein equations
also typically arise from certain bundle ansätze
without homogeneity assumptions. 

Wang and Wang generalized this construction to
the case when the hypersurface in M is a
Riemannian submersion with circle fibers over a
product of an arbitrary number of Kähler-Einstein
factors. Other solvable Einstein systems have been 
studied by, for example, Wang and Dancer. 

It may also be possible in certain situations to get 
existence results without an explicit solution. This 
observation underlies the important work of Böhm 
(1998). He constructs cohomogeneity-1 Einstein 
metrics on certain manifolds with dimension 
between 5 and 9, including all the spheres in this
range of dimensions. The equations are not now
solved in closed form, but it is possible to get a 
qualitative understanding of the flow and to show 
that certain trajectories will give metrics on the 
desired compact manifolds. 

Böhm has also shown, in an analogous result to 
the homogeneous case, that there are examples of 
manifolds with a cohomogeneity-1 G-action which 
do not support any G-invariant Einstein metric. 

So far, not much is known about Einstein metrics 
of higher cohomogeneity. An exception is the 
situation of self-dual Einstein metrics in dimension 
4, where the self-dual condition greatly simplifies 
the resulting equations. Calderbank, Pedersen, and 
Singer have achieved a good understanding of such 
metrics with $T^2$ symmetry, including construction of
such metrics on Hirzebruch-Jung resolutions of
cyclic quotient singularities. 


Analytical Methods 


So far there is no really general analytical method 
for proving existence of global Riemannian Einstein 
metrics (although, of course, such techniques do exist 
in more restrictive situations of special holonomy). 

Although the Einstein equations admit a variational 
formulation, this has (except for homogeneous metrics) 
not yielded general existence results. Note that the 
Wang-Ziller torus bundle examples at the end of the 
section "Homogeneous examples" show that the
Palais-Smale condition does not hold in full generality. 

One early suggestion was to adopt a minimax 
procedure. In each conformal class [g], one looks for 
a minimizer of the volume-normalized scalar curva- 
ture. Such a minimizer always exists. One then takes 
the supremum over all conformal classes. The 
resulting supremum of the functional is called the 
Yamabe invariant Y(M) of the manifold M. If a 
maximizer g exists, and $Y(M) \leq 0$, then g is Einstein.
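In symbols, the minimax procedure reads as follows. This is a sketch, writing $\mathcal{S}$ for the volume-normalized total scalar curvature (notation introduced here) on a compact manifold M of dimension $n \geq 3$; the exponent is chosen so that $\mathcal{S}$ is unchanged when g is multiplied by a positive constant.

```latex
\begin{align*}
\mathcal{S}(g) &= \frac{\int_M s_g \, d\mu_g}{\mathrm{Vol}(M, g)^{(n-2)/n}} , \\
Y(M) &= \sup_{[g]} \ \inf_{\tilde g \in [g]} \mathcal{S}(\tilde g) .
\end{align*}
```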

However, striking work of Petean shows that this 
procedure must fail to produce an Einstein metric in 
many cases. He proves that if $\dim M \geq 5$ and M is
simply connected, then the Yamabe invariant is
nonnegative. So, for such an M, any Einstein metric
produced will have $\lambda \geq 0$, and we know that this
puts constraints on the topology of M. 

Another possible technique is to use the Hamilton 
Ricci flow. If this converges as $t \to \infty$, the limiting
metric is Einstein. However, it seems hard in higher
dimensions to get control over the flow. In particular,
the Wang-Ziller example in the section
"Homogeneous examples" of a homogeneous space
with no invariant Einstein metric shows that the 
flow may fail to converge (the Hamilton flow 
preserves the property of G-invariance). 
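For reference, the flow and its fixed points can be written as follows. This is a sketch assuming the volume-normalized form of the Hamilton flow on a compact n-manifold, with $\bar{s}$ denoting the average scalar curvature (notation introduced here).

```latex
\begin{align*}
\frac{\partial g}{\partial t} &= -2\,\mathrm{Ric}(g) + \frac{2}{n}\,\bar{s}\, g ,
\qquad \bar{s} = \frac{\int_M s_g \, d\mu_g}{\int_M d\mu_g} , \\
\frac{\partial g}{\partial t} = 0 \ &\Longleftrightarrow\ \mathrm{Ric}(g) = \frac{\bar{s}}{n}\, g .
\end{align*}
```

A stationary metric is therefore Einstein with constant $\bar{s}/n$, which is why a convergent flow produces an Einstein limit.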


Graham-Lee and Biquard have used analytical 
methods to produce Einstein deformations of hyper- 
bolic space (real, complex, quaternionic, or Cayley). 
The idea is to show that a sufficiently small deforma- 
tion of the conformal infinity of hyperbolic space can 
be extended to a deformation of the hyperbolic metric. 

Recently, Anderson has shown the existence of 
Einstein metrics with $\lambda < 0$ on a large class of
manifolds obtained by Dehn filling from hyperbolic 
manifolds with toral ends. The strategy is to glue on 
to the hyperbolic metric copies of a simple explicit 
asymptotically hyperbolic metric, and to show that 
the resulting metric can be perturbed to an exact 
solution of the Einstein equations. 


See also: Einstein Equations: Exact Solutions; Einstein 
Equations: Initial Value Formulation; Hamiltonian 
Reduction of Einstein's Equations; Several Complex 
Variables: Compact Manifolds; Singularities of the Ricci 
Flow. 
Introduction
Notation 


Standard notation and terminology of differential 
geometry and general relativity are used in this 
article. All considerations are local, so that the four- 
dimensional spacetime M is assumed to be a smooth 
manifold diffeomorphic to $\mathbb{R}^4$. It is endowed with
a metric tensor g of signature (1,3) and a linear
connection defining the covariant differentiation of
tensor fields. Greek indices range from 0 to 3 and
refer to spacetime. Given a field of frames $(e_\mu)$ on M,
and the dual field of coframes $(\theta^\mu)$, one can write the
metric tensor as $g = g_{\mu\nu}\theta^\mu\theta^\nu$, where $g_{\mu\nu} = g(e_\mu, e_\nu)$
and Einstein's summation convention is assumed to
hold. Tensor indices are lowered with g and raised
with its inverse $g^{-1}$. General-relativistic units are
used, so that both Newton's constant of gravitation
and the speed of light are 1. This implies $\hbar = \ell^2$,
where $\ell \approx 10^{-33}$ cm is the Planck length. Both mass
and energy are measured in centimeters.


Historical Remarks 


The Einstein-Cartan theory (ECT) of gravity is a 
modification of general relativity theory (GRT), 
allowing spacetime to have torsion, in addition to 
curvature, and relating torsion to the density of 
intrinsic angular momentum. This modification 
was put forward in 1922 by Élie Cartan,
before the discovery of spin. Cartan was influenced
by the work of the Cosserat brothers (1909), who
considered, besides an (asymmetric) force stress
tensor, also a moment stress tensor in a suitably
generalized continuous medium. Work done in the
1950s by physicists (Kondo, Bilby, Kröner, and
other authors) established the role played by torsion 
in the continuum theory of crystal dislocations. A 
recent review (Ruggiero and Tartaglia 2003) 
describes the links between ECT and the classical 
theory of defects in an elastic medium. 

Cartan assumed the linear connection to be metric 
and derived, from a variational principle, a set of 
gravitational field equations. He required, without 
justification, that the covariant divergence of the 
energy-momentum tensor be zero; this led to an 
algebraic constraint equation, bilinear in curvature 
and torsion, severely restricting the geometry. This 
misguided observation has probably discouraged 
Cartan from pursuing his theory. It is now known 
that conservation laws in relativistic theories of 
gravitation follow from the Bianchi identities and, in 
the presence of torsion, the divergence of the 
energy-momentum tensor need not vanish. Torsion 
is implicit in the 1928 Einstein theory of gravitation 
with teleparallelism. For a long time, Cartan's 
modified theory of gravity, presented in his rather 
abstruse notation, unfamiliar to physicists, did not 
attract any attention. In the late 1950s, the theory of 
gravitation with spin and torsion was independently 
rediscovered by Sciama and Kibble. The role of 
Cartan was recognized soon afterward and ECT 
became the subject of much research; see Hehl et al. 
(1976) for a review and an extensive bibliography. 
In the 1970s, it was recognized that ECT can be 
incorporated within supergravity. In fact, simple 
supergravity is equivalent to ECT with a massless, 
anticommuting Rarita-Schwinger field as the source. 
Choquet-Bruhat considered a generalization of ECT 
to higher dimensions and showed that the Cauchy 
problem for the coupled system of Einstein-Cartan 
and Dirac equations is well posed. Penrose (1982) 
has shown that torsion appears in a natural way 
when spinors are allowed to be rescaled by a 
complex conformal factor. ECT has been general- 
ized by allowing nonmetric linear connections and 
additional currents, associated with dilation and 
shear, as sources of such a "metric-affine theory of
gravity" (Hehl et al. 1995).


Physical Motivation 


Recall that, in special relativity theory (SRT), the
underlying Minkowski spacetime admits, as its
group of automorphisms, the full Poincaré group,
consisting of translations and Lorentz transformations.
It follows from the first Noether theorem
that classical, special-relativistic field equations,
derived from a variational principle, give rise to
conservation laws of energy-momentum and angular
momentum. Using Cartesian coordinates $(x^\mu)$,
abbreviating $\partial\varphi/\partial x^\rho$ to $\varphi_{,\rho}$ and denoting by $t^{\mu\nu}$ and
$s^{\mu\nu\rho} = -s^{\nu\mu\rho}$ the tensors of energy-momentum and
of intrinsic angular momentum (spin), respectively,
one can write the conservation laws in the form

$t^{\mu\nu}{}_{,\nu} = 0$ [1]

and

$(x^\mu t^{\nu\rho} - x^\nu t^{\mu\rho} + s^{\mu\nu\rho})_{,\rho} = 0$ [2]

In the presence of spin, the tensor $t^{\mu\nu}$ need not be
symmetric,

$t^{\mu\nu} - t^{\nu\mu} = s^{\mu\nu\rho}{}_{,\rho}$

Belinfante and Rosenfeld have shown that the tensor

$T^{\mu\nu} = t^{\mu\nu} + \frac{1}{2}(s^{\nu\mu\rho} + s^{\nu\rho\mu} + s^{\mu\rho\nu})_{,\rho}$

is symmetric and its divergence vanishes.
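
The symmetry of the Belinfante-Rosenfeld tensor can be checked directly. The following sketch assumes only the antisymmetry $s^{\mu\nu\rho} = -s^{\nu\mu\rho}$ and the relation $t^{\mu\nu} - t^{\nu\mu} = s^{\mu\nu\rho}{}_{,\rho}$ that follows from [1] and [2], with $T^{\mu\nu} = t^{\mu\nu} + \frac{1}{2}(s^{\nu\mu\rho} + s^{\nu\rho\mu} + s^{\mu\rho\nu})_{,\rho}$.

```latex
\begin{align*}
T^{\mu\nu} - T^{\nu\mu}
 &= t^{\mu\nu} - t^{\nu\mu}
  + \tfrac12\bigl(s^{\nu\mu\rho} + s^{\nu\rho\mu} + s^{\mu\rho\nu}
                 - s^{\mu\nu\rho} - s^{\mu\rho\nu} - s^{\nu\rho\mu}\bigr)_{,\rho} \\
 &= s^{\mu\nu\rho}{}_{,\rho}
  + \tfrac12\bigl(s^{\nu\mu\rho} - s^{\mu\nu\rho}\bigr)_{,\rho}
  = s^{\mu\nu\rho}{}_{,\rho} - s^{\mu\nu\rho}{}_{,\rho} = 0 .
\end{align*}
```

The divergence $T^{\mu\nu}{}_{,\nu}$ vanishes for a similar reason: $t^{\mu\nu}{}_{,\nu} = 0$ by [1], and the combination $s^{\nu\mu\rho} + s^{\nu\rho\mu} + s^{\mu\rho\nu}$ is antisymmetric in $\nu$ and $\rho$, so its second derivative $\partial_\nu\partial_\rho$ is zero.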

In quantum theory, the irreducible, unitary repre- 
sentations of the Poincaré group correspond to 
elementary systems such as stable particles; these 
representations are labeled by the mass and spin. 

In Einstein's GRT, the spacetime M is curved; the 
Lorentz group — but not the Poincaré group — appears 
as the structure group acting on orthonormal frames 
in the tangent spaces of M. The energy-momentum 
tensor T appearing on the right-hand side of the 
Einstein equation is necessarily symmetric. In GRT 
there is no room for translations and the tensors t 
and s. 

By introducing torsion and relating it to s, Cartan 
restored the role of the Poincaré group in relativistic 
gravity: this group acts on the affine frames in the 
tangent spaces of M. Curvature and torsion are the 
surface densities of Lorentz transformations and 
translations, respectively. In a space with torsion, 
the Ricci tensor need not be symmetric so that an 
asymmetric energy-momentum tensor can appear 
on the right-hand side of the Einstein equation. 


Geometric Preliminaries 
Tensor-Valued Differential Forms 


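It is convenient to follow Cartan in describing
geometric objects as tensor-valued differential
forms. To define them, consider a homomorphism
$\sigma : GL_4(\mathbb{R}) \to GL_N(\mathbb{R})$ and an element $A = (A^\mu{}_\nu)$
of End $\mathbb{R}^4$, the Lie algebra of $GL_4(\mathbb{R})$. The derived
representation of Lie algebras is given by

$\frac{d}{dt}\,\sigma(\exp At)\big|_{t=0} = \sigma^\nu_\mu A^\mu{}_\nu$

If $(e_a)$ is a frame in $\mathbb{R}^N$, then $\sigma^\nu_\mu(e_a) = \sigma^{b\nu}_{a\mu} e_b$, where
$a, b = 1, \dots, N$.

A map $a = (a^\mu{}_\nu) : M \to GL_4(\mathbb{R})$ transforms fields
of frames so that

$e'_\mu = e_\nu a^\nu{}_\mu$ and $\theta^\mu = a^\mu{}_\nu \theta'^\nu$ [3]

A differential form $\varphi$ on M, with values in $\mathbb{R}^N$, is said
to be of type $\sigma$ if, under changes of frames, it
transforms so that $\varphi' = \sigma(a^{-1})\varphi$. For example, $\theta = (\theta^\mu)$
is a 1-form of type id. If now $A = (A^\mu{}_\nu) : M \to$ End $\mathbb{R}^4$,
then one puts $a(t) = \exp tA : M \to GL_4(\mathbb{R})$ and
defines the variations induced by an infinitesimal
change of frames,

$\delta\theta = \frac{d}{dt}\bigl(a(t)^{-1}\theta\bigr)\big|_{t=0} = -A\theta$

$\delta\varphi = \frac{d}{dt}\bigl(\sigma(a(t)^{-1})\varphi\bigr)\big|_{t=0} = -\sigma^\nu_\mu A^\mu{}_\nu \varphi$ [4]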


Hodge Duals 


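Since M is diffeomorphic to $\mathbb{R}^4$, one can choose an
orientation on M and restrict the frames to agree
with that orientation so that only transformations
with values in $GL_4^+(\mathbb{R})$ are allowed. The metric then
defines the Hodge dual of differential forms. Put
$\theta_\mu = g_{\mu\nu}\theta^\nu$. The forms $\eta$, $\eta_\mu$, $\eta_{\mu\nu}$, $\eta_{\mu\nu\rho}$, and $\eta_{\mu\nu\rho\sigma}$ are
defined to be the duals of 1, $\theta_\mu$, $\theta_\mu \wedge \theta_\nu$, $\theta_\mu \wedge \theta_\nu \wedge \theta_\rho$,
and $\theta_\mu \wedge \theta_\nu \wedge \theta_\rho \wedge \theta_\sigma$, respectively. The 4-form $\eta$ is
the volume element; for a holonomic coframe
$\theta^\mu = dx^\mu$, it is given by $\sqrt{-\det(g_{\mu\nu})}\,dx^0 \wedge dx^1 \wedge dx^2 \wedge dx^3$.
In SRT, in Cartesian coordinates, one
can define the tensor-valued 3-forms

$t^\mu = t^{\mu\nu}\eta_\nu$ and $s^{\mu\nu} = s^{\mu\nu\rho}\eta_\rho$ [5]

so that eqns [1] and [2] become

$dt^\mu = 0$ and $dj^{\mu\nu} = 0$

where

$j^{\mu\nu} = x^\mu t^\nu - x^\nu t^\mu + s^{\mu\nu}$ [6]

For an isolated system, the 3-forms $t^\mu$ and $j^{\mu\nu}$,
integrated over the 3-space $x^0$ = const., give the
system's total energy-momentum vector and angular
momentum bivector, respectively.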


Linear Connection, Its Curvature and Torsion 


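A linear connection on M is represented, with
respect to the field of frames, by the field of 1-forms

$\omega^\mu{}_\nu = \Gamma^\mu_{\rho\nu}\theta^\rho$

so that the covariant derivative of $e_\nu$ in the direction
of $e_\mu$ is $\nabla_\mu e_\nu = \Gamma^\rho_{\mu\nu} e_\rho$. Under a change of frames [3],
the connection forms transform as follows:

$a^\mu{}_\rho\,\omega'^\rho{}_\nu = \omega^\mu{}_\rho\,a^\rho{}_\nu + da^\mu{}_\nu$

If $\varphi = \varphi^a e_a$ is a k-form of type $\sigma$, then its covariant
exterior derivative

$D\varphi^a = d\varphi^a + \sigma^{a\nu}_{b\mu}\,\omega^\mu{}_\nu \wedge \varphi^b$

is a (k + 1)-form of the same type. For a 0-form one
has $D\varphi^a = \theta^\mu\nabla_\mu\varphi^a$. The infinitesimal change of $\omega$,
defined similarly as in [4], is $\delta\omega^\mu{}_\nu = DA^\mu{}_\nu$. The 2-form
of curvature $\Omega = (\Omega^\mu{}_\nu)$, where

$\Omega^\mu{}_\nu = d\omega^\mu{}_\nu + \omega^\mu{}_\rho \wedge \omega^\rho{}_\nu$

is of type ad: it transforms with the adjoint
representation of $GL_4(\mathbb{R})$ in End $\mathbb{R}^4$. The 2-form of
torsion $\Theta = (\Theta^\mu)$, where

$\Theta^\mu = d\theta^\mu + \omega^\mu{}_\nu \wedge \theta^\nu$

is of type id. These forms satisfy the Bianchi
identities

$D\Omega^\mu{}_\nu = 0$ and $D\Theta^\mu = \Omega^\mu{}_\nu \wedge \theta^\nu$

For a differential form $\varphi$ of type $\sigma$, the following
identity holds:

$D^2\varphi^a = \sigma^{a\nu}_{b\mu}\,\Omega^\mu{}_\nu \wedge \varphi^b$ [7]

The tensors of curvature and torsion are given by

$\Omega^\mu{}_\nu = \frac{1}{2}R^\mu{}_{\nu\rho\sigma}\,\theta^\rho \wedge \theta^\sigma$

and

$\Theta^\mu = \frac{1}{2}Q^\mu{}_{\rho\sigma}\,\theta^\rho \wedge \theta^\sigma$

respectively. With respect to a holonomic frame,
$d\theta^\mu = 0$, one has

$Q^\mu{}_{\rho\sigma} = \Gamma^\mu_{\rho\sigma} - \Gamma^\mu_{\sigma\rho}$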

In SRT, the Cartesian coordinates define a radius-vector
field $X^\mu = -x^\mu$, pointing towards the origin of the
coordinate system. The differential equation it satisfies
generalizes to a manifold with a linear connection:

$DX^\mu + \theta^\mu = 0$ [8]

By virtue of [7], the integrability condition of [8] is

$\Omega^\mu{}_\nu X^\nu + \Theta^\mu = 0$

Integration of [8] along a curve defines the Cartan
displacement of X; if this is done along a small
closed circuit spanned by the bivector $\Delta f$, then the
radius vector changes by about

$\Delta X^\mu = \frac{1}{2}(R^\mu{}_{\nu\rho\sigma}X^\nu + Q^\mu{}_{\rho\sigma})\,\Delta f^{\rho\sigma}$

This holonomy theorem, rather imprecisely formulated
here, shows that torsion bears to translations
a relation similar to that of curvature to linear
homogeneous transformations.

In a space with torsion, it matters whether one
considers the potential of the electromagnetic field to be
a scalar-valued 1-form $\varphi$ or a covector-valued 0-form
$(\varphi_\mu)$. The first choice leads to a field $d\varphi$ that is invariant
with respect to the gauge transformation $\varphi \mapsto \varphi + d\chi$.
The second gives
$\frac{1}{2}(\nabla_\mu\varphi_\nu - \nabla_\nu\varphi_\mu)\,\theta^\mu \wedge \theta^\nu = (D\varphi_\mu) \wedge \theta^\mu = d\varphi - \varphi_\mu\Theta^\mu$,
a gauge-dependent field.


Metric-Affine Geometry 


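A metric-affine space $(M, g, \omega)$ is defined to have a
metric and a linear connection that need not depend on
each other. The metric alone determines the torsion-free
Levi-Civita connection $\mathring\omega$ characterized by

$d\theta^\mu + \mathring\omega^\mu{}_\nu \wedge \theta^\nu = 0$ and $\mathring{D}g_{\mu\nu} = 0$

Its curvature is

$\mathring\Omega^\mu{}_\nu = d\mathring\omega^\mu{}_\nu + \mathring\omega^\mu{}_\rho \wedge \mathring\omega^\rho{}_\nu$

The 1-form of type ad,

$\kappa^\mu{}_\nu = \omega^\mu{}_\nu - \mathring\omega^\mu{}_\nu$ [9]

determines the torsion of $\omega$ and the covariant
derivative of g,

$\Theta^\mu = \kappa^\mu{}_\nu \wedge \theta^\nu$, $Dg_{\mu\nu} = -\kappa_{\mu\nu} - \kappa_{\nu\mu}$

The curvature of $\omega$ can be written as

$\Omega^\mu{}_\nu = \mathring\Omega^\mu{}_\nu + \mathring{D}\kappa^\mu{}_\nu + \kappa^\mu{}_\rho \wedge \kappa^\rho{}_\nu$ [10]

The transposed connection $\tilde\omega$ is defined by

$\tilde\omega^\mu{}_\nu = \omega^\mu{}_\nu + Q^\mu{}_{\nu\rho}\theta^\rho$

so that, with respect to a holonomic frame, one has
$\tilde\Gamma^\mu_{\nu\rho} = \Gamma^\mu_{\rho\nu}$. The torsion of $\tilde\omega$ is opposed to that of $\omega$.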


Riemann-Cartan Geometry 


A Riemann-Cartan space is a metric-affine space
with a connection that is metric,

$Dg_{\mu\nu} = 0$ [11]

The metricity condition implies that $\kappa_{\mu\nu} + \kappa_{\nu\mu} = 0$
and $\Omega_{\mu\nu} + \Omega_{\nu\mu} = 0$. In a Riemann-Cartan space, the
connection is determined by its torsion Q and the
metric tensor. Let $Q_{\rho\mu\nu} = g_{\rho\sigma}Q^\sigma{}_{\mu\nu}$; then

$\kappa_{\mu\nu} = \frac{1}{2}(Q_{\mu\sigma\nu} + Q_{\sigma\mu\nu} + Q_{\nu\mu\sigma})\,\theta^\sigma$ [12]

The transposed connection of a Riemann-Cartan
space is metric if and only if the tensor $Q_{\rho\mu\nu}$ is
completely antisymmetric. Let $\tilde\nabla$ denote the
covariant derivative with respect to $\tilde\omega$. By definition,
a symmetry of a Riemann-Cartan space is a
diffeomorphism of M preserving both g and $\omega$. The
one-parameter group of local transformations of M,
generated by the vector field v, consists of symmetries
of $(M, g, \omega)$ if and only if

$\tilde\nabla^\mu v^\nu + \tilde\nabla^\nu v^\mu = 0$ [13]

and

$D\tilde\nabla_\nu v^\mu + R^\mu{}_{\nu\rho\sigma}v^\rho\theta^\sigma = 0$ [14]

In a Riemannian space, the connections $\omega$ and $\tilde\omega$
coincide and [14] is a consequence of the Killing
equation [13]. The metricity condition implies

$D\eta_{\mu\nu\rho} = \eta_{\mu\nu\rho\sigma}\Theta^\sigma$ [15]


The Einstein-Cartan Theory of 
Gravitation 


An Identity Resulting from Local Invariance 


Let $(M, g, \omega)$ be a metric-affine spacetime. Consider a
Lagrangian L which is an invariant 4-form on M; it
depends on $g$, $\theta$, $\omega$, $\varphi$, and the first derivatives of
$\varphi = \varphi^a e_a$. The general variation of the Lagrangian is

$\delta L = L_a \wedge \delta\varphi^a + \frac{1}{2}\tau^{\mu\nu}\delta g_{\mu\nu} + \delta\theta^\mu \wedge t_\mu - \frac{1}{2}\delta\omega^\mu{}_\nu \wedge s^\nu{}_\mu$ + an exact form [16]

so that $L_a = 0$ is the Euler-Lagrange equation for $\varphi$.
If the changes of the functions $g$, $\theta$, $\omega$, and $\varphi$ are
induced by an infinitesimal change of the frames [4],
then $\delta L = 0$ and [16] gives the identity

$g_{\mu\rho}\tau^{\rho\nu} - \theta^\nu \wedge t_\mu + \frac{1}{2}Ds^\nu{}_\mu - \sigma^{a\nu}_{b\mu}L_a \wedge \varphi^b = 0$

It follows from the identity that the two sets of
Euler-Lagrange equations obtained by varying L
with respect to the triples $(\varphi, \theta, \omega)$ and $(\varphi, g, \omega)$ are
equivalent. In the sequel, the first triple is chosen to
derive the field equations.


Projective Transformations and the Metricity 
Condition 


Still under the assumption that $(M, g, \omega)$ is a metric-affine
spacetime, consider the 4-form

$8\pi K = \frac{1}{2}g^{\nu\rho}\eta_{\mu\nu} \wedge \Omega^\mu{}_\rho$ [17]

which is equal to $\frac{1}{2}\eta R$, where $R = g^{\mu\nu}R_{\mu\nu}$ is the Ricci
scalar; the Ricci tensor $R_{\mu\nu} = R^\rho{}_{\mu\rho\nu}$ is, in general,
asymmetric. The form [17] is invariant with
respect to projective transformations of the
connection,

$\omega^\mu{}_\nu \mapsto \omega^\mu{}_\nu + \delta^\mu_\nu\lambda$ [18]

where $\lambda$ is an arbitrary 1-form. Projectively related
connections have the same (unparametrized) geodesics.
If the total Lagrangian for gravitation interacting
with the matter field $\varphi$ is K + L, then the field
equations, obtained by varying it with respect to
$\varphi$, $\theta$, and $\omega$ are: $L_a = 0$,

$\frac{1}{2}g^{\rho\sigma}\eta_{\mu\nu\rho} \wedge \Omega^\nu{}_\sigma = -8\pi t_\mu$ [19]

and

$D(g^{\mu\rho}\eta_{\rho\nu}) = 8\pi s^\mu{}_\nu$ [20]

respectively. Put $s_{\mu\nu} = g_{\mu\rho}s^\rho{}_\nu$. If

$s_{\mu\nu} + s_{\nu\mu} = 0$ [21]

then $s^\nu{}_\nu = 0$ and L is also invariant with respect to
[18]. One shows that, if [21] holds, then, among the
projectively related connections satisfying [20], there
is precisely one that is metric. To implement
properly the metricity condition in the variational
principle, one can use the Palatini approach with
constraints (Kopczyński 1975). Alternatively, following
Hehl, one can use [9] and [12] to eliminate $\omega$
and obtain a Lagrangian depending on $\varphi$, $\theta$, and the
tensor of torsion.


The Sciama-Kibble Field Equations 


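From now on the metricity condition [11] is
assumed, so that [21] holds and the Cartan field
equation [20] is

$\eta_{\mu\nu\rho} \wedge \Theta^\rho = 8\pi s_{\mu\nu}$ [22]

Introducing the asymmetric energy-momentum tensor
$t_{\mu\nu}$ and the spin density tensor $s_{\mu\nu\rho} = g_{\rho\sigma}s_{\mu\nu}{}^\sigma$
similarly as in [5], one can write the Einstein-Cartan
equations [19] and [22] in the form given by Sciama
and Kibble,

$R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R = 8\pi t_{\mu\nu}$ [23]

$Q^\rho{}_{\mu\nu} + \delta^\rho_\mu Q^\sigma{}_{\nu\sigma} - \delta^\rho_\nu Q^\sigma{}_{\mu\sigma} = 8\pi s_{\mu\nu}{}^\rho$ [24]

Equation [24] can be solved to give

$Q^\rho{}_{\mu\nu} = 8\pi(s_{\mu\nu}{}^\rho + \frac{1}{2}\delta^\rho_\mu s_{\nu\sigma}{}^\sigma + \frac{1}{2}\delta^\rho_\nu s_{\sigma\mu}{}^\sigma)$ [25]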

Therefore, torsion vanishes in the absence of spin 
and then [23] is the classical Einstein field 
equation. In particular, there is no difference 
between the Einstein and Einstein-Cartan theories 
in empty space. Since practically all tests of 
relativistic gravity are based on consideration of
Einstein’s equations in empty space, there is no
difference, in this respect, between the Einstein 
and the Einstein-Cartan theories: the latter is as 
viable as the former. 

In any case, the consideration of torsion amounts 
to a slight change of the energy-momentum tensor 
that can be also obtained by the introduction of a new 
term in the Lagrangian. This observation was made in 
1950 by Weyl in the context of the Dirac equation. 

In Einstein's theory, one can also satisfactorily 
describe spinning matter without introducing tor- 
sion (Bailey and Israel 1975). 


Consequences of the Bianchi Identities: 
Conservation Laws 


Computing the covariant exterior derivatives of 
both sides of the Einstein-Cartan equations, using 
[15] and the Bianchi identities, one obtains 


$8\pi Dt_\mu = -\frac{1}{2}\eta_{\mu\nu\rho\sigma}\Theta^\nu \wedge \Omega^{\rho\sigma}$ [26]
and 


$8\pi Ds_{\mu\nu} = \eta_{\nu\sigma} \wedge \Omega^\sigma{}_\mu - \eta_{\mu\sigma} \wedge \Omega^\sigma{}_\nu$ [27]


Cartan required the right-hand side of [26] to 
vanish. If, instead, one uses the field equations [19] 
and [22] to evaluate the right-hand sides of [26] and 
[27], one obtains 


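$Dt_\mu = Q^\rho{}_{\mu\nu}\theta^\nu \wedge t_\rho - \frac{1}{2}R^{\rho\sigma}{}_{\mu\nu}\theta^\nu \wedge s_{\rho\sigma}$ [28]

and

$Ds_{\mu\nu} = \theta_\nu \wedge t_\mu - \theta_\mu \wedge t_\nu$ [29]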


Let v be a vector field generating a group of 
symmetries of the Riemann-Cartan space (M,g,w) 
so that eqns [13] and [14] hold. Equations [28] and 
[29] then imply that the 3-form 


$j = v^\mu t_\mu + \frac{1}{2}\tilde\nabla^\mu v^\nu s_{\mu\nu}$


is closed, dj = 0. In particular, in the limit of SRT, in
Cartesian coordinates $x^\mu$, to a constant vector field v
there corresponds the projection, onto v, of the
energy-momentum density. If A is a constant
bivector, then $v^\mu = A^\mu{}_\nu x^\nu$ gives, up to a numerical
factor, $j = j^{\mu\nu}A_{\mu\nu}$, where $j^{\mu\nu}$
is as in [6].


Spinning Fluid and the Generalized Mathisson- 
Papapetrou Equation of Motion 


As in classical general relativity, the right-hand sides 
of the Einstein-Cartan equations need not necessa- 
rily be derived from a variational principle; they 
may be determined by phenomenological
considerations. For example, following Weyssenhoff,
consider a spinning fluid characterized by

$t^{\mu\nu} = P^\mu u^\nu$ and $s^{\mu\nu\rho} = S^{\mu\nu}u^\rho$

where $S^{\mu\nu} + S^{\nu\mu} = 0$ and u is the unit, timelike
velocity field. Let $U = u^\mu\eta_\mu$ so that

$t_\mu = P_\mu U$ and $s_{\mu\nu} = S_{\mu\nu}U$

Define the particle derivative of a tensor field $\varphi^a$ in
the direction of u by

$\dot\varphi^a\,\eta = D(\varphi^a U)$

For a scalar field $\varphi$, the equation $\dot\varphi = 0$ is equivalent
to the conservation law $d(\varphi U) = 0$. Define
$\rho = g_{\mu\nu}P^\mu u^\nu$; then [29] gives an equation of motion
of spin

$\dot S_{\mu\nu} = u_\nu P_\mu - u_\mu P_\nu$

so that

$P_\mu = \rho u_\mu + \dot S_{\mu\nu}u^\nu$

From [28] one obtains the equation of translatory
motion,

$\dot P_\mu = (Q^\rho{}_{\mu\nu}P_\rho - \frac{1}{2}R^{\rho\sigma}{}_{\mu\nu}S_{\rho\sigma})\,u^\nu$

which is a generalization to the ECT of the 
Mathisson-Papapetrou equation for point particles 
with an intrinsic angular momentum. 


From ECT to GRT: The Effective 
Energy-Momentum Tensor 


Inside spinning matter, one can use [12] and [25] to 
eliminate torsion and replace the Sciama-Kibble 
system by a single Einstein equation with an 
effective energy-momentum tensor on the right- 
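hand side. Using the split [10], one can write [23] as

$\mathring R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}\mathring R = 8\pi T^{\mathrm{eff}}_{\mu\nu}$ [30]

Here $\mathring R_{\mu\nu}$ and $\mathring R$ are, respectively, the Ricci tensor
and scalar formed from g. The term in [10] that is
quadratic in $\kappa$ contributes to $T^{\mathrm{eff}}$ an expression
quadratic in the components of the tensor $s_{\mu\nu\rho}$, so
that, neglecting indices, one can symbolically write

$T^{\mathrm{eff}} = T + s^2$ [31]

The symmetric tensor T is the sum of t and a term
coming from $\mathring{D}\kappa^\mu{}_\nu$ in [10]:

$T^{\mu\nu} = t^{\mu\nu} + \frac{1}{2}\mathring\nabla_\rho(s^{\nu\mu\rho} + s^{\nu\rho\mu} + s^{\mu\rho\nu})$ [32]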


It is remarkable that the Belinfante-Rosenfeld 
symmetrization of the canonical energy-momentum 
tensor appears as a natural consequence of ECT. 


From the physical point of view, the second term on 
the right-hand side of [31], can be thought of as 
providing a spin-spin contact interaction, reminis- 
cent of the one appearing in the Fermi theory of 
weak interactions. 

It is clear from eqns [30]-[32] that whenever 
terms quadratic in spin can be neglected — in 
particular, in the linear approximation — ECT is 
equivalent to GRT. To obtain essentially new 
effects, the density of spin squared should be 
comparable to the density of mass. For example, to 
achieve this, a nucleon of mass m should be 
squeezed so that its radius $r_{\mathrm{Cart}}$ be such that

$\left(\frac{\ell^2}{r_{\mathrm{Cart}}^3}\right)^2 \approx \frac{m}{r_{\mathrm{Cart}}^3}$

Introducing the Compton wavelength $r_{\mathrm{Compt}} = \ell^2/m \approx 10^{-13}$ cm,
one can write

$r_{\mathrm{Cart}} \approx (\ell^2 r_{\mathrm{Compt}})^{1/3}$

The "Cartan radius" of the nucleon, $r_{\mathrm{Cart}} \approx 10^{-26}$ cm,
so small when compared to its physical
radius under normal conditions, is much larger than
the Planck length. Curiously enough, the energy
$\ell^2/r_{\mathrm{Cart}}$ is of the order of the energy at which,
according to some estimates, the grand unification
of interactions is presumed to occur.
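
The order of magnitude of the Cartan radius can be checked numerically. The following Python sketch assumes units with c = G = 1, in which the Planck constant equals the square of the Planck length; the value 1.6e-33 cm for the Planck length is an assumption not stated in the text, the Compton wavelength is the rounded value quoted above, and the function name is illustrative.

```python
def cartan_radius_cm(planck_length_cm: float, compton_cm: float) -> float:
    """Return r_Cart = (l^2 * r_Compt)^(1/3), with all lengths in cm."""
    return (planck_length_cm ** 2 * compton_cm) ** (1.0 / 3.0)


PLANCK_LENGTH_CM = 1.6e-33   # assumed value of the Planck length
COMPTON_CM = 1.0e-13         # order of magnitude quoted in the text

r_cart = cartan_radius_cm(PLANCK_LENGTH_CM, COMPTON_CM)
print(f"Cartan radius of the nucleon ~ {r_cart:.1e} cm")
print(f"ratio to the Planck length   ~ {r_cart / PLANCK_LENGTH_CM:.1e}")
```

Under these assumptions the result lies within an order of magnitude of the quoted figure and exceeds the Planck length by several orders of magnitude, in line with the statement above.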


Cosmology with Spin and Torsion 


In the presence of spinning matter, $T^{\mathrm{eff}}$ need not
satisfy the positive-energy conditions, even if T does.
Therefore, the classical singularity theorems of
Penrose and Hawking can be overcome here.
In ECT, there are simple cosmological solutions
without singularities. The simplest such solution,
found in 1973 by Kopczyński, is as follows. Consider
a universe filled with a spinning dust such that
$P^\mu = \rho u^\mu$, $u^\mu = \delta^\mu_0$, $S_{23} = \sigma$, and $S_{\mu\nu} = 0$ for $\mu + \nu \neq 5$,
and both $\rho$ and $\sigma$ are functions of $t = x^0$ alone.
These assumptions are compatible with the
Robertson-Walker line element $dt^2 - \mathcal{R}(t)^2(dx^2 + dy^2 + dz^2)$,
where $(x, y, z) = (x^1, x^2, x^3)$ and torsion is
determined from [25]. The Einstein equation [23]
reduces to the modified Friedmann equation,

$\frac{1}{2}\dot{\mathcal{R}}^2 - M\mathcal{R}^{-1} + \frac{3}{2}S^2\mathcal{R}^{-4} = 0$ [33]

supplemented by the conservation laws of mass
and spin,

$M = \frac{4}{3}\pi\rho\mathcal{R}^3$ = const., $S = \frac{4}{3}\pi\sigma\mathcal{R}^3$ = const.

The last term on the left-hand side of [33] plays the
role of a repulsive potential, effective at small values of
$\mathcal{R}$; it prevents the solution from vanishing. It should be
noted, however, that even a very small amount of
shear in u results in a term counteracting the repulsive
potential due to spin. Neglecting shear and making the
(unrealistic) assumption that matter in the universe at
t = 0 consists of about $10^{80}$ nucleons of mass m with
aligned spins, one obtains the estimate $\mathcal{R}(0) \approx 1$ cm
and a density of the order of $m^2/\ell^4$, very large, but
much smaller than the Planck density $1/\ell^2$.
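
The estimate of the minimum radius can be reproduced from [33]: setting the time derivative of the scale factor to zero gives a minimum radius whose cube is $3S^2/(2M)$. The Python sketch below is an illustration under stated assumptions, not the computation of the original paper: it takes each nucleon to contribute a spin of one half of the Planck constant, works in units with c = G = 1 (so that mass becomes a length and spin an area), and uses standard rounded values of the physical constants that do not appear in the text.

```python
G_CGS = 6.674e-8               # gravitational constant, cm^3 g^-1 s^-2 (assumed)
C_CGS = 2.998e10               # speed of light, cm/s (assumed)
PLANCK_LENGTH_CM = 1.616e-33   # Planck length, cm (assumed)
NUCLEON_MASS_G = 1.67e-24      # nucleon mass, g (assumed)


def minimum_scale_factor_cm(n_nucleons: float) -> float:
    """Radius at which the square bracket of [33] vanishes with zero velocity.

    Solves M / R = (3/2) S^2 / R^4, i.e. R^3 = 3 S^2 / (2 M).
    """
    nucleon_mass_cm = G_CGS * NUCLEON_MASS_G / C_CGS ** 2   # mass as a length
    total_mass = n_nucleons * nucleon_mass_cm               # M, in cm
    # Assumption: aligned spins of hbar/2 each, with hbar = (Planck length)^2.
    total_spin = n_nucleons * PLANCK_LENGTH_CM ** 2 / 2.0   # S, in cm^2
    return (3.0 * total_spin ** 2 / (2.0 * total_mass)) ** (1.0 / 3.0)


r_min = minimum_scale_factor_cm(1e80)
print(f"minimum scale factor ~ {r_min:.2f} cm")
```

With these inputs the minimum radius comes out at the centimeter scale, consistent with the estimate quoted above.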

Tafel (1975) found large classes of cosmological 
solutions with a spinning fluid, admitting a group of 
symmetries transitive on the hypersurfaces of constant 
time. The models corresponding to symmetries of 
Bianchi types I, VII, and V are nonsingular, provided 
that the influence of spin exceeds that of shear. 


Summary 


ECT is a viable theory of gravitation that differs 
very slightly from the Einstein theory; the effects of 
spin and torsion can be significant only at densities 
of matter that are very high, but nevertheless much 
smaller than the Planck density at which quantum 
gravitational effects are believed to dominate. It is 
possible that ECT will prove to be a better classical 
limit of a future quantum theory of gravitation than 
the theory without torsion. 


See also: Cosmology: Mathematical Aspects; General
Relativity: Overview.

Bailey I and Israel W (1975) Lagrangian dynamics of spinning
particles and polarized media in general relativity.
Communications in Mathematical Physics 42: 65-82.

Cartan E (1923, 1924, 1925) Sur les variétés à connexion affine et
la théorie de la relativité généralisée. Part I: Annales de l'École
Normale Supérieure 40: 325-412 and ibid. 41: 1-25; Part II:
ibid. 42: 17-88; English transl. by A Magnon and A Ashtekar,
On manifolds with an affine connection and the theory of
general relativity. Napoli: Bibliopolis (1986).

Cosserat E and Cosserat F (1909) Théorie des corps déformables. Paris: Hermann.

Hammond RT (2002) Torsion gravity. Reports on Progress in
Physics 65: 599-649.

Hehl FW, von der Heyde P, Kerlick GD, and Nester JM (1976)
General relativity with spin and torsion: foundations and
prospects. Reviews of Modern Physics 48: 393-416.

Hehl FW, McCrea JD, Mielke EW, and Ne'eman Y (1995)
Metric-affine gauge theory of gravity: field equations, Noether
identities, world spinors, and breaking of dilation invariance.
Physics Reports 258: 1-171.

Kibble TWB (1961) Lorentz invariance and the gravitational field.
Journal of Mathematical Physics 2: 212-221.

Kopczyński W (1975) The Palatini principle with constraints.
Bulletin de l'Académie Polonaise des Sciences, Série des Sciences
Mathématiques, Astronomiques et Physiques 23: 467-473.

Mathisson M (1937) Neue Mechanik materieller Systeme. Acta
Physica Polonica 6: 163-200.

Penrose R (1983) Spinors and torsion in general relativity.
Foundations of Physics 13: 325-339.

Ruggiero ML and Tartaglia A (2003) Einstein-Cartan theory as a
theory of defects in space-time. American Journal of Physics
71: 1303-1313.

Sciama DW (1962) On the analogy between charge and spin in
general relativity. In: (volume dedicated to Infeld L) Recent
Developments in General Relativity, pp. 415-439. Oxford:
Pergamon and Warszawa: PWN.

Tafel J (1975) A class of cosmological models with torsion and
spin. Acta Physica Polonica B 6: 537-554.

Trautman A (1973) On the structure of the Einstein-Cartan
equations. Symposia Mathematica 12: 139-162.

Van Nieuwenhuizen P (1981) Supergravity. Physics Reports
68: 189-398.


Einstein's Equations with Matter


Introduction


Newton's theory of gravity with absolute time and
Euclidean 3-space connects the gravitational potential
U with its source, the density of matter r, by the
Poisson equation

$\Delta U = -4\pi\kappa r$

where $\Delta$ is the Laplace operator and $\kappa$ is the
gravitational constant. The trajectories of massive
test particles are the flow lines of the gradient of U.


Newton's theory has proven to be very accurate in
the laboratory as well as in the solar system (except for
a small discrepancy with the observed advance of
Mercury's perihelion). Newton's theory, together with
special relativity, the equivalence principle, and ideas
of Mach, inspired Einstein to
uncover the equations which must be satisfied by the
geometry of spacetime. They link the curvature of the
spacetime metric with a phenomenological symmetric
2-tensor T, which must represent the energy, momentum,
and stresses of all the sources, by the equality:

$S(g) \equiv \mathrm{Ricci}(g) - \frac{1}{2}gR(g) = 8\pi\kappa T$


where Ricci(g) is the Ricci tensor of the spacetime
metric g and R(g) its scalar curvature. The symmetric
2-tensor S(g) is called the Einstein tensor. The
Bianchi identities, due to the invariance of curvature
by isometries of g, imply that the divergence of the
Einstein tensor is identically zero: the Einstein
equations imply therefore the vanishing of the
divergence of the source tensor T. The equations so
obtained generalize in a relativistic context the
conservation laws of Newtonian mechanics. In
local spacetime coordinates $x^\alpha$, the Einstein equations
and conservation laws read

$S_{\alpha\beta} \equiv R_{\alpha\beta} - \frac{1}{2}g_{\alpha\beta}R = 8\pi\kappa T_{\alpha\beta}$, $\nabla_\alpha T^{\alpha\beta} = 0$

where $\nabla$ denotes the covariant derivative in the
metric g.

The gravitational constant $\kappa$ is inspired by the
Newtonian equation relating the potential U with
the density of matter. This equation can be obtained
as an approximation of Einstein's equations with
matter in the case of low velocities of matter and
weak gravitational fields. Newton's equation of
motion of test particles is also an approximation of
Einstein's geodesic motion of such particles, which
can be deduced from Einstein's equations themselves.
However, if one wants to remain in the
framework of the general relativity theory, it is
Einstein's equations which define the mass of a
body; there is no comparison possible with some
fixed given mass. As length had the dimension of
time already in special relativity, now mass is found
to have the dimension of length. We write the equations
in geometrical units, where $8\pi\kappa = 1$, keeping in mind
the corresponding change to usual laboratory
units only in specific applications. In geometrical
units the mass of the Earth is of the order of the
centimeter. The most precise measurements of $\kappa$ are still
made using Newton-type experiments, giving
$\kappa = 6.67259 \times 10^{-11}$ m$^3$ kg$^{-1}$ s$^{-2}$.
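
The statement about the mass of the Earth can be made concrete with a short conversion. The Python sketch below uses the value of the gravitational constant quoted above; the speed of light and the mass of the Earth are standard values assumed here, not taken from the text, and the function name is illustrative. It reports the length both with the gravitational constant alone set to 1 and with the convention in which $8\pi$ times the constant is set to 1.

```python
import math

KAPPA_SI = 6.67259e-11      # gravitational constant quoted in the text, m^3 kg^-1 s^-2
C_SI = 2.99792458e8         # speed of light, m/s (assumed, not given in the text)
EARTH_MASS_KG = 5.972e24    # mass of the Earth, kg (assumed, not given in the text)


def mass_as_length_cm(mass_kg: float, eight_pi_kappa_is_one: bool = False) -> float:
    """Convert a mass to a length in cm via kappa*M/c^2, or 8*pi*kappa*M/c^2."""
    factor = 8.0 * math.pi if eight_pi_kappa_is_one else 1.0
    return factor * KAPPA_SI * mass_kg / C_SI ** 2 * 100.0   # metres to centimetres


earth_cm = mass_as_length_cm(EARTH_MASS_KG)
earth_cm_8pi = mass_as_length_cm(EARTH_MASS_KG, eight_pi_kappa_is_one=True)
print(f"kappa = 1:       {earth_cm:.2f} cm")
print(f"8 pi kappa = 1:  {earth_cm_8pi:.1f} cm")
```

Either convention gives a length between a fraction of a centimeter and about ten centimeters, which is the sense in which the mass of the Earth is "of the order of the centimeter".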

In the case of electromagnetic (or classical Yang- 
Mills) field sources, the stress energy tensor in 
special relativity is the well-known Maxwell tensor 
T (or its generalizations), whose divergence vanishes 
when the field satisfies the Maxwell (or Yang-Mills) 
equations in vacuum. The expression of this tensor 
in a curved spacetime can be trivially deduced from 
its Minkowskian form. Its expression can also be
deduced from the Lagrangian, and the vanishing of 
its divergence results from the invariance of this 
Lagrangian under isometries of the metric. It is the 
natural source of Einstein equations coupled with 
these fields. In the case of matter, the construction 
of a stress energy tensor is already delicate even in 
special relativity. 

The simplest models of sources with well-understood
properties, kinetic matter and perfect
fluids, are reviewed in this article. Physical
situations that are difficult to model, even in special
relativity, such as dissipative fluids and elasticity, are mentioned.
The extension to electrically charged, or classical
Yang-Mills-Higgs charged, matter offers no conceptual
difficulty, but leads to interesting new situations.


Fluid Sources 


A fluid source in a domain of a spacetime (V, g) is
such that there exists, in this domain, a unit timelike
vector field u, satisfying $g(u, u) \equiv g_{\alpha\beta}u^\alpha u^\beta = -1$,
whose trajectories are the flow lines of matter.
A moving Lorentzian orthonormal frame is called a
proper frame if its timelike vector is u. Since the
Einstein gravitational potentials reduce at a point in 
a Lorentzian orthonormal frame to Minkowskian 
values, one admits that the spacetime symmetric 
2-tensor T, which embodies the density of stress, 
energy, and momentum of a given type of matter, in 
a proper frame takes the expression it would have in 
special relativity and inertial coordinates. The 
expression of T in a general frame results from its 
tensorial character and the equivalence principle. 
The problem is to find a good expression of T in 
special relativity. 


Case of Dust (Incoherent Matter) 


In a proper frame there is neither momentum nor
stresses. Therefore, the stress energy tensor reads in
a general frame, with r a scalar function representing
the matter density:

$T = r\,u \otimes u$, i.e., $T_{\alpha\beta} = r u_\alpha u_\beta$

Using the property $g(u, u) = -1$, the conservation
laws imply the vanishing of the divergence of the
matter flow ru, that is, the continuity equation
(conservation of matter)

$\nabla_\alpha(r u^\alpha) = 0$

and the motion of the particles along geodesics of
the metric:

$u^\alpha\nabla_\alpha u^\beta = 0$

Similar equations are obtained for a null dust
model where $g(u, u) = 0$.
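
For timelike dust, the splitting of the conservation laws into a continuity equation and a geodesic equation can be verified in two lines. The sketch below assumes only $T^{\alpha\beta} = r u^\alpha u^\beta$ and $u_\beta u^\beta = -1$, which implies $u_\beta\nabla_\alpha u^\beta = 0$.

```latex
\begin{align*}
0 = \nabla_\alpha T^{\alpha\beta}
  &= u^\beta\,\nabla_\alpha(r u^\alpha) + r\,u^\alpha\nabla_\alpha u^\beta , \\
0 = u_\beta\nabla_\alpha T^{\alpha\beta}
  &= -\nabla_\alpha(r u^\alpha) + r\,u^\alpha u_\beta\nabla_\alpha u^\beta
   = -\nabla_\alpha(r u^\alpha) .
\end{align*}
```

The second line is the continuity equation; inserting it into the first leaves $r\,u^\alpha\nabla_\alpha u^\beta = 0$, which is the geodesic equation wherever $r \neq 0$.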


Perfect Fluid 


Euler equations In Newtonian mechanics, a continuous
matter flow is characterized by its mass
density and flow velocity. The equations are a
continuity equation (conservation of matter) and
equations of motion resulting from Newton's law,
which link the acceleration vector and the space
divergence of the stress symmetric 2-tensor whose
contraction with the normal to a small 2-surface
gives the force applied to it. A fluid is called perfect
if the pressure it applies to a small surface element
with normal n is independent of n. Its stress tensor t,
a symmetric 2-tensor on Euclidean space, is then
invariant by rotations. By generalization, a relativistic
fluid is called perfect if its stress energy tensor
has the following form:

$T_{\alpha\beta} = \mu u_\alpha u_\beta + p(g_{\alpha\beta} + u_\alpha u_\beta)$

Then in a proper frame, where g takes the
Minkowskian values and the only nonvanishing
component of u is along the time axis and equal to 1,
the projection of T on space is the Newtonian
stress tensor with pressure p, while $\mu$, the projection
of T on the time axis, is the fluid energy density.
There is no momentum density in the proper frame.
The conservation laws, also called Euler equations,
are shown to split, as in the case of dust, into a
continuity equation

$\nabla_\alpha[(\mu + p)u^\alpha] - u^\alpha\partial_\alpha p = 0$

and equations of motion

$(\mu + p)u^\alpha\nabla_\alpha u^\beta + (g^{\alpha\beta} + u^\alpha u^\beta)\partial_\alpha p = 0$

In relativity, where mass and energy are equivalent, the
continuity equation is no longer a conservation law.


Equations of state As in Newtonian mechanics, the 
Euler equations must be completed by a relation, 
called equation of state, depending on the physical 
properties of the fluid. In general in addition to 
mechanics, thermodynamic properties must be con- 
sidered. In relativity, they are borrowed from 
classical thermodynamics formulated in a spacetime 
context. 

In the simplest cases one introduces a conserved
rest mass density r (or particle number density for
particles with rest mass zero), satisfying the equation

$\nabla_\alpha P^\alpha = 0$ with $P^\alpha \equiv r u^\alpha$

This r differs from the density of energy $\mu$. One sets
$\mu = r(1 + \varepsilon)$ and calls $\varepsilon$ the internal specific energy.
The first law of (reversible) thermodynamics is
extended to relativistic perfect fluids by the identity

$\Theta\,dS = d\varepsilon + p\,d(r^{-1})$

which defines both the absolute temperature $\Theta$
and the differential of the specific entropy
S. Modulo the continuity equation and the
thermodynamic identity, the matter conservation
is equivalent to the conservation of entropy along
the flow lines:

$\nabla_\alpha(rSu^\alpha) = 0$, hence $u^\alpha\partial_\alpha S = 0$
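
The equivalence between matter conservation and entropy conservation can be made explicit. The sketch below assumes the continuity equation of the perfect fluid in the form $\nabla_\alpha[(\mu + p)u^\alpha] - u^\alpha\partial_\alpha p = 0$, the definition $\mu = r(1 + \varepsilon)$, and the thermodynamic identity $\Theta\,dS = d\varepsilon + p\,d(r^{-1})$.

```latex
\begin{align*}
0 &= \nabla_\alpha[(\mu+p)u^\alpha] - u^\alpha\partial_\alpha p
   = u^\alpha\partial_\alpha\mu + (\mu+p)\nabla_\alpha u^\alpha \\
  &= \Bigl(1+\varepsilon+\frac{p}{r}\Bigr)\nabla_\alpha(r u^\alpha)
     + r\,u^\alpha\bigl[\partial_\alpha\varepsilon + p\,\partial_\alpha(r^{-1})\bigr] \\
  &= \frac{\mu+p}{r}\,\nabla_\alpha(r u^\alpha) + r\,\Theta\,u^\alpha\partial_\alpha S .
\end{align*}
```

Provided that $\mu + p$ and $r\Theta$ do not vanish, $\nabla_\alpha(r u^\alpha) = 0$ holds if and only if $u^\alpha\partial_\alpha S = 0$, and in that case $\nabla_\alpha(rSu^\alpha) = S\,\nabla_\alpha(ru^\alpha) + r\,u^\alpha\partial_\alpha S = 0$.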


The scalars p, $\mu$, S, r are not independent. Simple
situations can be modeled by an "equation of state"
linking these quantities. In astrophysics, one is inspired
by what is known from classical fluids, with additional
relativistic considerations. General relativity plays a
role in the case of strong gravitational field.

Very cold matter and nuclear matter are barotropic
fluids; they obey an equation of state of the
form $p = p(\mu)$.

When the energy $\mu$ is largely dominated by the
radiation energy, the fluid is called ultrarelativistic.
The Stefan-Boltzmann laws give $\mu = KT^4$ and
$p = (1/3)KT^4$, hence $p = (1/3)\mu$; the stress energy
tensor is traceless.

In white dwarfs, the fluid is considered as
polytropic: it obeys an equation of state of the
form $p = f(S)r^\gamma$. If only the internal energy $\varepsilon$ and
pressure p are dominated by radiation, then
$\varepsilon = Kr^{-1}T^4$ and $p = (1/3)KT^4$, hence $p = (1/3)r\varepsilon$.
The use of the thermodynamic identity leads to
$\gamma = 4/3$, $p = (K/3)(3S/4K)^{4/3}r^{4/3}$, with $\mu = 3p + r$.
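
The polytropic law with exponent 4/3 can be checked numerically. The Python sketch below assumes that integrating the thermodynamic identity for the radiation-dominated case gives the specific entropy $S = 4KT^3/(3r)$; this intermediate relation is an assumption of the sketch, not a formula quoted in the text, and the function names and sample values are illustrative.

```python
def radiation_state(K: float, S: float, r: float):
    """Temperature, specific internal energy and pressure for given S and r.

    Assumes S = 4 K T^3 / (3 r), the entropy obtained by integrating
    T dS = d(eps) + p d(1/r) with eps = K T^4 / r and p = K T^4 / 3.
    """
    T = (3.0 * S * r / (4.0 * K)) ** (1.0 / 3.0)
    eps = K * T ** 4 / r
    p = K * T ** 4 / 3.0
    return T, eps, p


def polytropic_pressure(K: float, S: float, r: float) -> float:
    """Closed form p = (K/3) (3S/4K)^(4/3) r^(4/3) quoted in the text."""
    return (K / 3.0) * (3.0 * S / (4.0 * K)) ** (4.0 / 3.0) * r ** (4.0 / 3.0)


K, S, r = 1.0, 2.0, 3.0             # arbitrary illustrative values
T, eps, p = radiation_state(K, S, r)
mu = r * (1.0 + eps)                # energy density mu = r (1 + eps)

# Finite-difference check of the thermodynamic identity at fixed r:
# the derivative of eps with respect to S should equal the temperature T.
h = 1e-6
deps_dS = (radiation_state(K, S + h, r)[1] - radiation_state(K, S - h, r)[1]) / (2.0 * h)

print(p, polytropic_pressure(K, S, r))
print(mu, 3.0 * p + r)
print(deps_dS, T)
```

Each printed pair should agree to rounding or finite-difference accuracy, which illustrates that the quoted pressure law and the relation between energy density, pressure and rest mass density are mutually consistent under the stated assumption.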

For most other stars, the physical situation is too 
complex to be modeled by a simple equation; only 
tables of numerical values may be available. 

In cosmology, there is little physical informa- 
tion about the fluid which is to represent the 
energy content of the universe. It is assumed that 
in the early universe of the big-bang models, at 
very high temperature, the fluid was ultrarelati- 
vistic. At later times, it is generally assumed, for 
simplicity, that there is an equation of state linear 
and independent of entropy, $p = (\gamma - 1)\mu$. In order
that the speed of sound waves be not greater than
the speed of light, one assumes that $1 \leq \gamma \leq 2$;
$\gamma = 1$ corresponds to dust, $\gamma = 2$ to a stiff (see
below) fluid.

Recent confrontations of theory and observations
seem to imply the existence of a new, not directly
seen, type of matter, called "dark matter."


Wave fronts and propagation speeds The wave 
fronts of a differential system are the submani- 
folds of spacetime whose normals annul the 
characteristic determinant. Discontinuities propa- 
gate along wave fronts. For a hyperbolic system, 
the wave fronts determine the domain of depen- 
dence of a solution. For a perfect fluid, they are 
found to be 


1. the matter wave fronts, generated by the flow
lines, such that $u^\alpha n_\alpha = 0$ and
2. the sound wave fronts, whose normals satisfy the
equation

$D \equiv (p'_\mu - 1)(u^\alpha n_\alpha)^2 + p'_\mu n^\alpha n_\alpha = 0$

where $p'_\mu$ denotes the derivative of p with respect to $\mu$.
In a proper frame at a point of spacetime, where
$u^\alpha = \delta^\alpha_0$ and $g_{\alpha\beta} = \eta_{\alpha\beta}$, this equation states that the
slope of the spacetime normal to the wave front
can be written as

$\left(\sum_i n_i^2 / n_0^2\right)^{1/2} = 1/\sqrt{p'_\mu}$

The sound propagation speed is the inverse of this
slope, that is, $v = \sqrt{p'_\mu}$. It is less than the speed of
light, as expected from a relativistic theory, if $p'_\mu < 1$.
The limiting case where these speeds are equal is
called incompressible or stiff fluid.
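
For the linear equation of state in which the pressure equals $(\gamma - 1)$ times the energy density, the sound speed follows at once from the formula above. The Python sketch below is an illustration in units where the speed of light is 1; the function name is hypothetical.

```python
import math


def sound_speed(gamma: float) -> float:
    """Sound speed sqrt(dp/dmu) for the linear law p = (gamma - 1) * mu, with c = 1."""
    dp_dmu = gamma - 1.0
    if dp_dmu < 0.0:
        raise ValueError("gamma < 1 gives dp/dmu < 0: no real sound speed")
    return math.sqrt(dp_dmu)


# Dust, ultrarelativistic fluid, and stiff fluid.
speeds = {gamma: sound_speed(gamma) for gamma in (1.0, 4.0 / 3.0, 2.0)}
for gamma, v in speeds.items():
    print(f"gamma = {gamma:.3f}: v = {v:.3f}")
```

The sound speed stays at or below the speed of light exactly when the exponent does not exceed 2, with equality for the stiff fluid.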


Hyperbolicity, existence, and uniqueness theorem 
The characteristics of the perfect fluid equations are 
real, but the apparent multiplicity of the matter 
wave fronts poses a problem for the hyperbolicity of 
the relativistic Euler equations, even in a given 
background metric. However, Choquet-Bruhat has 
proven that this system is a hyperbolic Leray system 
as well as its coupling with the Einstein equations, 
for instance, in wave gauge. The following theorem 
can then be proved using the general theorem on 
hyperbolic systems and an extension of the method 
used for Einstein’s equations in vacuum. 


Theorem Let $(M, \bar g, K)$ be an initial data set for the 
Einstein equations and $(\bar u, \bar\mu, \bar S)$ be Cauchy data in a 
local Sobolev space $H_s^{\mathrm{loc}}$, $s > 3$, on the 3-manifold $M$ 
for a perfect fluid with a smooth equation of state. 
Suppose $\bar\mu > 0$ and $p'_\mu < 1$. There exists a globally 
hyperbolic spacetime of maximal extension that is a solution of 
the Einstein equations with such a perfect fluid as source, 
taking these Cauchy data. Such a spacetime and fluid 
flow are smooth for smooth initial data. They are 
unique, up to spacetime isometries. 


The Euler equations have also been written as a 
first-order symmetric hyperbolic system by Boillat, 
Ruggeri, and Strumia using general methods relying 
on the existence of a convex functional, and directly 
by Rendall, who pointed out the difficulty of 
modeling the general motion of isolated fluid bodies, 
because of the assumption $\bar\mu > 0$. He constructed 
some solutions without this assumption where the 
boundaries are freely falling. The general problem of 
determining the evolution of boundaries appears 
everywhere in general relativity, and in classical 
mechanics. 


Global problems The spacetimes obtained above 
are, in general, incomplete: even in Minkowski space- 
time, the Euler equations do not in general have 
solutions that are global in time. Shocks appear in 
relativistic perfect fluids as in classical ones. Global 
existence results have been obtained for four- 
dimensional ultrarelativistic fluids (limited data), and 
in the case of 1-space dimension. A detailed study of the 
global behavior of spherically symmetric solutions of 
the Einstein-Euler equations with equation of state 
admitting a phase transition from zero pressure to stiff 
fluid has been done by Christodoulou. 


Dissipative Fluids 


A general fluid stress energy tensor is, with $u$ a unit 
vector whose trajectories are the flow lines: 


$T^{\alpha\beta} = \mu u^\alpha u^\beta + q^\alpha u^\beta + q^\beta u^\alpha + \Theta^{\alpha\beta}$ 


with $q^\alpha u_\alpha = 0$, $\Theta^{\alpha\beta} u_\alpha = 0$. 


$\mu = T^{\alpha\beta} u_\alpha u_\beta$ is the energy density, which must satisfy 
$\mu \ge 0$; $\Theta$ is a space tensor representing the stresses, 
orthogonal to $u$; and $q$ is a space vector considered as 
a heat flow. The fundamental equations are still 
$\nabla_\alpha T^{\alpha\beta} = 0$, but they must be supplemented by 
constitutive equations for $q$ and $\Theta$, for which no 
simple satisfactory choice is known in a relativistic 
context. The transfer of results from classical 
mechanics on viscous fluids or on heat transfer 
leads to propagation speeds greater than the speed 
of light. It should be remarked that these classical 
equations are obtained as governing asymptotic 
states; thus, the parabolic character of their relativis- 
tic version does not contradict relativistic causality. 
However, it would be interesting to obtain, for 
dissipative relativistic fluids, hyperbolic dissipative 
equations. Various systems have been proposed, in 
particular, by Marle by using an approximation near 
equilibrium of a solution of the relativistic Boltzmann 
equation. A promising system, also inspired by 
kinetic theory, is the "extended thermodynamics" of 
Müller and Ruggeri, which takes as its 14 fundamental 
unknowns the vector $P = ru$ and the tensor $T$, 
satisfying the conservation laws. These equations are 
supplemented by equations linking a totally symmetric 
3-tensor $A$ with a symmetric 2-tensor $I$, of the form 


$\nabla_\alpha A^{\alpha\beta\gamma} = I^{\beta\gamma}$ [1] 


$A$ and $I$ are functions of $P$ and $T$ that depend on the 
model; these relations are called constitutive equations. The system 
is shown to be symmetric hyperbolic provided a convex 
entropy function exists, a property 
which holds under appropriate physical 
assumptions. 


Reasonable equations have been proposed and 
studied for several constituent fluids and 
superfluids. 


Charged Fluids 


The stress energy tensor of a charged fluid with 
electric (or Yang—Mills) charge is generally the sum 
of the stress energy tensor of the fluid and of the 
Maxwell (or Yang-Mills) field. This tensor is 
conserved modulo the Maxwell (or Yang-Mills) 
equations with source the electric current, and the 
Euler equations completed by the Lorentz force. The 
corresponding Einstein-Maxwell perfect fluid sys- 
tem is well posed in the case of zero or infinite 
conductivity (magnetohydrodynamics). A subtlety 
appears in the case of finite conductivity: the system 
is still well posed, but for a restricted (Gevrey) class 
of $C^\infty$ fields. 


Kinetic Models 
Distribution Function and Moments 


A general relativistic kinetic theory can be formu- 
lated without appeal to classical mechanics or 
special relativity. The matter is composed of 
particles whose size is negligible in the considered 
scale: rarefied gases in the laboratory, galaxies or 
even clusters of galaxies at the cosmological scale. 
The number of particles is so great and their motion 
so chaotic that the state of the matter can be 
described by a “one-particle distribution function,” 
a positive scalar function on the tangent bundle to 
the spacetime, $(x, p) \mapsto f(x, p)$, which gives the mean 
number of particles with momentum $p$ present at the 
point $x$ of spacetime. 

The first moment of $f$ is a causal vector field $P$ 
defined by the integral over the space $\mathcal{P}_x$ of 
momenta at $x$, with $\omega_p$ a volume element in that 
space: 


$P^\alpha(x) := \int_{\mathcal{P}_x} p^\alpha f(x, p)\, \omega_p$ 


Out of the first moment, one extracts a scalar $r \ge 0$, 
with $r^2 := -g(P, P)$ interpreted as the square of a proper 
mass density, and, if $r > 0$, a unit vector 
$u = r^{-1} P$ interpreted as the macroscopic flow 
velocity. 

The second moment of the distribution function f 
is the symmetric 2-tensor on spacetime given by 


$T(x) := \int_{\mathcal{P}_x} f(x, p)\, p \otimes p\, \omega_p$ 


It is interpreted as the stress energy tensor of the 
distribution f. Higher moments are defined similarly. 
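
The definitions of the first and second moments can be illustrated with a toy calculation. The sketch below is an assumption-laden simplification: it works in flat spacetime with signature $(-, +, +, +)$, replaces the integral over momentum space by a finite weighted sum over sample momenta, and uses hypothetical helper names `moments` and `flow_velocity`.

```python
import math

# Diagonal of the flat metric with signature (-, +, +, +), so g(P, P) = -r^2.
ETA = [-1.0, 1.0, 1.0, 1.0]


def moments(momenta, weights):
    """Discrete stand-ins for the first moment P^a and second moment T^{ab}.

    `momenta` is a list of 4-momenta (p^0, p^1, p^2, p^3); `weights` plays the
    role of f(x, p) times the volume element at each sample momentum.
    """
    pairs = list(zip(momenta, weights))
    P = [sum(w * p[a] for p, w in pairs) for a in range(4)]
    T = [[sum(w * p[a] * p[b] for p, w in pairs) for b in range(4)]
         for a in range(4)]
    return P, T


def flow_velocity(P):
    """Return (r, u) with r^2 = -g(P, P) and u = P / r, assuming r > 0."""
    r_squared = -sum(ETA[a] * P[a] * P[a] for a in range(4))
    if r_squared <= 0.0:
        raise ValueError("P is not timelike, so no unit flow velocity exists")
    r = math.sqrt(r_squared)
    return r, [component / r for component in P]


# Two unit-mass particles with opposite spatial momenta: the gas is at rest.
sample_momenta = [(1.25, 0.75, 0.0, 0.0), (1.25, -0.75, 0.0, 0.0)]
P, T = moments(sample_momenta, [1.0, 1.0])
r, u = flow_velocity(P)
```

In this example the spatial parts of the two momenta cancel in $P$, so the flow velocity is the rest-frame vector, while $T$ retains a nonzero pressure-like component $T^{11}$.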

Liouville-Vlasov Equation 


When the gas is so rarefied that the particle 
trajectories do not cross, then in the absence of 
nongravitational forces, these trajectories are geodesics 
of $g$, orbits in $TV$ of the vector field 
$X = (p^\alpha, Q^\alpha = -\Gamma^\alpha_{\lambda\mu} p^\lambda p^\mu)$, with $\Gamma^\alpha_{\lambda\mu}$ the Christoffel 
symbols of $g$. 

In a collisionless model, the physical law of 
conservation of particles imposes the conservation 
of f along the trajectories of X, that is, the Liouville- 
Vlasov equation 

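$\mathcal{L}_X f \equiv p^\alpha \frac{\partial f}{\partial x^\alpha} + Q^\alpha \frac{\partial f}{\partial p^\alpha} = 0$ 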


Conservation laws If f satisfies the Vlasov equa- 
tion, then all moments satisfy a conservation law, in 
particular, 


$\nabla_\alpha P^\alpha = 0$ and $\nabla_\alpha T^{\alpha\beta} = 0$ 


equations which make the Einstein-Vlasov system 
consistent. 

The theory extends without problem to particles 
having the same rest mass m, because the scalar 
$g(p, p) = -m^2$ is constant on a geodesic. 


Cauchy problem The Einstein-Vlasov system is an 
integro-differential system for $g$ and $f$ on a manifold 
$V = M \times \mathbb{R}$. The Cauchy data for the spacetime 
metric $g$ on $M_0 = M \times \{0\}$ is, as usual, a pair $(\bar g, K)$, 
supplemented with gauge initial data which complete 
the definition of Cauchy data for a well-posed 
hyperbolic system in the chosen gauge. The Cauchy 
data for $f$ are a function $\bar f$ on the bundle $\mathcal{P}_{M_0}$. It was 
proved long ago that there exists a solution, 
geometrically unique, in a neighborhood of $M_0$ if 
the data are in Sobolev spaces, weighted by a power 
of $p^0$ in the case of $\bar f$. 

Since the Vlasov matter model, solution of a 
linear equation for given g, has no singularity by 
itself, the Einstein-Vlasov system is a good candidate 
for solutions that are global in time. This global 
existence has been proved by Rein and Rendall in 
the case of small data, asymptotically flat with 
spherical symmetry or plane symmetry, or with 
hyperbolic symmetry and compact space. Global 
existence without these symmetries is an open 
problem. 


Boltzmann Equation 


When the particles undergo collisions, their trajectories 
in phase space are no longer connected integral 
curves of the vector field $X$; that is, their momentum 
undergoes a jump at the crossing of another 
trajectory. In the Boltzmann model, the derivative 
$\mathcal{L}_X f$ is equal to the so-called collision operator, $\mathcal{I}f$: 


$(\mathcal{L}_X f)(x, p) = (\mathcal{I}f)(x, p)$ 


where $\mathcal{I}f$ is an integral operator linked with the 
probability that two particles of momentum, respectively, 
$p'$ and $q'$, collide at $x$ and give, after the shock, 
two particles of momentum $p$ and $q$. For “elastic” 
shocks, the total momentum is conserved, that is, 
$p'$ and $q'$ lie in the submanifold $\Sigma_{pq} := \{p' + q' = 
p + q\}$, with volume element $\xi'$, and 


$(\mathcal{I}f)(x, p) = \int_{\mathcal{P}_x} \int_{\Sigma_{pq}} [f(x, p') f(x, q') - f(x, p) f(x, q)]\, A(x, p, q, p', q')\, \xi' \wedge \omega_q$ 


The function A(x, p, q, p', q') is called the shock 
cross section; it is a phenomenological quantity. No 
explicit expression is known for it in relativity. 
A generally admitted property is the reversibility of 
elastic shocks, $A(x, p, q, p', q') = A(x, p', q', p, q)$. 
It can be proved that under this hypothesis, the 
first and second moment of f are conserved as in the 
collisionless case, making the Einstein-Boltzmann 
system consistent. Existence of solutions (that are 
local in time) of the Cauchy problem for this system 
has long been known. No global existence for the 
coupled system is known yet. 

One defines, in a relativistic context, an entropy 
flux vector H which is proved to satisfy an 
H-theorem, that is, $\nabla_\alpha H^\alpha \ge 0$. In an expanding 
universe, for instance, Robertson-Walker, where $H$ 
depends only on time and an entropy density is 
defined by $H^0$, one finds that a decrease in entropy 
is linked with the expansion of the universe, thus 
permitting its ever-increasing organization from an 
initial anisotropy of f in momentum space. 


Other Matter Sources 
Elastic Media 


There are no solids in general relativity; in special 
relativity rigid motions are already very restricted. 
A theory of elastic deformations can only be defined 
relatively to some a priori given state of matter 
whose perturbations will satisfy laws analogous to 
the classical laws. Various such theories have been 
proposed through geometric considerations, extend- 
ing methods of classical elasticity; they have been 
used to predict the possible signals from bar 
detectors of gravitational waves, or the motions in 
the crust of neutron stars. A general theory 
constructed by Lagrangian formalism has recently 
been developed. 


Spinor Sources 


A symmetric stress energy tensor can be associated 
to classical spinors of spin 1/2, leading to a well- 
posed Einstein-Dirac system. The theories of super- 
gravity couple the Einstein-Cartan equations with 
anticommuting spin 3/2 sources. 


See also: Boltzmann Equation (Classical and Quantum); 
Einstein Equations: Exact Solutions; Einstein Equations: 
Initial Value Formulation; General Relativity: Overview; 
Geometric Analysis and General Relativity; Kinetic 
Equations; Spinors and Spin Coefficients. 

Introduction 


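Classical electromagnetism is described by Maxwell's 
equations, which, in 3-vector notation and 
corresponding respectively to the laws of Coulomb, 
Ampère, Gauss, and Faraday, are given by eqns 
[1a]-[1d]: 


$\operatorname{div} \mathbf{E} = \rho$ [1a] 
$\operatorname{curl} \mathbf{B} - \partial\mathbf{E}/\partial t = \mathbf{J}$ [1b] 
$\operatorname{div} \mathbf{B} = 0$ [1c] 
$\operatorname{curl} \mathbf{E} + \partial\mathbf{B}/\partial t = 0$ [1d] 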


Equivalently, in covariant 4-vector notation, these 
correspond to eqns [2a] and [2b]: 


$\partial_\nu F^{\mu\nu} = -j^\mu$ [2a] 
$\partial_\nu {}^*F^{\mu\nu} = 0$ [2b] 


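In eqns [1], $\mathbf{E}$ and $\mathbf{B}$ are the electric and magnetic 
fields, respectively, $\rho$ is the electric charge density, 
and $\mathbf{J}$ is the electric current. In eqns [2], $F_{\mu\nu}$ is the 
field tensor, ${}^*F_{\mu\nu}$ the dual field tensor, and $j^\mu$ is the 
4-current, related to the previous vector quantities 
by the following relations: 


$F_{\mu\nu} = \begin{pmatrix} 0 & E_1 & E_2 & E_3 \\ -E_1 & 0 & -B_3 & B_2 \\ -E_2 & B_3 & 0 & -B_1 \\ -E_3 & -B_2 & B_1 & 0 \end{pmatrix}$ 

${}^*F_{\mu\nu} = \begin{pmatrix} 0 & B_1 & B_2 & B_3 \\ -B_1 & 0 & E_3 & -E_2 \\ -B_2 & -E_3 & 0 & E_1 \\ -B_3 & E_2 & -E_1 & 0 \end{pmatrix}$ 

$j^\mu = (\rho, \mathbf{J})$ 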


Throughout this article, we shall denote the three 
spatial indices by lower-case Latin letters such as $i$, $j$, 
while Greek indices such as $\mu$, $\nu$ denote spacetime 
indices running through 0, 1, 2, 3. The Einstein 
summation convention is used, whereby repeated 
indices are summed. Spacetime indices are raised 
and lowered by the (flat) Minkowski metric 
$\eta_{\mu\nu} = \mathrm{diag}(1, -1, -1, -1)$. We also use units conventional 
in particle physics, in which the reduced 
Planck constant $\hbar$ and the speed of light $c$ are both 
set to 1. 

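In terms of the totally skew symmetric symbol 
$\epsilon_{\mu\nu\rho\sigma}$ (with $\epsilon_{0123} = 1$), the two field tensors are 
related by eqn [3]: 


${}^*F_{\mu\nu} = -\frac{1}{2} \epsilon_{\mu\nu\rho\sigma} F^{\rho\sigma}$ [3] 


We say that ${}^*F_{\mu\nu}$ is the dual of $F_{\mu\nu}$, and eqn [3] is 
indeed a duality relation because eqn [4] holds, 
which means that up to a sign, $F_{\mu\nu}$ and ${}^*F_{\mu\nu}$ are 
duals of each other: 


${}^*({}^*F) = -F$ [4] 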


This duality is in fact the Hodge duality between 
$p$-forms and $(n - p)$-forms in an $n$-dimensional 
space. In our particular case, $p = 2$ and $n = 4$, so 
that both F and its dual are 2-forms. The minus sign 
in eqn [4] comes about because of the Lorentzian (or 
pseudo-Riemannian) signature of Minkowski 
spacetime. 
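
The sign in the double-dual relation can be checked numerically. The sketch below assumes the conventions ${}^*F_{\mu\nu} = -\frac{1}{2}\epsilon_{\mu\nu\rho\sigma}F^{\rho\sigma}$ with $\epsilon_{0123} = 1$, the metric $\mathrm{diag}(1, -1, -1, -1)$, and the component layout $F_{0i} = E_i$, $F_{12} = -B_3$, $F_{13} = B_2$, $F_{23} = -B_1$; the helper names are hypothetical.

```python
# Diagonal of the Minkowski metric diag(1, -1, -1, -1).
ETA = [1.0, -1.0, -1.0, -1.0]


def levi_civita(indices):
    """Totally antisymmetric symbol with levi_civita((0, 1, 2, 3)) = +1."""
    if len(set(indices)) < 4:
        return 0.0
    inversions = sum(
        1 for i in range(4) for j in range(i + 1, 4) if indices[i] > indices[j]
    )
    return -1.0 if inversions % 2 else 1.0


def field_tensor(E, B):
    """Covariant components F_{mu nu} built from 3-vectors E and B."""
    E1, E2, E3 = E
    B1, B2, B3 = B
    return [
        [0.0, E1, E2, E3],
        [-E1, 0.0, -B3, B2],
        [-E2, B3, 0.0, -B1],
        [-E3, -B2, B1, 0.0],
    ]


def hodge_dual(F):
    """Return *F_{mu nu} = -(1/2) eps_{mu nu rho sigma} F^{rho sigma}."""
    # Raise both indices with the diagonal metric.
    F_up = [[ETA[r] * ETA[s] * F[r][s] for s in range(4)] for r in range(4)]
    return [
        [
            -0.5 * sum(
                levi_civita((m, n, r, s)) * F_up[r][s]
                for r in range(4) for s in range(4)
            )
            for n in range(4)
        ]
        for m in range(4)
    ]


E = (1.0, 2.0, 3.0)
B = (4.0, 5.0, 6.0)
F = field_tensor(E, B)
dual = hodge_dual(F)
double_dual = hodge_dual(dual)
```

With these conventions the dual is the field tensor built from $(\mathbf{B}, -\mathbf{E})$, and applying the dual twice returns $-F$, which is the Lorentzian-signature minus sign mentioned above.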

The physical significance of this duality is that 
such a symmetry interchanges electric and magnetic 
fields (again up to sign) (eqn [5]), as can be seen 
from the matrix representation of $F_{\mu\nu}$ and ${}^*F_{\mu\nu}$ 
above: 


${}^*: \quad E_i \mapsto B_i, \quad B_i \mapsto -E_i$ [5] 


Now in the absence of electric charges and 
currents, one sees immediately that Maxwell’s 
equations [1] or [2] are dual symmetric. This 
means that, in vacuo, whether we call an electro- 
magnetic field electric or magnetic is a matter of 
convention. As far as the dynamics is concerned, 
there is no distinction. 

On the other hand, eqns [1] and [2] as presented, 
that is, in the presence of matter, are manifestly not 
dual symmetric. The underlying reason for this 
asymmetry has been much studied both in physics 
and in mathematics. One of the two questions that 
this article addresses is precisely this. Following on 
this, we shall see what happens if we try somehow 
to restore this dual symmetry even in the presence of 
matter. 

The second question that we wish to discuss is a 
generalization of this duality. Electromagnetism is a 
gauge theory, in which the gauge group is the 
abelian circle group U(1), representing the phase of 
wave-functions in quantum mechanics. A physically 
relevant generalization, in which the abelian U(1) is 
replaced by a nonabelian group (e.g., SU(2), SU(3)) 
is called Yang-Mills theory (Yang and Mills 1954), 
which is the theoretical basis of all modern particle 
physics. We shall show in this article how the 
concept of electric-magnetic duality can be general- 
ized in the context of Yang-Mills theory. 


Gauge Invariance, Sources, 
and Monopoles 


Electric-magnetic duality, whether in the well-known 
abelian case or in the still somewhat open nonabelian 
case, is intimately connected with gauge invariance, 
sources, and monopoles, and also the dynamics as 
embodied in the gauge action. These questions in 
turn find their natural setting in differential geometry, 
particularly the geometry of fibre bundles. 

Although classical electrodynamics can be fully 
described by the field tensor $F_{\mu\nu}$, one needs to 
introduce the electromagnetic (or gauge) potential 
$A_\mu$ if one considers quantum mechanics, as has 
been beautifully demonstrated by the Bohm-Aharonov 
experiment. The two quantities are 
related by eqn [6]: 


$F_{\mu\nu}(x) = \partial_\nu A_\mu(x) - \partial_\mu A_\nu(x)$ [6] 


The fact that the phase of a wave function $\psi(x)$ (e.g., 
of the electron) is not a measurable quantity 
(although relative phases of course are) implies that 
we are free to make the following transformation: 


$\psi(x) \mapsto e^{ie\Lambda(x)} \psi(x)$ [7] 


This in turn implies an unobservable transformation 
[8] on the gauge potential, where $\Lambda(x)$ is a real-valued 
function on spacetime: 


$A_\mu(x) \mapsto A_\mu(x) + \partial_\mu \Lambda(x)$ [8] 


This invariance is called gauge invariance. Since in 
this abelian case $F_{\mu\nu}$ is gauge invariant, so are the 
Maxwell equations, for which we shall take from 
now on the covariant form [2]. Inasmuch as the 
Maxwell equations dictate the dynamics of electro- 
magnetism, gauge invariance is an intrinsic ingredi- 
ent even in the classical theory. 
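
The gauge invariance of the abelian field tensor can be verified in one line. The following worked step assumes the index convention $F_{\mu\nu} = \partial_\nu A_\mu - \partial_\mu A_\nu$ and the transformation $A_\mu \mapsto A_\mu + \partial_\mu \Lambda$, and writes $F'_{\mu\nu}$ (a label introduced here) for the transformed tensor:

```latex
\begin{align*}
F'_{\mu\nu}
  &= \partial_\nu (A_\mu + \partial_\mu \Lambda)
   - \partial_\mu (A_\nu + \partial_\nu \Lambda) \\
  &= \partial_\nu A_\mu - \partial_\mu A_\nu
   + (\partial_\nu \partial_\mu - \partial_\mu \partial_\nu) \Lambda \\
  &= F_{\mu\nu}.
\end{align*}
```

The last step uses only the symmetry of mixed partial derivatives of a smooth $\Lambda$.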

In Yang-Mills theory, the U(1) phase $e^{ie\Lambda(x)}$ is 
replaced by an element $S(x)$ of a nonabelian group 
$G$, so that eqns [7], [8], and [6] become, respectively, 
eqns [9], [10], and [11]: 


$\psi(x) \mapsto S(x) \psi(x)$ [9] 


$A_\mu(x) \mapsto S(x) A_\mu(x) S^{-1}(x) - \frac{i}{g} \partial_\mu S(x)\, S^{-1}(x)$ [10] 


$F_{\mu\nu}(x) = \partial_\nu A_\mu(x) - \partial_\mu A_\nu(x) + ig[A_\mu(x), A_\nu(x)]$ [11] 


Here the electric coupling $e$ is replaced by a general 
gauge coupling $g$. The quantities $A_\mu$ and $F_{\mu\nu}$ now take 
values in the Lie algebra of the Lie group $G$ and the 
bracket is the Lie bracket. The wave function $\psi(x)$ 
takes values in a vector space on which an appropriate 
representation of $G$ acts. Notice that now the field 
tensor $F_{\mu\nu}$ is no longer invariant, but only covariant: 


$F_{\mu\nu}(x) \mapsto S(x) F_{\mu\nu}(x) S^{-1}(x)$ [12] 


Next we consider the charges of gauge theory. For 
the moment, we wish to distinguish between two 
types of charges: sources and monopoles. These are 
defined with respect to the gauge field, which in turn 
is derivable from the gauge potential. 

Source charges are those charges that give rise to a 
nonvanishing divergence of the field. For example, the 
electric current $j^\mu$ due to the presence of the electric charge 
$e$ occurs on the right-hand side of the first Maxwell 
equation, and is given in the quantum case by eqn [13], 
where $\gamma^\mu$ is a Dirac gamma matrix, identifiable as a basis 
element of the Clifford algebra over spacetime: 


$j^\mu = e \bar\psi \gamma^\mu \psi$ [13] 


In the Yang-Mills case, the first Maxwell equation 
is replaced by the Yang-Mills equation 


$D_\nu F^{\mu\nu} = -j^\mu, \qquad j^\mu = g \bar\psi \gamma^\mu \psi$ [14] 


We define the covariant derivative $D_\nu$ as in 


$D_\nu F^{\mu\nu} = \partial_\nu F^{\mu\nu} - ig[A_\nu, F^{\mu\nu}]$ [15] 


Monopole charges, on the other hand, are 
topological obstructions specified geometrically by 
nontrivial $G$-bundles over every 2-sphere $S^2$ surrounding 
the charge. They are classified by elements 
of $\pi_1(G)$, the fundamental group of $G$. They are 
typified by the (abelian) magnetic monopole as first 
discussed by Dirac in 1931. 

Let us go into a little more detail about the Dirac 
magnetic monopole. If the field tensor $F_{\mu\nu}$ does come 
from a gauge potential $A_\mu$ as in eqn [6], then simple 
algebra will tell us that this implies $\partial_\nu {}^*F^{\mu\nu} = 0$ as in 
eqn [2]. Hence, we conclude the following: 


$\exists$ monopole $\Longrightarrow$ $A_\mu$ cannot be well defined 
everywhere 


The result is actually stronger. Suppose there exists a 
magnetic monopole at a certain point in spacetime, 
and, without loss of generality, we shall consider a 
static monopole. If we surround this point by a 
(spatial) 2-sphere $\Sigma$, then the magnetic flux out of 
the sphere is given by 


$\iint_\Sigma \mathbf{B} \cdot d\boldsymbol{\sigma} = \iint_{\Sigma_N} \mathbf{B} \cdot d\boldsymbol{\sigma} + \iint_{\Sigma_S} \mathbf{B} \cdot d\boldsymbol{\sigma}$ [16] 


Here $\Sigma_N$ and $\Sigma_S$ are the northern and southern 
hemispheres overlapping on the equator $S$. By 
Stokes' theorem, since $F_{\mu\nu}$ has no components 
$F_{0i} = E_i$, we have 


$\iint_{\Sigma_N} \mathbf{B} \cdot d\boldsymbol{\sigma} = \oint_S \mathbf{A} \cdot d\mathbf{s}$ [17a] 
$\iint_{\Sigma_S} \mathbf{B} \cdot d\boldsymbol{\sigma} = \oint_{-S} \mathbf{A} \cdot d\mathbf{s}$ [17b] 

In eqn [17b], $-S$ means the equator with 
the opposite orientation. Hence, $\iint_\Sigma \mathbf{B} \cdot d\boldsymbol{\sigma} = \oint_S + \oint_{-S} = 0$. 
But this contradicts the assumption that there 
exists a magnetic monopole at the center of 
the sphere. Hence, we see that if a monopole 
exists, then $A_\mu$ will have at least a string of 
singularities leading out of it. This is the famous 
Dirac string. 

The more mathematically elegant way to describe 
this is that the principal bundle corresponding to 
electromagnetism with a magnetic monopole is 
nontrivial, so that the gauge potential $A_\mu$ has to be 
patched (i.e., related by transition functions in the 
overlap). Consider the example of a static monopole 
of magnetic charge $\tilde e$. For any (spatial) sphere $S_r$ of 
radius $r$ surrounding the monopole, we cover it with 
two patches $N$, $S$ as follows: 


(N): $0 \le \theta < \pi$, $0 \le \phi \le 2\pi$ 
(S): $0 < \theta \le \pi$, $0 \le \phi \le 2\pi$ 


In each patch we define the following: 


$A^{(N)}_1 = \frac{\tilde e\, y}{4\pi r(r + z)}, \quad A^{(N)}_2 = -\frac{\tilde e\, x}{4\pi r(r + z)}, \quad A^{(N)}_3 = 0$ 

$A^{(S)}_1 = -\frac{\tilde e\, y}{4\pi r(r - z)}, \quad A^{(S)}_2 = \frac{\tilde e\, x}{4\pi r(r - z)}, \quad A^{(S)}_3 = 0$ 


In the overlap (containing the equator), $A^{(N)}$ 
and $A^{(S)}$ are related by a gauge transformation: 


$A^{(S)}_\mu = A^{(N)}_\mu + \partial_\mu \Lambda = A^{(N)}_\mu - \frac{\tilde e}{2\pi} \partial_\mu \tan^{-1}(x/y)$ [18] 


Notice that $A^{(N)}$ has a line of singularity along 
the negative $z$-axis (which is the Dirac string 
in this case); similarly for $A^{(S)}$ along the positive 
$z$-axis. 

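Furthermore, the corresponding field strength is 
given by 


$\mathbf{E} = 0$ [19a] 
$\mathbf{B} = \frac{\tilde e\, \mathbf{r}}{4\pi r^3}$ [19b] 


If we now evaluate the “magnetic flux” out of $S_r$, we 
have 


$\iint_{S_r} \mathbf{B} \cdot d\boldsymbol{\sigma} = \oint_{\text{Equator}} (\mathbf{A}^{(N)} - \mathbf{A}^{(S)}) \cdot d\mathbf{s} = \tilde e$ [20] 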
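
The flux value can be checked by integrating the difference of the two patch potentials around the equator. This is a numerical sketch under stated assumptions: the covariant components are taken to be $A^{(N)}_1 = \tilde e y / 4\pi r(r+z)$, $A^{(N)}_2 = -\tilde e x / 4\pi r(r+z)$ and $A^{(S)}_1 = -\tilde e y / 4\pi r(r-z)$, $A^{(S)}_2 = \tilde e x / 4\pi r(r-z)$, the 3-vector components are minus the covariant ones (metric $\mathrm{diag}(1,-1,-1,-1)$), and all function names are hypothetical.

```python
import math


def a_north(x, y, z, g):
    """Covariant spatial components of the northern-patch potential."""
    r = math.sqrt(x * x + y * y + z * z)
    d = 4.0 * math.pi * r * (r + z)
    return (g * y / d, -g * x / d, 0.0)


def a_south(x, y, z, g):
    """Covariant spatial components of the southern-patch potential."""
    r = math.sqrt(x * x + y * y + z * z)
    d = 4.0 * math.pi * r * (r - z)
    return (-g * y / d, g * x / d, 0.0)


def equator_circulation(g, radius=1.0, steps=1000):
    """Midpoint-rule line integral of (A_N - A_S) . ds around the equator."""
    dphi = 2.0 * math.pi / steps
    total = 0.0
    for k in range(steps):
        phi = (k + 0.5) * dphi
        x, y = radius * math.cos(phi), radius * math.sin(phi)
        dx, dy = -y * dphi, x * dphi  # tangent displacement, counterclockwise
        a_n = a_north(x, y, 0.0, g)
        a_s = a_south(x, y, 0.0, g)
        # 3-vector components are minus the covariant ones.
        vx = -(a_n[0] - a_s[0])
        vy = -(a_n[1] - a_s[1])
        total += vx * dx + vy * dy
    return total


flux = equator_circulation(g=2.0)
```

The result is independent of the radius, as expected for a flux through any sphere enclosing the monopole.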


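In other words, in the presence of a magnetic 
monopole, the second half of Maxwell's equations 
is modified according to eqn [21], with $\tilde j^\mu$ given by 
eqn [22]. 


$\operatorname{div} \mathbf{B} = \tilde\rho, \quad \operatorname{curl} \mathbf{E} + \partial\mathbf{B}/\partial t = -\tilde{\mathbf{J}} \quad \Longleftrightarrow \quad \partial_\nu {}^*F^{\mu\nu} = -\tilde j^\mu$ [21] 
$\tilde j^\mu = \tilde e\, \bar\psi \gamma^\mu \psi$ [22] 


Furthermore, the form of eqn [21] tells us that a 
monopole of the $F_{\mu\nu}$ field can also be considered as a 
source of the ${}^*F_{\mu\nu}$ field. The two descriptions are 
equivalent. 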

How are the charges $e$ and $\tilde e$ related? The gauge 
transformation $S = e^{ie\Lambda}$ relating $A^{(N)}$ and $A^{(S)}$ must 
be well defined; that is, if one goes round the 
equator once, $\phi = 0 \to 2\pi$, one should get the same 
$S$. This gives 


$e \tilde e = 2\pi n, \quad n \in \mathbb{Z}$ [23] 


In particular, the unit electric and magnetic charges 
are related by eqn [24], which is Dirac's quantization 
condition, 


$e \tilde e = 2\pi$ [24] 


So, in principle, just as in the electric case, where we 
could have charges $e, 2e, \ldots$, here we could also 
have magnetic charges of $\tilde e, 2\tilde e, \ldots$. In other words, 
both charges are quantized. 
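
The single-valuedness argument can be made concrete with a short check. The sketch assumes that the gauge function relating the two patches is $\Lambda = \tilde e \phi / 2\pi$ up to sign, so that the transition function is $S(\phi) = e^{i e \tilde e \phi / 2\pi}$; the function names are hypothetical.

```python
import cmath
import math


def transition_function(e, g, phi):
    """S(phi) = exp(i e Lambda) with Lambda = g * phi / (2 pi), g the magnetic charge."""
    return cmath.exp(1j * e * g * phi / (2.0 * math.pi))


def is_single_valued(e, g, tol=1e-9):
    """True if S returns to its starting value after one turn round the equator."""
    start = transition_function(e, g, 0.0)
    end = transition_function(e, g, 2.0 * math.pi)
    return abs(end - start) < tol


def allowed_magnetic_charge(e, n):
    """Magnetic charge satisfying e * g = 2 pi n for integer n."""
    return 2.0 * math.pi * n / e


unit_ok = is_single_valued(0.3, allowed_magnetic_charge(0.3, 1))
half_unit_ok = is_single_valued(0.3, 0.5 * allowed_magnetic_charge(0.3, 1))
```

A magnetic charge equal to half the Dirac unit makes $S$ change sign after one turn, which is why it is excluded.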

Another way to look at this is to consider the 
classification of principal bundles over $S^2$. The 
reason for these topological 2-spheres is that we 
are interested in enclosing a point charge. For a 
nontrivial bundle, the patching is given by a function 
$S$ defined in the overlap (the equator), in other words, a 
map $S^1 \to U(1)$. What this amounts to is a closed 
curve in the circle group U(1). Now, curves that can be 
continuously deformed into one another cannot give 
distinct fibre bundles, so that one sees easily that there 
exists a one-to-one correspondence: 


{principal U(1) bundles over $S^2$} 
$\updownarrow$ 
{homotopy classes of closed curves in U(1)} 


This last is $\pi_1(U(1)) \cong \mathbb{Z}$. Hence, we recover Dirac's 
quantization condition. 
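
The integer labelling a homotopy class of closed curves in U(1) is its winding number, which can be computed numerically by accumulating phase increments along the loop. This is an illustrative sketch with a hypothetical function name; it assumes the loop is sampled finely enough that consecutive phases differ by less than $\pi$.

```python
import cmath
import math


def winding_number(loop, samples=1000):
    """Winding number of a closed curve s -> loop(s) in U(1), s in [0, 2 pi].

    Sums the principal-value phase increments between consecutive samples.
    """
    total = 0.0
    previous = loop(0.0)
    for k in range(1, samples + 1):
        current = loop(2.0 * math.pi * k / samples)
        total += cmath.phase(current / previous)
        previous = current
    return round(total / (2.0 * math.pi))


triple = winding_number(lambda s: cmath.exp(3j * s))
reverse_double = winding_number(lambda s: cmath.exp(-2j * s))
trivial = winding_number(lambda s: 1.0 + 0.0j)
deformed = winding_number(lambda s: cmath.exp(1j * (s + 0.5 * math.sin(s))))
```

The last loop is a continuous deformation of the once-wound loop, and its winding number is unchanged, in line with the statement that deformable curves give the same bundle.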

So, for electromagnetism, there are two equivalent 
ways of defining the magnetic charge, as a source or 
as a monopole: 


1. $\partial_\nu {}^*F^{\mu\nu} = -\tilde j^\mu \neq 0$. 
2. An element of $\pi_1(U(1)) \cong \mathbb{Z}$. 


The same goes for the electric charge. We also note 
that both definitions give us the fact that these 
charges are discrete (quantized) and conserved 
(invariant under continuous deformations). 

We now want to apply similar considerations 
to the magnetic charges in the nonabelian case. 
For several (subtle) reasons the obvious expression 
$D_\nu {}^*F^{\mu\nu} = -\tilde j^\mu$ as a source (see Table 1) does not 
work. The quickest way to say this is that ${}^*F^{\mu\nu}$ in 
general has no corresponding potential $\tilde A_\mu$, and so is 
not a gauge field. Moreover, in contrast to the 
abelian case, the field tensor does not fully 
specify the physical field configuration, as demon- 
strated by Wu and Yang. We shall come back to 
this later. 

But we have just seen that in the abelian case 
there is another equivalent definition, which is 
that a magnetic monopole is given by the gauge 
configuration corresponding to a nontrivial U(1) 
bundle over $S^2$. This can be generalized to the 
nonabelian case without any problem. Moreover, 
this definition automatically guarantees that a 
nonabelian monopole charge is quantized and 
conserved. This is the way monopoles are defined 
above. 

Arguments similar to the abelian case easily yield 
the nonabelian analog of the Dirac quantization 
condition, eqn [25], the difference between the two 
cases being only a matter of conventional 
normalization. 


$g \tilde g = 4\pi$ [25] 


Table 1 Definitions of charges 


             Sources                               Monopoles 
Abelian      $\partial_\nu F^{\mu\nu} = -j^\mu$    $\partial_\nu {}^*F^{\mu\nu} = -\tilde j^\mu$ 
Nonabelian   $D_\nu F^{\mu\nu} = -j^\mu$           ? 


Abelian Duality and the Wu-Yang 
Criterion 


We saw above the well-known fact that classical 
Maxwell theory is invariant under the duality opera- 
tor. By this we mean that at any point in spacetime 
free of electric and magnetic charges we have the two 
dual symmetric Maxwell equations: 


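$\partial_\nu {}^*F^{\mu\nu} = 0 \quad [dF = 0]$ [26] 


$\partial_\nu F^{\mu\nu} = 0 \quad [d\,{}^*F = 0]$ [27] 

Displayed in square brackets are the equivalent 
equations in the language of differential forms. Then 
by the Poincaré lemma we deduce immediately the 
existence of potentials $A$ and $\tilde A$ such that eqns [28] 
and [29] hold: 


$F_{\mu\nu}(x) = \partial_\nu A_\mu(x) - \partial_\mu A_\nu(x) \quad [F = dA]$ [28] 


${}^*F_{\mu\nu}(x) = \partial_\nu \tilde A_\mu(x) - \partial_\mu \tilde A_\nu(x) \quad [{}^*F = d\tilde A]$ [29] 


The two potentials transform independently under 
independent gauge transformations $\Lambda$ and $\tilde\Lambda$: 


$A_\mu(x) \mapsto A_\mu(x) + \partial_\mu \Lambda(x)$ [30] 


$\tilde A_\mu(x) \mapsto \tilde A_\mu(x) + \partial_\mu \tilde\Lambda(x)$ [31] 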


This means that the full symmetry of this theory is 
doubled to $U(1) \times \tilde U(1)$, where the tilde on the 
second circle group indicates that it is the symmetry 
of the dual potential $\tilde A$. It is important to note that 
the physical degrees of freedom remain the same. 
This is clear because F and *F are related by an 
algebraic equation [3]. As a consequence, the 
physical theory is the same: the doubled gauge 
symmetry is there all the time but is just not so 
readily detected. 

As mentioned in the Introduction, this dual 
symmetry means that what we call “electric” or 
“magnetic” is entirely a matter of choice. 

In the presence of electric charges, the Maxwell 
equations usually appear as 


$\partial_\nu {}^*F^{\mu\nu} = 0$ [32] 
$\partial_\nu F^{\mu\nu} = -j^\mu$ [33] 


The apparent asymmetry in these equations comes 
from the experimental fact that there is only one 
type of charge observed in nature, which we choose 
to regard as a source of the field $F$ (or, equivalently 
but unconventionally, as a monopole of the field ${}^*F$). 
But as we see by dualizing eqns [32] and [33], 
that is, by interchanging the role of electricity and 
magnetism in relation to $F$, we could equally have 
thought of these instead as source charges of 
the field ${}^*F$ (or, similarly to the above, as monopoles 
of $F$): 


$\partial_\nu {}^*F^{\mu\nu} = -\tilde j^\mu$ [34] 
$\partial_\nu F^{\mu\nu} = 0$ [35] 


If both electric and magnetic charges existed in 
nature, then we would have the dual symmetric pair: 


$\partial_\nu F^{\mu\nu} = -j^\mu$ [36] 
$\partial_\nu {}^*F^{\mu\nu} = -\tilde j^\mu$ [37] 


This duality in fact goes much deeper, as can be 
seen if we use the Wu-Yang criterion to derive the 
Maxwell equations. What we present here is not the 
textbook derivation of the Maxwell equations from 
an action, but we 
consider this method to be much more intrinsic and 
geometric. Consider first pure electromagnetism. 
The free Maxwell action is given by 


$\mathcal{A}^0_F = -\frac{1}{4} \int F_{\mu\nu} F^{\mu\nu}$ [38] 


The true variables of the (quantum) theory are the 
$A_\mu$, so in eqn [38] we should put in a constraint to 
say that $F_{\mu\nu}$ is the curl of $A_\mu$ [28]. This can be 
viewed as a topological constraint, because it is 
precisely equivalent to [26]. Using the method of 
Lagrange multipliers, we form the constrained 
action 


$\mathcal{A} = \mathcal{A}^0_F + \int \lambda_\mu (\partial_\nu {}^*F^{\mu\nu})$ [39] 


We can now vary this with respect to $F_{\mu\nu}$, obtaining 
eqn [40], which implies [27]: 


$F^{\mu\nu} = 2 \epsilon^{\mu\nu\rho\sigma} \partial_\rho \lambda_\sigma$ [40] 


Moreover, the Lagrange multiplier $\lambda$ is exactly the 
dual potential $\tilde A$. 

This derivation is entirely dual symmetric, since 
we can equally well use [27] as constraint for the 
action $\mathcal{A}^0_F$, now considered as a functional of ${}^*F^{\mu\nu}$ 
(eqn [41]), and obtain [26] as the equation of motion: 


$\mathcal{A}^0_F = \frac{1}{4} \int {}^*F_{\mu\nu}\, {}^*F^{\mu\nu}$ [41] 

This method applies to the interaction of charges 
and fields as well. In this case we start with the free 
field plus free particle action (eqn [42]), where we 
assume the free particle of mass $m$ to satisfy the Dirac 
equation, 


$\mathcal{A}^0 = \mathcal{A}^0_F + \int \bar\psi (i \partial_\mu \gamma^\mu - m) \psi$ [42] 

To fix ideas, let us regard this particle carrying an 
electric charge $e$ as a monopole of the potential $\tilde A_\mu$. 
Then the constraint we put in is [33], giving 


$\mathcal{A} = \mathcal{A}^0 + \int \tilde\lambda_\mu (\partial_\nu F^{\mu\nu} + j^\mu)$ [43] 


Variation with respect to ${}^*F$ gives eqn [32], and 
varying with respect to $\bar\psi$ gives 


$(i \partial_\mu \gamma^\mu - m) \psi = -e A_\mu \gamma^\mu \psi$ [44] 


So, the complete set of equations for a Dirac particle 
carrying an electric charge e in an electromagnetic 
field is [32], [33], and [44]. The duals of these 
equations will describe the dynamics of a Dirac 
magnetic monopole in an electromagnetic field. 

We see from this that the Wu-Yang criterion 
actually gives us an intuitively clear picture of 
interactions. The assertion that there is a monopole 
at a certain spacetime point x means that the gauge 
field on a 2-sphere surrounding x has to have a 
certain topological configuration (e.g., giving a 
nontrivial bundle of a particular class), and if the 
monopole moves to another point then the gauge 
field will have to rearrange itself so as to maintain 
the same topological configuration around the new 
point. There is thus naturally a coupling between the 
gauge field and the position of the monopole, or, in 
physical language, a topologically induced interac- 
tion between the field and the charge (Wu and 
Yang, 1976). Furthermore, this treatment of inter- 
action between field and matter is entirely dual 
symmetric. 

As a side remark, consider that although the 
action $\mathcal{A}^0_F$ is not immediately identifiable as geometric 
in nature, the Wu-Yang criterion, by putting 
the topological constraint and the equation of 
motion on equal (or dual) footing, suggests that in 
fact it is geometric in a subtle manner not yet fully 
understood. Moreover, as pointed out, eqn [40] says 
that the dual potential is given by the Lagrange 
multiplier of the constrained action. 


Nonabelian Duality Using Loop Variables 


The next natural step is to generalize this duality to 
the nonabelian Yang-Mills case. Although there is 
no difficulty in defining ${}^*F^{\mu\nu}$, which is again given 
by [3], we immediately come to difficulties in the 
relation between field and potential; for example, as 
in eqn [11], 


$F_{\mu\nu}(x) = \partial_\nu A_\mu(x) - \partial_\mu A_\nu(x) + ig[A_\mu(x), A_\nu(x)]$ 


First of all, despite appearances the Yang-Mills 
equation [45] (in the free-field case) and the Bianchi 
identity [46] are not dual-symmetric, because the 
correct dual of the Yang-Mills equation ought to be 
given by eqn [47], where $\tilde D_\nu$ is the covariant 
derivative corresponding to a dual potential: 


$D_\nu F^{\mu\nu} = 0$ [45] 
$D_\nu {}^*F^{\mu\nu} = 0$ [46] 
$\tilde D_\nu {}^*F^{\mu\nu} = 0$ [47] 


Secondly, the Yang-Mills equation, unlike its 
abelian counterpart [27], says nothing about 
whether the 2-form *F is closed or not. Nor is the 
relation [11] about exactness at all. In other words, 
the Yang-Mills equation does not guarantee the 
existence of a dual potential, in contrast to the 
Maxwell case. In fact, Gu and Yang have con- 
structed a counterexample. Because the true vari- 
ables of a gauge theory are the potentials and not 
the fields, this means that Yang-Mills theory is not 
symmetric under the Hodge star operation [3]. 

Nevertheless, electric-magnetic duality is a very 
useful physical concept, so one may wish to seek a 
more general duality transform $(\ )^{\sim}$, satisfying the 
following properties: 


1. $(\ )^{\sim\sim} = \pm(\ )$. 

2. Electric field $F_{\mu\nu}$ $\leftrightarrow$ magnetic field $\tilde F_{\mu\nu}$. 

3. Both $A_\mu$ and $\tilde A_\mu$ exist as potentials (away from 
charges). 

4. Magnetic charges are monopoles of $A_\mu$, and 
electric charges are monopoles of $\tilde A_\mu$. 

5. $\sim$ reduces to $*$ in the abelian case. 


One way to do this is to study the Wu—Yang 
criterion more closely. This reveals the concept of 
charges as topological constraints to be crucial 
even in the pure field case, as can be seen in 
Figure 1. The point to stress is that, in the above 
abelian case, the condition for the absence of a 
topological charge (a monopole) exactly removes 
the redundancy of the variables $F_{\mu\nu}$, and hence 
recovers the potential $A_\mu$. 


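[Figure 1: diagram with the headings "Geometry" and "Physics". Box labels: "$A_\mu$ exists as potential for $F_{\mu\nu}$ [$F = dA$]"; "Defining constraint [$dF = 0$]"; "Principal $A_\mu$ bundle trivial"; "No magnetic monopole $\tilde e$". Link labels: "Poincaré" and "Gauss".] 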


Now the nonabelian monopole charge was defined 
topologically as an element of $\pi_1(G)$, and this 
definition also holds in the abelian case of U(1), with 
$\pi_1(U(1)) = \mathbb{Z}$. So the first task is to write down a 
condition for the absence of a nonabelian monopole. 

To fix ideas, let us consider the group SO(3),
whose monopole charges are elements of $\mathbb{Z}_2$, which
can be denoted by a sign $\pm$. The vacuum charge (+)
(that is, no monopole) is represented by a closed
curve in the group manifold of even winding
number, and the monopole charge ($-$) by a closed
curve of odd winding number. It is more convenient,
however, to work in SU(2), which is the double
cover of SO(3) and which has the topology of $S^3$, as
sometimes it is useful to identify the fundamental
group of SO(3) with the center of SU(2) and hence
consider the monopole charge as an element of this
center. There the charge (+) is represented by a
closed curve, and the charge ($-$) by a curve that
winds an odd number of “half-times” round the
sphere $S^3$. Since these charges are defined by closed
curves, it is reasonable to try to write the constraint
in terms of loop variables. The treatment presented
below is not as rigorous as some others, but the
latter are not so well adapted to the problem in
hand. Furthermore, it is important to emphasize that
this approach aims to generalize electric-magnetic
duality to Yang-Mills theory in direct and close
analogy to duality in electromagnetism, without any
further symmetries with which it may be expedient
to enrich the theory. Other approaches are referred
to in the next section.
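The statement that a curve winding a “half-time” round SU(2) represents the nontrivial SO(3) charge can be checked numerically. The following is a minimal sketch, not taken from the text: it assumes the particular curve $\exp(-i\theta\sigma_3/2)$ and the standard covering map $R_{ij} = \frac{1}{2}\,\mathrm{tr}(\sigma_i U \sigma_j U^\dagger)$; the function names are illustrative only.

```python
import cmath
import math

# Pauli matrices as 2x2 nested lists of complex numbers.
PAULI = [
    [[0j, 1 + 0j], [1 + 0j, 0j]],
    [[0j, -1j], [1j, 0j]],
    [[1 + 0j, 0j], [0j, -1 + 0j]],
]


def matmul(a, b):
    """Product of two 2x2 complex matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Hermitian conjugate of a 2x2 complex matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def su2_rotation_z(theta):
    """SU(2) element exp(-i * theta * sigma_3 / 2) (assumed example curve)."""
    return [[cmath.exp(-0.5j * theta), 0j], [0j, cmath.exp(0.5j * theta)]]


def so3_image(u):
    """Image in SO(3): R_ij = (1/2) tr(sigma_i u sigma_j u^dagger)."""
    rot = []
    for i in range(3):
        row = []
        for j in range(3):
            m = matmul(PAULI[i], matmul(u, matmul(PAULI[j], dagger(u))))
            row.append(0.5 * (m[0][0] + m[1][1]).real)
        rot.append(row)
    return rot


# As theta runs from 0 to 2*pi the SU(2) curve goes from +1 to -1 (it is
# open in SU(2)), while its SO(3) image returns to the identity (it is
# closed in SO(3)).
end_su2 = su2_rotation_z(2 * math.pi)
end_so3 = so3_image(end_su2)
```

The end point of the SU(2) curve is the nontrivial element of the center, which is the sense in which the charge ($-$) can be read off as an element of the center of SU(2).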

Consider the gauge-invariant Dirac phase factor
(or holonomy) $\Phi(C)$ of a loop C, which can be
written symbolically as a path-ordered exponential:


$$\Phi[\xi] = P_s \exp\, ig \int_0^{2\pi} ds\, A_\mu(\xi(s))\,\dot{\xi}^\mu(s) \qquad [48]$$


In eqn [48], we parametrize the loop C as in eqn [49]
and a dot denotes differentiation with respect to the
parameter s.


$$C: \{\xi^\mu(s):\ s = 0 \to 2\pi,\ \xi(0) = \xi(2\pi) = \xi_0\} \qquad [49]$$


We thus regard loop variables in general as
functionals of continuous piecewise smooth functions
$\xi$ of s. In this way, loop derivatives and loop
integrals are just functional derivatives
and functional integrals. This means that loop
derivatives $\delta_\mu(s)$ are defined by a regularization
procedure approximating delta functions with
finite bump functions and then taking limits in a
definite order. For functional integrals, there exist
various regularization procedures, which are treated
elsewhere in this Encyclopedia.


Polyakov (1980) introduces the logarithmic loop
derivative of $\Phi[\xi]$:


$$F_\mu[\xi|s] = \frac{i}{g}\,\Phi^{-1}[\xi]\,\delta_\mu(s)\,\Phi[\xi] \qquad [50]$$


This acts as a kind of “connection” in loop space
since it tells us how the phase of $\Phi[\xi]$ changes from
one loop to a neighbouring loop. One can go a step
further and define its “curvature” in direct analogy
with $F_{\mu\nu}(x)$ by


$$G_{\mu\nu}[\xi|s] = \delta_\nu(s)F_\mu[\xi|s] - \delta_\mu(s)F_\nu[\xi|s] + ig\,[F_\mu[\xi|s], F_\nu[\xi|s]] \qquad [51]$$


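It can be shown that by using the $F_\mu[\xi|s]$ we can
rewrite the Yang-Mills action as eqn [52], where the
normalization factor $\bar{N}$ is an infinite constant:


$$\mathcal{A}_F^0 = -\frac{1}{4\pi\bar{N}} \int \delta\xi \int_0^{2\pi} ds\, \mathrm{tr}\{F_\mu[\xi|s]\,F^\mu[\xi|s]\}\,|\dot{\xi}(s)|^{-2} \qquad [52]$$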


However, the true variables of the theory are still
the $A_\mu$. They represent 4 functions of a real variable,
whereas the loop connections represent 4 functionals
of the real function $\xi(s)$. Just as in the case of the $F_{\mu\nu}$,
these $F_\mu[\xi|s]$ have to be constrained so as to recover
$A_\mu$, but this time much more severely.

It turns out that, in pure Yang-Mills theory, the
constraint that says there are no monopoles ([53])
also removes the redundancy of the loop variables,
exactly as in the abelian case:


$$G_{\mu\nu}[\xi|s] = 0 \qquad [53]$$


That this condition is necessary is easy to see by
simple algebra. The proof of the converse of this
“extended Poincaré lemma” is fairly lengthy. Granted
this, we can now apply the Wu-Yang criterion to the
action [52] and derive the Polyakov equation [54],
which is the loop version of the Yang-Mills equation:


$$\delta_\mu(s)\,F^\mu[\xi|s] = 0 \qquad [54]$$


In the presence of a monopole charge ($-$), the
constraint [53] will have a nonzero right-hand side,


$$G_{\mu\nu}[\xi|s] = -J_{\mu\nu}[\xi|s] \qquad [55]$$


The loop current $J_{\mu\nu}[\xi|s]$ can be written down
explicitly. However, its global form is much easier
to understand. Recall that $F_\mu[\xi|s]$ can be thought of
as a loop connection, for which we can form its
“holonomy.” This is defined for a closed (spatial)
surface $\Sigma$ (enclosing the monopole), parametrized by
a family of closed curves $\xi_t(s)$, $t = 0 \to 2\pi$. The
“holonomy” $\Theta_\Sigma$ is then the total change in phase
of $\Phi[\xi_t]$ as $t \to 2\pi$, and thus equals the charge ($-$).

To formulate an electric-magnetic duality that is
applicable to nonabelian theory, one defines yet
another set of loop variables. Instead of the Dirac
phase factor $\Phi[\xi]$ for a complete curve [48], we
consider the parallel phase transport for part of a
curve from $s_1$ to $s_2$:


$$\Phi_\xi(s_2, s_1) = P_s \exp\, ig \int_{s_1}^{s_2} ds\, A_\mu(\xi(s))\,\dot{\xi}^\mu(s) \qquad [56]$$

Then the new variables are defined by [57]:

$$E_\mu[\xi|s] = \Phi_\xi(s, 0)\,F_\mu[\xi|s]\,\Phi_\xi^{-1}(s, 0) \qquad [57]$$


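These are not gauge invariant like $F_\mu[\xi|s]$ and may
not be as useful in general, but seem more
convenient for dealing with duality.

Using these variables, we now define their dual
$\tilde{E}_\mu[\eta|t]$ according to


$$\omega^{-1}(\eta(t))\,\tilde{E}_\mu[\eta|t]\,\omega(\eta(t)) = -\frac{2}{\bar{N}}\,\epsilon_{\mu\nu\rho\sigma}\,\dot{\eta}^\nu(t) \int \delta\xi\, ds\, E^\rho[\xi|s]\,\dot{\xi}^\sigma(s)\,\dot{\xi}^{-2}(s)\,\delta(\xi(s) - \eta(t)) \qquad [58]$$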


In eqn [58], $\omega(x)$ is a (local) rotation matrix
transforming from the frame in which the orientation
in internal symmetry space of the fields $E_\mu[\xi|s]$ is
measured to the frame in which the dual fields $\tilde{E}_\mu[\eta|t]$
are measured. It can be shown that this dual transform
satisfies all five of the required conditions listed earlier.

Electric-magnetic duality in Yang-Mills theory is
now fully reestablished using this generalized duality.
We have the dual pairs of equations [59]-[60]
and [61]-[62]:


$$\delta_\nu E_\mu - \delta_\mu E_\nu = 0 \qquad [59]$$
$$\delta^\mu E_\mu = 0 \qquad [60]$$
$$\delta^\mu \tilde{E}_\mu = 0 \qquad [61]$$
$$\delta_\nu \tilde{E}_\mu - \delta_\mu \tilde{E}_\nu = 0 \qquad [62]$$


Equation [59] guarantees that the potential $A_\mu$
exists, and so is equivalent to [53], and hence is the
nonabelian analog of [26]; while equation [60] is
equivalent to the Polyakov version of the Yang-Mills
equation [54], and hence is the nonabelian analog of
[27]. Equation [61] is equivalent by duality to [59]
and is the dual Yang-Mills equation. Similarly,
equation [62] is equivalent to [60], and guarantees
the existence of the dual potential $\tilde{A}_\mu$.

The treatment of charges using the Wu-Yang
criterion also follows the abelian case, and will not
be further elaborated here. For this and further
details, the reader is referred to the original papers
(Chan and Tsou 1993, 1999).

Also, just as in the abelian case, the gauge
symmetry is doubled: from the group G we deduce
that the full gauge symmetry is in fact $G \times \tilde{G}$, but that
the physical degrees of freedom remain the same.

The above exposition establishes electric-magnetic
duality in Yang-Mills theory only for classical fields.
A hint that this duality persists at the quantum level
comes from the work of 't Hooft (1978) on confinement.
There he introduces two loop quantities A(C)
and B(C) that are operators in the Hilbert space of
quantum states satisfying the commutation relation
[63] for an SU(N) gauge theory, where n is the linking
number between the two (spatial) loops C and C′:


$$A(C)\,B(C') = B(C')\,A(C)\,\exp(2\pi i n/N) \qquad [63]$$


The order or Wilson operator is given explicitly by
$A(C) = \mathrm{tr}\,\Phi(C)$. These two operators play dual roles
in the sense of electric-magnetic duality:


• A(C) measures the magnetic flux through C and
creates electric flux along C.

• B(C) measures the electric flux through C and
creates magnetic flux along C.


By defining the disorder operator B(C) as the 
Wilson operator corresponding to the dual potential 
$\tilde{A}$ obtained above, one can prove the commutation
relation [63], thus showing that these classical fields, 
when promoted to operators, retain their duality 
relation. Furthermore, there is a remarkable relation 
between the two (abstractly identical) gauge groups, 
in that if one is confined then the dual must be 
broken (that is, in the Higgs phase). This result is 
known as 't Hooft's theorem. 
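The algebra of relation [63] can be illustrated with a finite-dimensional toy model. The sketch below is an assumption-laden illustration, not 't Hooft's construction: it uses the N x N “clock” and “shift” matrices of $\mathbb{Z}_N$ as stand-ins for A(C) and B(C′), with the power of the shift matrix playing the role of the linking number n. All function names are hypothetical.

```python
import cmath


def clock_and_shift(n):
    """Return the n x n clock and shift matrices of Z_n as nested lists."""
    omega = cmath.exp(2j * cmath.pi / n)
    clock = [[omega ** i if i == j else 0j for j in range(n)] for i in range(n)]
    shift = [[1 + 0j if i == (j + 1) % n else 0j for j in range(n)]
             for i in range(n)]
    return clock, shift


def matmul(a, b):
    """Product of two square complex matrices."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def matpow(a, power):
    """Non-negative integer power of a square matrix."""
    size = len(a)
    result = [[1 + 0j if i == j else 0j for j in range(size)]
              for i in range(size)]
    for _ in range(power):
        result = matmul(result, a)
    return result


def commutation_defect(n, linking):
    """Largest entry of A B - exp(2 pi i linking / n) B A.

    Toy stand-ins: A = clock matrix, B = shift matrix to the power `linking`.
    """
    clock, shift = clock_and_shift(n)
    a, b = clock, matpow(shift, linking)
    phase = cmath.exp(2j * cmath.pi * linking / n)
    ab, ba = matmul(a, b), matmul(b, a)
    return max(abs(ab[i][j] - phase * ba[i][j])
               for i in range(n) for j in range(n))
```

For instance, `commutation_defect(3, 2)` vanishes to rounding error, which shows that the phase $\exp(2\pi i n/N)$ in [63] is exactly the kind of central element that a pair of $\mathbb{Z}_N$-valued operators can produce.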

The doubling of gauge symmetry, together with 
't Hooft's theorem, has been applied to the confined 
colour group SU(3) of quantum chromodynamics 
(QCD), in the Dualized Standard Model, to solve the 
puzzle of the existence of exactly three generations of 
fermions, with good observational support, by 
identifying the (necessarily broken) dual SU(3) with 
the generation symmetry (Chan and Tsou, 2002). 


Other Treatments of Nonabelian Duality 


Since Yang-Mills theory is not symmetric under the
Hodge *-operation, there are several routes one can
take to generalize the concept of electric-magnetic
duality to the nonabelian case. What was presented
in the last section is a modification of the
*-operation so as to restore this symmetry for
Yang-Mills theory, keeping to the original gauge
structure as much as possible. However, Yang-Mills
theory as used today in particle and field theories is
usually embedded in theories with more structure.


In the simplest case we have the Standard Model of
Particle Physics, which describes all particle
interactions (except gravity) and which has the
gauge group usually written as $SU(3) \times SU(2) \times
U(1)$, corresponding to the SU(3) of the strong interaction
and the $SU(2) \times U(1)$ of the electroweak interaction.
[Strictly speaking, it is $(SU(3) \times SU(2) \times U(1))/\mathbb{Z}_6$, if
we have the standard particle spectrum.] However,
the former group is confined and the latter broken.
The breaking is usually effected by introducing
scalar fields called Higgs fields into the theory.

Besides the experimentally well-tested Standard 
Model, there are many theoretically popular models 
of gauge theory in which supersymmetry is postu- 
lated, thereby introducing extra symmetries into the 
theory. Many of these are remnants of string theory, 
and are usually envisaged as gauge theories in a 
spacetime dimension higher than 4. 

Because of the extra structures and increased 
symmetries in these theories, there is quite a 
proliferation of concepts of duality, which could all 
be thought of as generalizations of abelian electric- 
magnetic duality (Schwarz, 1997). They come under 
the names of Seiberg-Witten duality, S-duality, 
T-duality, mirror symmetry, and so on. All these 
other aspects of duality have their own entries in this 
Encyclopedia. 


See also: AdS/CFT Correspondence; Duality in 
Topological Quantum Field Theory; Four-Manifold 
Invariants and Physics; Large-N Dualities; Measure on 
Loop Spaces; Mirror Symmetry: a Geometric Survey; 
Nonperturbative and Topological Aspects of Gauge 
Theory; Seiberg-Witten Theory; Standard Model of
Particle Physics.


Introduction


The discovery of the electroweak theory crowned
long years of investigation on weak interactions.
The key earlier developments included Fermi’s
phenomenological four-fermion interactions for
$\beta$-decay, the discovery of parity violation and the
establishment of the $V - A$ structure of the weak currents, the
Feynman-Gell-Mann conserved vector current (CVC)
hypothesis, current algebra and its beautiful applications
in the 1960s, Cabibbo mixing and lepton-hadron
universality, and finally, the proposal of intermediate
vector bosons (IVBs) to mitigate the high-energy
behavior of the pointlike Fermi interaction theory.

It turned out that the scattering amplitudes in IVB
theory still generally violated unitarity, due to the
massive vector boson propagator,


$$\frac{g^{\mu\nu} - q^\mu q^\nu / M^2}{q^2 - M^2 + i\epsilon}$$


The electroweak theory, known as Glashow-Weinberg-Salam
(GWS) theory (Weinberg 1967,
Salam 1968, Taylor 1976), was born of
attempts to make the IVB hypothesis for the
weak interactions consistent with
unitarity.

The GWS theory contains, and is in a sense a
generalization of, quantum electrodynamics (QED),
which was earlier successfully established as the
quantum theory of electromagnetism in interaction
with matter. GWS theory describes the weak and
electromagnetic interactions in a single, unified
gauge theory with gauge group


$$SU_L(2) \times U(1) \qquad [1]$$


Part of this gauge symmetry is realized in the
so-called “spontaneously broken” mode; only a
$U_{EM}(1) \subset SU_L(2) \times U(1)$ subgroup, corresponding
to the usual local gauge symmetry of
electromagnetism, remains manifest at low energies, with a
massless gauge boson (photon). The other three
gauge bosons, $W^\pm$ and $Z$, are massive, with masses
$\simeq 80.4$ and 91.2 GeV, respectively.

The theory is renormalizable, as conjectured by
S Weinberg and by A Salam, and subsequently
proved by G 't Hooft (1971), and makes well-defined
predictions order by order in perturbation
theory.


Since the experimental observation of neutral
currents (a characteristic feature of the Weinberg-Salam
theory, which predicts an extra, neutral
massive vector boson, Z, as compared to the naive
IVB hypothesis) at the Gargamelle bubble chamber at
CERN (1973), the theory has passed a large number
of experimental tests. The first basic confirmation
also included the discovery of various new particles
required by the theory: the charm quark (SLAC,
BNL, 1974), the bottom quark (Fermilab, 1977),
and the tau ($\tau$) lepton (SLAC, 1975). The heaviest quark,
the top, with a mass about two hundred times
that of the proton, was found later (Fermilab, 1995).
The direct observation of the W and Z vector bosons
was first made by the UA1 and UA2 experiments at
CERN (1983).

The GWS theory is today one of the most precise
and successful theories in physics. Even more
important, perhaps, together with quantum chromodynamics
(QCD), which is an SU(3) (color) gauge
theory describing the strong interactions (which
bind quarks into protons and neutrons, and the
latter two into atomic nuclei), it describes correctly,
within the present experimental and theoretical
uncertainties, all the presently known fundamental
forces in Nature, except gravity. The
$SU(3)_{QCD} \times (SU_L(2) \times U(1))_{GWS}$ theory is known as the standard
model (SM).

Both the electroweak (GWS) theory and QCD
are gauge theories with a nonabelian (noncommutative)
gauge group. Theories of this type,
known as Yang-Mills theories, can be constructed
by generalizing the well-known gauge principle
of QED to more general group transformations.
It is a truly remarkable fact that all of the
fundamental forces known today (apart from
gravity) are described by Yang-Mills theories,
and in this sense a very nontrivial unification
can be said to underlie the basic laws of Nature
(G 't Hooft).

There are further deep and remarkable conditions
(anomaly cancellations), satisfied by the structure of
the theory and by the charges of experimentally
known spin-1/2 elementary particles (see Tables 1
and 2), which guarantee the consistency of the
theory as a quantum theory.
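The anomaly cancellation conditions can be verified by exact arithmetic on the charges of one fermion generation. The following is a minimal sketch under stated assumptions: the hypercharges are taken in the convention $Q = T_3 + Y/2$, with $Y = 1/3$ for the quark doublet, $4/3$ and $-2/3$ for the right-handed up and down quarks, $-1$ for the lepton doublet and $-2$ for the right-handed charged lepton; the field list and function name are illustrative and not part of the original text.

```python
from fractions import Fraction

# One generation: (name, chirality, colour states, weak-isospin states, Y).
# Hypercharges assume the convention Q = T_3 + Y/2.
GENERATION = [
    ("q_L", "L", 3, 2, Fraction(1, 3)),
    ("u_R", "R", 3, 1, Fraction(4, 3)),
    ("d_R", "R", 3, 1, Fraction(-2, 3)),
    ("l_L", "L", 1, 2, Fraction(-1)),
    ("e_R", "R", 1, 1, Fraction(-2)),
]


def anomaly_coefficients(fields):
    """Sums that must vanish for the gauge anomalies to cancel.

    Left-handed fields count with sign +1, right-handed fields with -1.
    """
    sign = {"L": 1, "R": -1}
    return {
        "U(1)^3": sum(sign[ch] * nc * nw * y ** 3
                      for _, ch, nc, nw, y in fields),
        "grav^2 U(1)": sum(sign[ch] * nc * nw * y
                           for _, ch, nc, nw, y in fields),
        "SU(2)^2 U(1)": sum(sign[ch] * nc * y
                            for _, ch, nc, nw, y in fields if nw == 2),
        "SU(3)^2 U(1)": sum(sign[ch] * nw * y
                            for _, ch, nc, nw, y in fields if nc == 3),
    }
```

With these inputs every coefficient is exactly zero, whereas dropping the leptons (`GENERATION[:3]`) leaves a nonzero $U(1)^3$ coefficient: the cancellation works only between quarks and leptons together.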

It should be mentioned, however, that the recent 
discovery of neutrino oscillations (SuperKamio- 
kande (1998), SNO, KamLAND, K2K experi- 
ments), which proved the neutrinos to possess 
nonvanishing masses, clearly indicates that the 
standard GWS theory must be extended, in an as
yet unknown way.


Table 1 Quarks and their charges

Quarks                                   SU_L(2)   U_Y(1)   U_EM(1)
(u_L, d'_L), (c_L, s'_L), (t_L, b'_L)    2         1/3      (2/3, -1/3)
u_R, c_R, t_R                            1         4/3      2/3
d_R, s_R, b_R                            1         -2/3     -1/3

The primes indicate that the mass eigenstates are different from
the states transforming as multiplets of SU_L(2) × U_Y(1). They
are linearly related by the CKM mixing matrix.


Table 2 Leptons and their charges

Leptons                                          SU_L(2)   U_Y(1)   U_EM(1)
(ν'_eL, e_L), (ν'_μL, μ_L), (ν'_τL, τ_L)         2         -1       (0, -1)
e_R, μ_R, τ_R                                    1         -2       -1

The primes indicate again that the mass eigenstates are in general
different from the states transforming as multiplets of SU_L(2) ×
U_Y(1), as required by the observed neutrino oscillations.


The following is a brief summary of the GWS
theory, its characteristic features, its implications for
the symmetries of Nature, the status of the precision
tests, and its possible extensions.


GWS Theory


All the presently known elementary particles (except
for the gauge bosons $W^\pm$, $Z$, $\gamma$, the gluons, the
graviton, and possibly right-handed neutrinos) are listed
in Tables 1-3 together with their charges with
respect to the $SU_L(2) \times U(1)$ gauge group.

A doublet of Higgs scalar particles is included 
even though the physical component (which should 
appear as an ordinary scalar particle) has not yet 
been experimentally observed. 

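Table 3 Higgs doublet scalars and their charges

Higgs doublet    SU_L(2)   U_Y(1)
(φ⁺, φ⁰)         2         1


The Lagrangian is given by


$$\mathcal{L} = \mathcal{L}_{gauge} + \mathcal{L}_{quarks} + \mathcal{L}_{leptons} + \mathcal{L}_{Higgs} + \mathcal{L}_{Yukawa} + \mathcal{L}_{g.f.} + \mathcal{L}_{ghosts}$$


The gauge kinetic terms are

$$\mathcal{L}_{gauge} = -\frac{1}{4} F^a_{\mu\nu} F^{a\,\mu\nu} - \frac{1}{4} G_{\mu\nu} G^{\mu\nu}$$


where


$$F^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu + g\,\epsilon^{abc} A^b_\mu A^c_\nu$$

$$G_{\mu\nu} = \partial_\mu B_\nu - \partial_\nu B_\mu$$

are the $SU_L(2) \times U(1)$ gauge field tensors; $\mathcal{L}_{g.f.}$ and $\mathcal{L}_{ghosts}$
are the so-called gauge-fixing term and Faddeev-Popov
ghost term, needed to define the gauge-boson
propagators appropriately and to eliminate certain
unphysical contributions. The gauge invariance of the
theory is ensured by a set of identities (A Slavnov,
J C Taylor). The quark kinetic terms have the form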


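$$\mathcal{L}_{quarks} = \sum_{quarks} \bar\psi\, i\gamma^\mu D_\mu \psi$$


where $D_\mu$ are appropriate covariant derivatives,

$$D_\mu q_L = \left(\partial_\mu - \frac{i}{2}\, g\, \tau^a A^a_\mu - \frac{i g'}{6} B_\mu\right) q_L$$


for the left-handed quark doublets, and


$$D_\mu u_R = \left(\partial_\mu - \frac{2 i g'}{3} B_\mu\right) u_R, \qquad D_\mu d_R = \left(\partial_\mu + \frac{i g'}{3} B_\mu\right) d_R$$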


and similarly for the other “up” quarks $c_R$ (charm) and
$t_R$ (top), and “down” quarks $s_R$ (strange) and $b_R$
(bottom). Analogously, the lepton kinetic terms are
given by


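$$\mathcal{L}_{leptons} = \sum_{i=1}^{3} \bar\psi^i_L\, i\gamma^\mu D_\mu \psi^i_L + \sum_{i=1}^{3} \bar\psi^i_R\, i\gamma^\mu (\partial_\mu + i g' B_\mu)\, \psi^i_R$$

where $i = 1, 2, 3$ indicate the $e$, $\mu$, $\tau$ lepton families;
finally, the parts involving the Higgs fields are

$$\mathcal{L}_{Higgs} = D_\mu\phi^\dagger D^\mu\phi + V(\phi, \phi^\dagger)$$

$$V(\phi, \phi^\dagger) = -\mu^2\, \phi^\dagger\phi - \lambda\, (\phi^\dagger\phi)^2$$

and


$$\mathcal{L}_{Yukawa} = \sum_{i,j=1}^{3} \left[ g^d_{ij}\, \bar q^i_L \begin{pmatrix} \phi^+ \\ \phi^0 \end{pmatrix} d^j_R + g^u_{ij}\, \bar q^i_L \begin{pmatrix} \phi^{0*} \\ -\phi^- \end{pmatrix} u^j_R \right] + \text{h.c.} + \sum_{i,j=1}^{3} g^e_{ij}\, \bar\psi^i_L \begin{pmatrix} \phi^+ \\ \phi^0 \end{pmatrix} \psi^j_R + \text{h.c.} \qquad [2]$$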


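For $\mu^2 < 0$, the Higgs potential has a minimum at


$$\langle \phi^\dagger \phi \rangle = \frac{v^2}{2} = -\frac{\mu^2}{2\lambda}$$

By choosing conveniently the direction of the Higgs
field, its vacuum expectation value (VEV) is expressed as


$$\langle \phi \rangle = \begin{pmatrix} 0 \\ v/\sqrt{2} \end{pmatrix} \qquad [3]$$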


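The physical properties of Higgs and gauge
bosons are best seen by choosing the so-called
unitary gauge,


$$\phi(x) = U(\xi) \begin{pmatrix} 0 \\ (v + \eta(x))/\sqrt{2} \end{pmatrix} = U(\xi)\,\phi'(x), \qquad U(\xi) = \exp\left( i\, \xi^a(x)\, \tau^a / v \right)$$

$$A_\mu = U(\xi) \left( A'_\mu + \frac{i}{g}\,\partial_\mu \right) U^{-1}(\xi), \qquad A_\mu \equiv \frac{\tau^a A^a_\mu}{2}$$


and expressing everything in terms of primed
variables. It is easy to see that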


1. There is one physical scalar (Higgs) particle
with mass


$$m_H^2 = -2\mu^2 \qquad [4]$$


2. The Higgs kinetic term $(D_\mu\phi')^\dagger (D^\mu\phi')$ produces the
gauge-boson masses


$$M_W^2 = \frac{g^2 v^2}{4}, \qquad M_Z^2 = \frac{v^2}{4}\,(g^2 + g'^2) \qquad [5]$$

3. The physical gauge bosons are the charged $W^\pm$, and
two neutral vector bosons described by the fields


$$Z_\mu = \cos\theta_W\, A^3_\mu - \sin\theta_W\, B_\mu$$
$$A_\mu = \sin\theta_W\, A^3_\mu + \cos\theta_W\, B_\mu$$


where the mixing angle


$$\theta_W = \tan^{-1}\frac{g'}{g}, \qquad \sin\theta_W = \frac{g'}{\sqrt{g^2 + g'^2}}$$

is known as the Weinberg angle. The massless $A_\mu$
field describes the photon.


Fermi Interactions and Neutral Currents 


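The fermions interact with gauge bosons through
the charged and neutral currents


$$\mathcal{L} = \frac{g}{\sqrt{2}} \left( J^+_\mu W^{+\mu} + J^-_\mu W^{-\mu} \right) + \mathcal{L}^{n.c.} \qquad [6]$$


$$\mathcal{L}^{n.c.} = g\, J^3_\mu A^{3\mu} + \frac{g'}{2}\, J^Y_\mu B^\mu = e\, J^{em}_\mu A^\mu + \frac{g}{\cos\theta_W}\, J^0_\mu Z^\mu \qquad [7]$$


where

$$J^+_\mu = \sum \bar\psi_L \gamma_\mu \tau^+ \psi_L = \frac{1}{2} \sum \bar\psi \gamma_\mu \tau^+ (1 - \gamma_5)\, \psi \qquad [8]$$

corresponds to the standard charged current, and

$$J^0_\mu = J^3_\mu - \sin^2\theta_W\, J^{em}_\mu \qquad [9]$$


is the neutral current to which the Z boson is
coupled ($J^3_\mu = (1/2) \sum \bar\psi_L \gamma_\mu \tau^3 \psi_L$ and $J^{em}_\mu$ is the
electromagnetic current). The model thus predicts
the existence of neutral current processes, mediated
by the Z boson, such as $\nu_\mu e \to \nu_\mu e$ or $\bar\nu_\mu e \to \bar\nu_\mu e$, with
cross sections of the same order as that for the
charged current process, $\nu_e e \to \nu_e e$, but with
characteristic L-R asymmetric couplings depending
on the Weinberg angle. By eqn [9], appropriate ratios
of cross sections, such as $\sigma(\nu_\mu e \to \nu_\mu e)/\sigma(\bar\nu_\mu e \to \bar\nu_\mu e)$,
can be used to measure $\sin^2\theta_W$.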

The exchange of heavy W bosons generates an
effective current-current interaction at low energies,


$$\mathcal{L}_{eff} = -\frac{g^2}{2 M_W^2}\, J^+_\mu J^{-\mu}$$


the well-known Fermi-Feynman-Gell-Mann Lagrangian
with $V - A$ structure, with


$$\frac{G_F}{\sqrt{2}} = \frac{g^2}{8 M_W^2}$$

This means that the Higgs VEV must be taken to be

$$v = 2^{-1/4}\, G_F^{-1/2} \simeq 246\ \text{GeV} \qquad [10]$$
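The tree-level relations above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the relations $v = (\sqrt{2}\,G_F)^{-1/2}$, $M_W = g v/2$ and the tree-level (on-shell) identification $\sin^2\theta_W = 1 - M_W^2/M_Z^2$, and uses the measured values of $G_F$, $M_W$ and $M_Z$ quoted elsewhere in this article; radiative corrections are ignored and the function names are hypothetical.

```python
import math

G_FERMI = 1.16637e-5  # Fermi constant in GeV^-2
M_W = 80.425          # W mass in GeV
M_Z = 91.1876         # Z mass in GeV


def higgs_vev(g_fermi):
    """Higgs VEV from the Fermi constant, v = (sqrt(2) G_F)^(-1/2)."""
    return (math.sqrt(2.0) * g_fermi) ** -0.5


def su2_coupling(m_w, vev):
    """SU(2) gauge coupling from the tree-level relation M_W = g v / 2."""
    return 2.0 * m_w / vev


def sin2_theta_w_tree(m_w, m_z):
    """Tree-level weak mixing angle, sin^2(theta_W) = 1 - M_W^2 / M_Z^2."""
    return 1.0 - (m_w / m_z) ** 2


v = higgs_vev(G_FERMI)             # about 246 GeV
g = su2_coupling(M_W, v)           # about 0.65
s2 = sin2_theta_w_tree(M_W, M_Z)   # about 0.22
```

The tree-level value of $\sin^2\theta_W$ obtained this way differs from the value quoted in a specific renormalization scheme, which is one reason a quantitative comparison requires fixing the scheme.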


It is remarkable that “all” known masses of the
elementary particles, except perhaps those of the
neutrinos, are generated in GWS theory
through the spontaneous breakdown of $SU_L(2) \times
U(1)$ symmetry, through the Higgs VEV (eqns [3]
and [10]). The boson masses are given by [4] and
[5]. Note that the relation


$$\rho \equiv \frac{M_W^2}{M_Z^2 \cos^2\theta_W} = 1$$


reflects an accidental SO(3) symmetry present in the model (note the
SO(4) symmetry of the Higgs potential in the limit
$\alpha \to 0$, before the spontaneous breaking),
called custodial symmetry. This is a characteristic,
model-dependent feature of the minimal model, not
necessarily required by the gauge symmetry. This
relation is well met experimentally, although a quantitative
discussion requires the choice of the renormalization
scheme (including the definition of $\sin\theta_W$ itself)
and a check of consistency with various other data.

The fermions get mass through the Yukawa
interactions (eqn [2]); the fermion masses are
arbitrary parameters of the model and cannot be
predicted within the GWS theory. An important
feature of this mechanism is that the coupling of the 
physical Higgs particle to each fermion is propor- 
tional to the mass of the latter. This should give a 
clear, unambiguous experimental signature for the 
Higgs scalar of the minimal GWS model. 

The recent discovery of nonvanishing neutrino
masses requires the theory to be extended. Actually,
there is a natural way to incorporate such masses in the
standard GWS model, by a minimal extension. As the
right-handed neutrinos, if they exist, are entirely
neutral with respect to the $SU_L(2) \times U(1)$ gauge
symmetry, they do not need its breaking to have
mass. In other words, $\nu_R$ may get Majorana masses,
$\sim M_R\, \nu_R \nu_R$, by some yet unknown mechanism, much
larger than those of other fermions (such a mechanism
is quite naturally present in some grand unified
models). If now the Yukawa couplings are introduced
as for the quarks and for the down leptons, then the
Dirac mass terms result upon condensation of the
Higgs field, and the neutrino mass matrix would take
the form, for one flavor (in the space of $(\nu_L, \nu_R)$):


$$\begin{pmatrix} 0 & m_D \\ m_D & M_R \end{pmatrix} \qquad [11]$$


Table 4 Quark masses

u (MeV)   c (GeV)     t (GeV)       d (MeV)   s (MeV)   b (GeV)
1.5-4     1.15-1.35   174.3 ± 5.1   4-8       80-130    4.1-4.4


Table 5 Lepton masses

ν_e (eV)                 ν_μ (MeV)                ν_τ (MeV)
< 3                      < 0.19                   < 18.2

e (MeV)                  μ (MeV)                  τ (MeV)
0.51099892 ± 4 × 10⁻⁸    105.658369 ± 9 × 10⁻⁶    1776.99 ± 0.26


Table 6 Gauge-boson masses

Photon   Gluons   W± (GeV)          Z (GeV)
0        0        80.425 ± 0.038    91.1876 ± 0.0021


If the Dirac masses are assumed to be of the same
order as those of the quarks and if the right-handed
Majorana masses $M_R$ are far larger, for example,
of the order of the grand unified scale, $O(10^{16}\ \text{GeV})$,
then diagonalization of the mass matrix would
give, for the physical masses of the left-handed
neutrinos, $\sim m_D^2/M_R \ll m_D$, much smaller than other
fermion masses, quite naturally (“see-saw” mechanism).
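The size of the see-saw suppression can be made concrete by diagonalizing the one-flavor matrix with entries $(0, m_D; m_D, M_R)$. The sketch below is a minimal illustration under assumed inputs ($m_D = 100$ GeV, $M_R = 10^{16}$ GeV, chosen only as an example); it uses the fact that the product of the two eigenvalues equals $-m_D^2$ to avoid the numerical cancellation that a direct subtraction would suffer. The function name is hypothetical.

```python
import math


def seesaw_masses(m_dirac, m_majorana):
    """Absolute values of the eigenvalues of [[0, m_D], [m_D, M_R]].

    Returns (light, heavy). The heavy eigenvalue is
    (M_R + sqrt(M_R^2 + 4 m_D^2)) / 2; the light one follows from the
    determinant, |light| * heavy = m_D^2, which is numerically stable.
    """
    root = math.hypot(m_majorana, 2.0 * m_dirac)
    heavy = 0.5 * (m_majorana + root)
    light = m_dirac ** 2 / heavy
    return light, heavy


# Example inputs (assumptions for illustration), in GeV.
light_gev, heavy_gev = seesaw_masses(100.0, 1.0e16)
light_ev = light_gev * 1.0e9  # convert GeV to eV
```

With these example inputs the light mass comes out at about $10^{-3}$ eV, many orders of magnitude below the Dirac mass, which is the content of the estimate $m_D^2/M_R \ll m_D$.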


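CKM Quark Mixing. As there is a priori no reason
why the weak-interaction eigenstates should be
equal to the mass eigenstates, the Yukawa couplings
in eqn [2] are in general nondiagonal matrices in
flavor. Suppose that the weak basis for the
quarks is given in terms of the mass eigenstates (in
which quark masses are made diagonal, denoted here by tildes), by unitary
transformations


$$u_{Li} = \sum_j V^{up}_{ij}\, \tilde{u}_{Lj}, \qquad d_{Li} = \sum_j V^{down}_{ij}\, \tilde{d}_{Lj}$$


then the interaction terms with $W^\pm$ bosons [6] can
be cast in the form (Kobayashi and Maskawa 1972)

$$\mathcal{L} = \frac{g}{\sqrt{2}} \left[ \bar{\tilde{u}}_{Li}\, \gamma^\mu W^+_\mu\, (U_{CKM})_{ij}\, \tilde{d}_{Lj} + \text{h.c.} \right] \qquad [12]$$

where $U_{CKM} = (V^{up})^\dagger V^{down}$ is called the Cabibbo-Kobayashi-Maskawa
(CKM) matrix. It can be
parametrized in terms of three Euler angles and
one phase


$$U = \begin{pmatrix} U_{ud} & U_{us} & U_{ub} \\ U_{cd} & U_{cs} & U_{cb} \\ U_{td} & U_{ts} & U_{tb} \end{pmatrix} = \begin{pmatrix} c_{12}c_{13} & s_{12}c_{13} & s_{13}e^{-i\delta_{13}} \\ -s_{12}c_{23} - c_{12}s_{23}s_{13}e^{i\delta_{13}} & c_{12}c_{23} - s_{12}s_{23}s_{13}e^{i\delta_{13}} & s_{23}c_{13} \\ s_{12}s_{23} - c_{12}c_{23}s_{13}e^{i\delta_{13}} & -c_{12}s_{23} - s_{12}c_{23}s_{13}e^{i\delta_{13}} & c_{23}c_{13} \end{pmatrix} \qquad [13]$$


where $c_{12} = \cos\theta_{12}$, $s_{23} = \sin\theta_{23}$, etc. The requirement
that charged-current weak processes are all
described by these matrix elements, satisfying the
unitarity relation,


$$\sum_j U_{ij}\, U^\dagger_{jk} = \delta_{ik} \qquad [14]$$


gives a very stringent test for the validity of the model.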
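The unitarity of the standard three-angle, one-phase parametrization can be verified numerically. The following sketch builds the matrix in the form commonly used for the CKM matrix and checks both the unitarity relation and the vanishing of the “d-b” column product that defines a unitarity triangle. The numerical inputs ($s_{12} = 0.2243$, $s_{23} = 0.0413$, $s_{13} = 0.0037$, $\delta_{13} = 60°$) are illustrative values of the size obtained in global fits, and all function names are hypothetical.

```python
import cmath
import math


def ckm_matrix(s12, s23, s13, delta):
    """Standard parametrization of a 3x3 unitary mixing matrix."""
    c12, c23, c13 = (math.sqrt(1.0 - s * s) for s in (s12, s23, s13))
    e = cmath.exp(1j * delta)
    return [
        [c12 * c13, s12 * c13, s13 / e],
        [-s12 * c23 - c12 * s23 * s13 * e,
         c12 * c23 - s12 * s23 * s13 * e,
         s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * e,
         -c12 * s23 - s12 * c23 * s13 * e,
         c23 * c13],
    ]


def unitarity_defect(u):
    """Largest deviation of U U^dagger from the unit matrix."""
    worst = 0.0
    for i in range(3):
        for k in range(3):
            entry = sum(u[i][j] * complex(u[k][j]).conjugate() for j in range(3))
            worst = max(worst, abs(entry - (1.0 if i == k else 0.0)))
    return worst


def triangle_sum(u):
    """U_ud U_ub* + U_cd U_cb* + U_td U_tb*, which vanishes by unitarity."""
    return sum(u[i][0] * complex(u[i][2]).conjugate() for i in range(3))


U = ckm_matrix(0.2243, 0.0413, 0.0037, math.radians(60.0))
```

Dividing `triangle_sum(U)` by its middle term $U_{cd}U^*_{cb}$ gives three complex numbers summing to zero, one of which is 1; these are the sides of the unitarity triangle.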


CP Violation 


CP (product of charge conjugation and parity 
transformation) invariance is an approximate sym- 
metry of Nature. Although it is known to be broken 
by very tiny amounts only, the exact extent and the 
nature of CP violation can have far-reaching 
consequences. 


CP violation was first discovered by Cronin
and Fitch (BNL, 1964) in the K-meson system; more
precise information on the nature of CP violation
from neutral kaon decays has been obtained
more recently (2000) in the NA48 (CERN) and KTeV
(Fermilab) experiments. CP violation has been
established in the B-meson systems as well, very
recently (2002), by the BaBar experiment at SLAC and
the Belle experiment at KEK.

Through the so-called CPT theorem, CP invariance 
(or violation) is closely related to the T (time-reversal 
invariance) symmetry. Also, CP noninvariance is one 
of the conditions needed in the cosmological baryon 
number generation (baryogenesis). 

In the GWS theory, with three families of quark
flavors (six quarks), there is just one source of CP
violation: the phase $\delta_{13}$ appearing in the CKM
matrix (eqn [13]). For $\delta_{13} \neq 0, \pi$, W-exchange interactions
[12] induce CP violation. The earlier and
more recent experimental data on $K^0$-$\bar{K}^0$ mixing
and $K_{L,S}$ decays appear to be compatible with
the CKM mechanism for CP violation, but a
quantitative comparison with the SM remains
somewhat hindered by the difficulty of estimating
certain strong interaction effects. The recent confirmation
of CP violation in B systems is made in
the context of a global fit with the SM predictions
such as the “unitarity triangle” relations, for
example,

$$\frac{U_{ud} U^*_{ub}}{U_{cd} U^*_{cb}} + 1 + \frac{U_{td} U^*_{tb}}{U_{cd} U^*_{cb}} = 0 \qquad [15]$$

(eqn [14]), and by combining data from kaon decays,
charmed meson decays, B meson decays and mixings,
etc., and is a part of direct tests of the GWS
model, with nonvanishing CP violation CKM
phase (eqn [16] and Figure 1). Recent evidence for
nonzero neutrino masses and mixings opens the
way to possible CP violation in the leptonic
processes as well.


Figure 1 Unitarity triangle test (eqn [15]). The small ellipses
represent 68% and 95% probability zones for the apex
corresponding to $U_{ud}U^*_{ub}/U_{cd}U^*_{cb}$. Reproduced from M. Bona et
al. (2005) The 2004 UTfit collaboration report on the status of the
unitarity triangle in the standard model. Journal of High Energy
Physics 0507: 028-059 (hep-ph/0501199), with permission from
IOP Publishing Ltd and the UTfit collaboration.


Finally, within the SM including strong interactions,
there is one more source of CP violation: the
so-called $\theta$ (vacuum) parameter of QCD.


B and L Nonconservation 


Another set of approximate symmetries in Nature are
baryon and lepton number conservation. In the
electroweak theory, these global symmetries are exact
to all orders of perturbation theory. Nonperturbative
effects (a sort of barrier penetration in gauge field
space), however, violate both B and L; the combination
$B - L$ is nevertheless conserved even nonperturbatively.
The nonperturbative electroweak baryon number
violation is an extremely tiny effect, the amplitude
being proportional to the typical tunneling factor
$e^{-2\pi/\alpha}$, but the process is unsuppressed at finite
temperatures, as might have been experienced by the
universe at some early stage after the big bang.

B or L nonconservation can also arise naturally at 
high energy scales, if the electroweak theory is 
embedded as the low-energy approximation in a 
grand unified model. The experimental lower limit 
of proton lifetime, tp > 10?? years, from Kamio- 
kande experiments, however severely restricts accep- 
table models of this type (the simplest SU(5) model 
is already ruled out). 

On the other hand, cosmological baryogenesis
requires a sufficient amount of baryon number violation,
at least at some stage of cosmological expansion.
Detailed analyses suggest that the standard
electroweak transition might not in itself explain the
baryon number $n_B/n_\gamma \sim 10^{-10}$ observed in the
present universe. Recent observations of neutrino
oscillations suggest the right-handed Majorana-type
neutrino masses to be present, which violate the
lepton number L. In such a case it might be possible
that the correct amount of baryon number excess
would be generated, through leptogenesis.


Global Fit 


Various relations exist at the tree level among the 
masses, scattering cross sections, decay rates, 
various asymmetries, etc., which can be read off 
or calculated from the formulas given earlier. 
These quantities receive corrections at higher
orders, and the experimental checks of these
modified relations provide precision tests of the
model on the one hand, and possibly a hint of new
physics on the other, if there is any discrepancy with the
prediction. Very often the amplitudes of interest
receive important contributions due to strong
interactions, which are difficult to estimate.

The basic parameters of the model, apart from the
Higgs mass, and fermion masses and mixing
parameters, can be taken to be (1) the fine structure
constant, $\alpha = 1/137.03599911(46)$; (2) the Fermi
constant, $G_F = 1.16637 \times 10^{-5}\ \text{GeV}^{-2}$ (which can be
determined from the muon lifetime); and (3) the Z-boson
mass, $M_Z = 91.1876 \pm 0.0021$ GeV (observed directly
at LEP). $M_W$ and $\sin^2\theta_W$ are then calculable
numbers, in terms of these quantities, and depending
on $m_t$ (measured independently by the CDF and DØ
experiments at Fermilab) and on the unknown $M_H$.

Such precision tests of the GWS model are being
made, combining the analyses of various decay rates
and asymmetries in B-meson systems at B factories
and in colliders, production and decays of Z and W
bosons, elastic $\nu e$ or $\bar\nu e$ scatterings, elastic $\nu p$ or $\bar\nu p$
scatterings, deep inelastic lepton-nucleon (or deuteron)
scatterings, the muon anomalous magnetic
moment, atomic parity violation experiments, etc.

An overall fit to the data gives an excellent
agreement, with the input parameters


$$M_H = 113^{+56}_{-40}\ \text{GeV}, \qquad m_t = 176.9 \pm 4.0\ \text{GeV}, \qquad \alpha_s(M_Z) = 0.1213 \pm 0.0018$$


For instance (in GeV),


M_W = 80.390 ± 0.018    vs.   80.412 ± 0.042   (exp. value (LEP))
Γ_Z = 2.4972 ± 0.0012   vs.   2.4952 ± 0.0023  (exp. value)
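One simple way to quantify the agreement between a fitted and a measured value is the “pull”, the difference divided by the combined uncertainty. The sketch below is only a rough illustration: it assumes that the two uncertainties are independent and Gaussian, which is a simplification since the fit itself uses the measurements; the function name is hypothetical.

```python
import math


def pull(fit, fit_err, measured, measured_err):
    """(measured - fit) in units of the uncertainties added in quadrature."""
    return (measured - fit) / math.hypot(fit_err, measured_err)


# Values in GeV, as quoted above for M_W and Gamma_Z.
pull_mw = pull(80.390, 0.018, 80.412, 0.042)
pull_gamma_z = pull(2.4972, 0.0012, 2.4952, 0.0023)
```

Both pulls are below one standard deviation in magnitude, consistent with the statement that the overall fit agrees well with the data.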


For $\sin^2\theta_W$ (defined in the so-called $\overline{MS}$ scheme) all
data give consistently the value


$$\sin^2\theta_W = 0.23120 \pm 0.00015$$


(a slightly larger value is reported by a $\nu N$
experiment at Fermilab).

The unitarity-triangle tests of the SM and the determination
of the CKM matrix have already been mentioned.
The results of the global fit can be summarized
in Figure 1, and by the angles


$$s_{12} = 0.2243 \pm 0.0016, \qquad s_{23} = 0.0413 \pm 0.0015, \qquad s_{13} = 0.0037 \pm 0.0005, \qquad \delta_{13} = 60^\circ \pm 14^\circ \qquad [16]$$


For the muon anomalous gyromagnetic ratio ($g - 2$),
the experimental data


$$a_\mu^{exp} \equiv \frac{g_\mu - 2}{2} = (1165920.37 \pm 0.78) \times 10^{-9}$$


is to be compared with the theoretical prediction

$$a_\mu^{th} = (1165918.83 \pm 0.49) \times 10^{-9}$$


which is slightly smaller ($1.9\sigma$), where the largest
theoretical uncertainty comes from the two-loop
hadronic contribution $a_\mu^{had} \simeq (69.63 \pm 0.72) \times 10^{-9}$
(the QED corrections to O(o?) are included). 

For further details of the analyses and the present 
status of experimental tests of the electroweak theory, 
see the reviews by J Erler and P Langacker, and by F 
J Gilman et al., cited in “Further reading” (most of the
numbers cited here come from these two reviews).


Need for Extension of the Model 


In spite of such an impressive experimental con- 
firmation, there are reasons to believe that the 
electroweak theory, in its standard minimal form, 
is not a complete story. As already mentioned, 
neutrino oscillations, predicted earlier by Ponte- 
corvo, have recently been experimentally confirmed, 
giving uncontroversial evidence for nonvanishing 
neutrino masses and their mixing. This is a clear 
signal that the theory must be extended. If the mass
is instead taken in the form of eqn [11] but with
three neutrino families, the diagonalization in
general yields a mixing for the light neutrinos, as
for the quarks. Some of the experimental data on the
neutrinos are summarized in Table 7.

In addition, the Higgs sector of the theory (the
part of the interactions responsible for the spontaneous
breaking $SU_L(2) \times U(1) \to U_{EM}(1)$) is still largely
untested. The theory predicts a physical scalar
particle, the Higgs particle, of unknown mass. The
present-day expectation for its mass, which combines
the experimental lower limit and an indirect
upper limit following from the analysis of various
radiative corrections, is

$$114\ \mathrm{GeV} < m_H < 250\ \mathrm{GeV}$$
This particle should be observable either at the
Tevatron at Fermilab or in the coming LHC
experiments at CERN; negative results would force
upon us a substantial modification of the electroweak
theory.

Table 7 Neutrino mass square differences and mixing

$\Delta_{\rm sol} m^2 = (6 - 9) \times 10^{-5}\ \mathrm{eV}^2$
$\Delta_{\rm atm} m^2 = (1 - 3) \times 10^{-3}\ \mathrm{eV}^2$

Solar neutrino and reactor (SNO, SuperKamiokande,
KamLAND) experiments give the first result. Atmospheric neutrino
data and the long baseline experiment (SuperKamiokande, K2K)
provide the second. The mixing angle relevant to the solar and
reactor neutrino oscillation is large, $\tan^2\theta_{12} \simeq 0.40^{+0.10}_{-0.07}$, while the
one related to the atmospheric neutrino data is maximal,
$\sin^2 2\theta_{23} \simeq 1$. Cosmological considerations give $\sum_i m_{\nu_i} < O(1\ \mathrm{eV})$.

Last, but not least, there are a few theoretical 
motivations for an extension of the model to be 
considered necessary. First, the structure of the GWS 
theory is not entirely determined by the gauge 
principle. The form of the Higgs self-interactions, 
as well as their number and the Yukawa couplings 
of the Higgs scalar to the fermions, are uncon- 
strained by any principle, and the particular, 
minimal form assumed by Weinberg and Salam is 
yet to be confirmed experimentally. 

In addition, the theory is not really a unified
gauge theory: the $SU_L(2)$ and $U(1)$ gauge couplings are
distinct. One possibility is that the $SU(3)_{QCD} \times SU_L(2) \times U(1)$
theory of the SM is actually a low-energy
manifestation of a truly unified gauge theory,
a grand unified theory (GUT), defined at some
higher mass scale. The simplest versions of GUT
models, based on the SU(5) or SO(10) gauge groups, have
however a difficulty with the proton decay rates,
and with the coupling-constant unification itself.
Supersymmetric GUTs appear to be more acceptable
both from the coupling-constant unification and
from the proton lifetime constraints.

A more subtle, but perhaps more severe, theoretical
problem is the so-called naturalness problem.
At the quantum level, due to the quadratic divergences
in the scalar mass, the structure of the theory
turns out to be quite peculiar. If the ultraviolet
cutoff of the theory is taken to be the Planck mass
scale, $\Lambda_{UV} \sim m_{\rm Pl} \sim 10^{19}$ GeV, at which gravity
becomes strongly coupled, the theory at $\Lambda_{UV}$ would
have to possess parameters which are fine-tuned
with an excessive precision. The problem is also known
as a "hierarchy" problem.

A way to avoid having such a difficulty is to 
introduce supersymmetry. In a supersymmetric 
version of the standard theory (in fact, there are
phenomenologically well-acceptable models such
as the minimal supersymmetric standard model,
MSSM) this problem is absent due to the
cancellation of bosonic and fermionic loop contributions
typical of supersymmetric theories. As a
result, the properties of the theory at low energies 
are much less sensitive to those of the theory at the 
Planck mass scale. Experiments at LHC (expected 
to be performed after 2008, CERN) should be able 
to produce a whole set of new particles associated 
with supersymmetry, if this is a part of the physical 
law beyond TeV energies. 

At a deeper level, however, the hierarchy problem 
in a more general sense persists, even in super- 
symmetric models: why the masses of the order of 
O(100 GeV) should appear at all in a theory with a
natural cutoff of the order of the Planck mass? 
Furthermore, if the masses of the neutrinos turn out 
to be of the order of O(10?—10?)eV, we are left 
with the problem of understanding the large
disparities among the quark and lepton masses,
spanning a range of more than 13 orders of
magnitude: another "hierarchy" problem.

It is also possible that the spacetime the physical
world lives in is actually higher dimensional: the usual
four-dimensional Minkowski spacetime times either
compactified or uncompactified "extra dimensions."
In theories of this type, some of the difficulties
mentioned above might find a natural solution. It is
yet to be seen whether a consistent theory of this type
can be constructed that correctly accounts for the
properties of the universe we inhabit.
Introduction
Motivation: A Model Problem 


Many physical problems can be modeled by partial
differential equations. Let us consider, for example,
the case of an elastic membrane $\Omega$, with fixed
boundary $\Gamma$, subject to pressure forces $f$. The vertical
membrane displacement is represented by a real-valued
function $u$, which solves the equation

$$-\Delta u(x) = f(x), \quad x = (x_1, x_2) \in \Omega \qquad [1]$$

where the Laplace operator $\Delta$ is defined, in two
dimensions, by

$$\Delta u = \frac{\partial^2 u}{\partial x_1^2} + \frac{\partial^2 u}{\partial x_2^2}$$

As the membrane is glued to the curve $\Gamma$, $u$ satisfies
the condition

$$u(x) = 0, \quad x \in \Gamma \qquad [2]$$


The system [1]-[2] is the homogeneous Dirichlet
problem for the Laplace operator. It enters the more
general framework of (linear) elliptic boundary
value problems, which consist of a (linear) partial
differential equation (in the example above, of order
two: the highest order in the derivatives) inside an
open set $\Omega$ of the whole space $\mathbb{R}^N$, satisfying some
"elliptic" property, completed by (linear) conditions
on the boundary $\Gamma$ of $\Omega$, called "boundary conditions."
In the sequel, we only consider the linear
case.

Our aim is to answer the following questions: does
this problem admit a solution? In which space? Is this
solution unique? Does it depend continuously on the
given data $f$? In case of positive answers, we say that
the problem is "well posed" in the Hadamard sense.
But other questions can also be raised, such as the sign
of the solution, for example, or its regularity. We give a
full survey of linear elliptic problems in a bounded or in
an exterior domain with a sufficiently smooth boundary
and in the whole space. In the general theory of
elliptic problems, we consider only smooth coefficients.
We survey the standard theory, which can be found in
several well-known monographs of the 1960s. The
new trend in the investigation of elliptic problems
is to consider more general domains with nonsmooth
boundaries and nonsmooth coefficients. On the other
hand, the regularity results for elliptic systems have not
been improved during the last 30 years. New trends also
require the use of more general function spaces
and a more general functional background.

The number of references (see “Further reading" 
section) is strictly limited here; we list only some of 
the most important publications. The basic facts can 
usually be found in more places and sometimes we 
do not mention the particular reference. Among the 


very basic references are Friedman (1969), Gilbarg 
and Trudinger (1977), Dautray and Lions (1988), 
Hörmander (1964), Ladyzhenskaya and Uraltseva
(1968), Lions and Magenes (1968), Renardy and 
Rogers (1992), and Weinberger (1965); of course, 
there are many others. 


The Method 


To answer the above questions, we generally use, for
such elliptic problems, an approach based on what is
called a "variational formulation" (see the section
"Variational approach"): the boundary-value problem
is first transformed into a variational problem of lower
order, which is solved in a Hilbertian frame with the help of
the Lax-Milgram theorem (based on the Riesz representation
theorem). All questions are then solved (e.g., existence,
uniqueness, continuity in terms of the data, regularity).
But this variational formalism does not necessarily allow
one to treat all situations and it is limited to the
Hilbertian case. Other strategies can then be developed,
based on a priori estimates and duality arguments for
the existence problem, or on the maximum principle for the
question of uniqueness, without forgetting the particular
cases where an explicit Green kernel is computable
(e.g., the Laplace operator in the whole-space case).

Moreover, the study of linear elliptic equations is
directly linked to the background of function spaces. This is
the reason why we first deal with Sobolev spaces, both
of integer and of fractional order, and survey their
basic properties, imbedding and trace theorems. We pay
attention to the Riesz and Bessel potentials and we
define weighted Sobolev spaces important in the context
of unbounded opens. Second, we present the variational
approach and the Lax-Milgram theorem as a key point 
to solve a large class of boundary-value problems. We 
give examples: the Dirichlet and Neumann problems for 
the Poisson equation, the Newton problem for more 
general second-order operators; we also investigate 
mixed boundary conditions and present an example of 
a problem of fourth order. Then, we briefly present the 
arguments for studying general elliptic problems and 
concentrate on second-order elliptic problems; we recall 
the weak and strong maximum principle, formulate the 
Fredholm alternative and tackle the regularity questions. 
Moreover, we are interested in the existence and 
uniqueness of solution of the Laplace equation in the 
whole space and in exterior opens. Finally, we present 
some particular examples arising from physical pro- 
blems, either in fluid mechanics (the Stokes system) or in 
elasticity. 


Sobolev and Other Types of Spaces 


Throughout, $\Omega \subset \mathbb{R}^N$ will generally be an open
subset of the N-dimensional Euclidean space $\mathbb{R}^N$.
A domain will be an open and connected subset of
$\mathbb{R}^N$. We shall use standard notations for the spaces
$L^p(\Omega)$, $C^k(\Omega)$, etc. and their norms. Let us agree
that $C^{k,r}(\overline{\Omega})$, $k \in \mathbb{N}$, $r \in (0,1)$, denotes the space of
functions $f$ in $C^k(\overline{\Omega})$ whose derivatives $D^\alpha f$,
$\alpha = (\alpha_1, \ldots, \alpha_N) \in \mathbb{N}^N$, of order $|\alpha| = \sum_{i=1}^N \alpha_i = k$ are
all r-Hölder continuous. In the notations for some of
these spaces, by $\overline{\Omega}$ we mean that the functions have
the corresponding property on $\Omega$ and that they can be
continuously extended to $\overline{\Omega}$.

Let us recall several fundamental concepts. The
space $\mathcal{D}(\Omega)$ of the test functions in $\Omega$ consists of
all infinitely differentiable functions with a compact
support in $\Omega$. A locally convex topology can be
introduced here. The elements of the dual space
$\mathcal{D}'(\Omega)$ are called the distributions. If $f \in L^1_{\rm loc}(\Omega)$
(i.e., $f \in L^1(K)$ for all compact subsets $K$ of $\Omega$),
then $f$ is a regular distribution; the duality is
represented by $\int_\Omega f(x)\varphi(x)\,dx$. If $f \in \mathcal{D}'(\Omega)$, we
define the distributional or the weak derivative
$D^\alpha f$ of $f$ as the distribution $\varphi \mapsto (-1)^{|\alpha|} \langle f, D^\alpha \varphi \rangle$.
Plainly, if $f \in L^1_{\rm loc}(\Omega)$ has "classical" partial derivatives
in $L^1_{\rm loc}(\Omega)$, then they coincide with the corresponding
weak derivatives.

If $\Omega = \mathbb{R}^N$, it is sometimes more suitable to work
with the tempered distributions. The role of $\mathcal{D}(\Omega)$ is
played by the space $\mathcal{S}(\mathbb{R}^N)$ of $C^\infty$-functions
with finite pseudonorms $\sup_x |D^\alpha f(x)| (1 + |x|)^k$, $|\alpha|, k = 0, 1, 2, \ldots$.
Recall that the Fourier transform $\mathcal{F}$
maps $\mathcal{S}(\mathbb{R}^N)$ into itself and the same is true for the
space of the tempered distributions $\mathcal{S}'(\mathbb{R}^N)$.


Sobolev Spaces of Positive Order 


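The Sobolev space $W^{k,p}(\Omega)$, $1 \le p \le \infty$, $k \in \mathbb{N}$, is
the space of all $f \in L^p(\Omega)$ whose weak derivatives up
to order $k$ are regular distributions belonging to
$L^p(\Omega)$; in $W^{k,p}(\Omega)$ we introduce the norm

$$\|f\|_{W^{k,p}(\Omega)} = \Big( \sum_{|\alpha| \le k} \int_\Omega |D^\alpha f(x)|^p \, dx \Big)^{1/p} \qquad [3]$$

when $p < \infty$ and $\max_{|\alpha| \le k} \operatorname{ess\,sup}_{x \in \Omega} |D^\alpha f(x)|$ if
$p = \infty$. The space $W^{k,p}(\Omega)$ is a Banach space,
separable for $p < \infty$ and reflexive for $1 < p < \infty$;
it is a Hilbert space for $p = 2$, more simply denoted
$H^k(\Omega)$. In the following, we shall consider only the
range $p \in (1, \infty)$.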

The link with the classical derivatives is given by
this well-known fact: a function $f$ belongs to
$W^{1,p}(\Omega)$ if and only if it is a.e. equal to a function
$\tilde{f}$, absolutely continuous on almost all line segments
in $\Omega$ parallel to the coordinate axes, whose
(classical) derivatives belong to $L^p(\Omega)$ (the Beppo Levi
theorem).
For $1 \le p < \infty$ and noninteger $s > 0$ the Sobolev
space $W^{s,p}(\Omega)$ of order $s$ is defined as the space of all
$f$ with the finite norm

$$\|f\|_{W^{s,p}(\Omega)} = \Big( \|f\|^p_{W^{[s],p}(\Omega)} + \sum_{|\alpha| = [s]} \int_\Omega \int_\Omega \frac{|D^\alpha f(x) - D^\alpha f(y)|^p}{|x - y|^{N + p(s - [s])}} \, dx \, dy \Big)^{1/p}$$

where $[s]$ is the integer part of $s$ (for details, see, e.g.,
Adams and Fournier (2003) and Ziemer (1989)).


Imbedding Theorems 


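One of the most useful and important features of the
functions in Sobolev spaces is an improvement of
their integrability properties and the compactness of
various imbeddings. Theorems of this type were first
proved by Sobolev and Kondrashev. Let us agree that
the symbols $\hookrightarrow$ and $\hookrightarrow\hookrightarrow$ stand for an imbedding and
for a compact imbedding, respectively.


Theorem 1 Let $\Omega$ be a Lipschitz open. Then

(i) If $sp < N$, then $W^{s,p}(\Omega) \hookrightarrow L^{p^*}(\Omega)$ with
$p^* = Np/(N - ps)$ (the Sobolev exponent). If
$|\Omega| < \infty$, then the target space is any $L^r(\Omega)$
with $0 < r \le p^*$.

(ii) If $\Omega$ is bounded, then $W^{s,p}(\Omega) \hookrightarrow\hookrightarrow L^q(\Omega)$
for all $1 \le q < p^*$.

(iii) If $sp > N$, then $W^{j+s,p}(\Omega) \hookrightarrow C^j(\overline{\Omega})$ for $j = 0, 1, \ldots$.
If $\Omega$ has the Lipschitz boundary, then
$W^{j+s,p}(\Omega) \hookrightarrow C^{j,\mu}(\overline{\Omega})$ for $j = 0, 1, \ldots$ and $\mu = s - N/p$.

(iv) If $sp > N$, then $W^{j+s,p}(\Omega) \hookrightarrow\hookrightarrow C^j(\overline{\Omega})$, $j = 0, 1, \ldots$
and $W^{j+s,p}(\Omega) \hookrightarrow\hookrightarrow W^{j,q}(\Omega)$ for all $1 \le q \le \infty$. If,
moreover, $\Omega$ has the Lipschitz boundary, then the
target space can be replaced by $C^{j,\mu}(\overline{\Omega})$ provided
$sp > N > (s - 1)p$ and $0 < \mu < s - N/p$.


Note that if the imbedding $W^{s,p}(\Omega) \hookrightarrow L^q(\Omega)$ is
compact for some $q > p$, then $|\Omega| < \infty$. Moreover,
if $\limsup_{r \to \infty} |\{x \in \Omega;\ r \le |x| < r + 1\}| > 0$, then
$W^{s,p}(\Omega) \hookrightarrow L^q(\Omega)$ cannot be compact.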

Traces and Sobolev Spaces of Negative Order 


Let $s > 0$ and let $\Omega$ be, for simplicity, a bounded open
subset of $\mathbb{R}^N$ with boundary $\Gamma$ of class $C^{[s],1}$. Then
with the help of local coordinates, we can define
Sobolev spaces $W^{s,p}(\Gamma)$ (also denoted $H^s(\Gamma)$ for $p = 2$)
on $\Gamma = \partial\Omega$ (see, e.g., Nečas (1967) and Adams and
Fournier (2003) for details). If $f \in C(\overline{\Omega})$, then the restriction $f|_\Gamma$
makes sense. Introducing the space $\mathcal{D}(\overline{\Omega})$ of restrictions to $\Omega$
of functions in $\mathcal{D}(\mathbb{R}^N)$, one can show that if $f \in \mathcal{D}(\overline{\Omega})$,
we have $\|f|_\Gamma\|_{W^{1-1/p,p}(\Gamma)} \le C \|f\|_{W^{1,p}(\Omega)}$ so that, in view
of the density of $\mathcal{D}(\overline{\Omega})$ in $W^{1,p}(\Omega)$, the restriction
of $f$ to $\Gamma$ can be uniquely extended to the whole
$W^{1,p}(\Omega)$. The result is the bounded trace operator
$\gamma_0: W^{1,p}(\Omega) \to W^{1-1/p,p}(\Gamma)$. Moreover, every
$g \in W^{1-1/p,p}(\Gamma)$ can be extended to a (nonunique) function
$f \in W^{1,p}(\Omega)$ and this extension operator is bounded
with respect to the corresponding norms.

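More generally, let us suppose $\Gamma$ is of class $C^{k-1,1}$
and define the operator $\mathrm{Tr}_k$ for any $f \in \mathcal{D}(\overline{\Omega})$ by

$$\mathrm{Tr}_k f = (\gamma_0 f, \gamma_1 f, \ldots, \gamma_{k-1} f)$$

where

$$\gamma_j f(x) = \frac{\partial^j f}{\partial n^j}(x) = \sum_{|\alpha| = j} \frac{j!}{\alpha!} \frac{\partial^\alpha f(x)}{\partial x^\alpha} n^\alpha, \quad x \in \Gamma$$

is the jth-order derivative of $f$ with respect to the
outer normal $n$ at $x \in \Gamma$; by density, this operator
can be uniquely extended to a continuous linear
mapping defined on the space $W^{k,p}(\Omega)$; moreover,
$\mathrm{Tr}_k(W^{k,p}(\Omega)) = \prod_{j=0}^{k-1} W^{k-j-1/p,p}(\Gamma)$.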

The kernel of this mapping is the space $W_0^{k,p}(\Omega)$
(denoted by $H_0^k(\Omega)$ for $p = 2$), where $W_0^{s,p}(\Omega)$ is
defined as the closure of $\mathcal{D}(\Omega)$ in $W^{s,p}(\Omega)$ ($s > 0$). For
$1 < p < \infty$, the following holds: $W_0^{s,p}(\mathbb{R}^N) = W^{s,p}(\mathbb{R}^N)$,
and $W_0^{s,p}(\Omega) = W^{s,p}(\Omega)$ provided $0 < s \le 1/p$.
If $s < 0$, then the space $W^{s,p}(\Omega)$ is defined as the dual
to $W_0^{-s,p'}(\Omega)$, where $p' = p/(p - 1)$ (see, e.g., Triebel
(1978, 2001)). Observe that, for an arbitrary $\Omega$, a
function $f \in W^{1,p}(\Omega)$ has the zero trace if and only if
$f(x)/\mathrm{dist}(x, \Gamma)$ belongs to $L^p(\Omega)$.

For $p = 2$, we simply denote by $H^{-s}(\Omega)$ the dual
space of $H_0^s(\Omega)$. In the case of bounded opens, we recall
the following useful Poincaré-Friedrichs inequality (for
simplicity, we state it here in the Hilbert frame):


Theorem 2 Let $\Omega$ be bounded (at least in one
direction of the space). Then there exists a positive
constant $C_P(\Omega)$ such that

$$\|v\|_{L^2(\Omega)} \le C_P(\Omega) \|\nabla v\|_{L^2(\Omega)} \quad \text{for all } v \in H_0^1(\Omega) \qquad [4]$$
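To make the constant in [4] concrete, one can look at the one-dimensional case $\Omega = (0,1)$, where the best constant is known to be $1/\pi$ (it is the inverse square root of the smallest Dirichlet eigenvalue $\pi^2$ of $-d^2/dx^2$). The sketch below approximates it with a standard finite-difference matrix; the function name, the mesh size and the choice of discretization are assumptions made only for this illustration.

```python
import numpy as np

def poincare_constant_1d(n=400):
    """Approximate the best constant C_P in ||v|| <= C_P ||v'|| on H_0^1(0,1)."""
    h = 1.0 / n
    # Second-difference matrix for -d^2/dx^2 at the n-1 interior nodes.
    main = 2.0 * np.ones(n - 1)
    off = -1.0 * np.ones(n - 2)
    K = (np.diag(main) + np.diag(off, 1) + np.diag(off, -1)) / h**2
    lam_min = np.linalg.eigvalsh(K)[0]   # approximates pi^2
    return 1.0 / np.sqrt(lam_min)

c_p = poincare_constant_1d()
print(c_p, 1.0 / np.pi)
```

The two printed numbers should agree to several digits, since the smallest discrete eigenvalue converges to $\pi^2$ at rate $O(h^2)$.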


The Whole-Space Case: Riesz and 
Bessel Potentials 


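The Riesz potentials $I_\alpha$ naturally occur when one
defines the formal powers of the Laplace operator $\Delta$.
Namely, if $f \in \mathcal{S}(\mathbb{R}^N)$ and $\alpha > 0$, then

$$\mathcal{F}\big((-\Delta)^{\alpha/2} f\big)(\xi) = |\xi|^\alpha \mathcal{F} f(\xi)$$

This can be taken formally as a definition of the
Riesz potential $I_\alpha$ on $\mathcal{S}'(\mathbb{R}^N)$,

$$I_\alpha f = \mathcal{F}^{-1}\big(|\xi|^{-\alpha} \mathcal{F} f(\xi)\big)$$

for any $\alpha \in \mathbb{R}$. If $0 < \alpha < N$, then $I_\alpha f(x) = (I_\alpha * f)(x)$,
where $I_\alpha$ is the inverse Fourier transform of $|\xi|^{-\alpha}$,

$$I_\alpha(x) = C_\alpha |x|^{\alpha - N}, \qquad C_\alpha = \Gamma((N - \alpha)/2) \big(\pi^{N/2} 2^\alpha \Gamma(\alpha/2)\big)^{-1}$$

where $\Gamma$ is the Gamma function and $I_\alpha$ is the Riesz
kernel. The following formula is also true:

$$I_\alpha(x) = \frac{1}{(4\pi)^{\alpha/2} \Gamma(\alpha/2)} \int_0^\infty t^{(\alpha - N)/2} e^{-\pi |x|^2 / t} \frac{dt}{t}$$

Recall that every $f \in \mathcal{S}(\mathbb{R}^N)$ can be represented as
the Riesz potential $I_\alpha g$ of a suitable function $g$,
namely $g = (-\Delta)^{\alpha/2} f$; we get the representation
formula

$$f(x) = I_\alpha g(x) = C_\alpha \int_{\mathbb{R}^N} \frac{g(y)}{|x - y|^{N - \alpha}} \, dy$$

The standard density argument implies then an
appropriate statement for functions in $W^{k,p}(\mathbb{R}^N)$
with an integer $k$ and for the Bessel potential spaces
$H^{s,p}(\mathbb{R}^N)$ (see below for their definition). The
original Sobolev imbedding theorem comes from
the combination of this representation and the basic
continuity property of $I_\alpha$, $\alpha p < N$,

$$I_\alpha: L^p(\mathbb{R}^N) \to L^q(\mathbb{R}^N), \qquad \frac{1}{q} = \frac{1}{p} - \frac{\alpha}{N}$$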

To get an isomorphic representation of a Bessel
potential space (of a Sobolev space with positive
integer smoothness in particular) it is more convenient
to consider the Bessel potentials (of order $\alpha \in \mathbb{R}$),

$$G_\alpha f(x) = (G_\alpha * f)(x) = \mathcal{F}^{-1}\big((1 + |\xi|^2)^{-\alpha/2} \mathcal{F} f(\xi)\big)(x)$$

(with a slight abuse of the notations); the following
formula for the Bessel kernel $G_\alpha$ is well known:

$$G_\alpha(x) = c_\alpha^{-1} \int_0^\infty t^{(\alpha - N)/2} e^{-\pi |x|^2 / t - t/(4\pi)} \frac{dt}{t}$$

(cf. the analogous formula for $I_\alpha$), where
$c_\alpha = (4\pi)^{\alpha/2} \Gamma(\alpha/2)$. The kernels $G_\alpha$ can alternatively be
expressed with the help of Bessel or Macdonald functions.

Now we can define the Bessel potential spaces.

For $s \in \mathbb{R}$ and $1 < p < \infty$, let $H^{s,p}(\mathbb{R}^N)$ be the space
of all $f \in \mathcal{S}'(\mathbb{R}^N)$ with the finite norm

$$\|f\|_{H^{s,p}(\mathbb{R}^N)} = \Big( \int_{\mathbb{R}^N} \big| \mathcal{F}^{-1}\big((1 + |\xi|^2)^{s/2} \mathcal{F} f\big)(x) \big|^p \, dx \Big)^{1/p}$$

In other words, the spaces $H^{s,p}(\mathbb{R}^N)$ are isomorphic
copies of $L^p(\mathbb{R}^N)$.

For $k = 0, 1, 2, \ldots$, plainly $H^{k,2}(\mathbb{R}^N) = W^{k,2}(\mathbb{R}^N)$
by virtue of the Plancherel theorem. But it is true
also for integer $s$ and general $1 < p < \infty$ (see, e.g.,
Triebel (1978)).


Remark 3 A much more comprehensive theory of
general Besov and Lizorkin-Triebel spaces in $\mathbb{R}^N$ has
been established in the last decades, relying on the
Littlewood-Paley theory. Spaces on opens can be
defined as restrictions of functions in the corresponding
space on the whole $\mathbb{R}^N$, which allows one to derive their
properties from those valid for functions on $\mathbb{R}^N$. The
justification for this is provided by extension theorems. In particular,
there exists a universal extension operator for
Lipschitz opens, working for all the spaces mentioned up
to now. We refer to Triebel (1978, 2001).


Unbounded Opens and Weighted Spaces 


The study of elliptic problems in unbounded
opens is usually carried out with the use of suitable
weighted Sobolev spaces. The Poisson equation

$$-\Delta u = f \ \text{ in } \mathbb{R}^N, \quad N \ge 2 \qquad [5]$$

is the typical example; the Poincaré inequality [4] is
not true here and it is suitable to introduce Sobolev
spaces with weights.

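Let $m \in \mathbb{N}$, $1 < p < \infty$, $\alpha \in \mathbb{R}$, $k = m - N/p - \alpha$
if $N/p + \alpha \in \{1, \ldots, m\}$ and $k = -1$ elsewhere. For
an open $\Omega \subset \mathbb{R}^N$, we define

$$W_\alpha^{m,p}(\Omega) = \{ v \in \mathcal{D}'(\Omega);\ 0 \le |\lambda| \le k,\ \rho^{\alpha - m + |\lambda|} (\log \rho)^{-1} D^\lambda v \in L^p(\Omega);\ k + 1 \le |\lambda| \le m,\ \rho^{\alpha - m + |\lambda|} D^\lambda v \in L^p(\Omega) \}$$

where $\rho(x) = (1 + |x|^2)^{1/2}$. Note that $W_\alpha^{m,p}(\Omega)$ is a
reflexive Banach space for the norm $\|\cdot\|_{W_\alpha^{m,p}(\Omega)}$ defined by

$$\|u\|_{W_\alpha^{m,p}(\Omega)} = \Big( \sum_{0 \le |\lambda| \le k} \|\rho^{\alpha - m + |\lambda|} (\log \rho)^{-1} D^\lambda u\|^p_{L^p(\Omega)} + \sum_{k + 1 \le |\lambda| \le m} \|\rho^{\alpha - m + |\lambda|} D^\lambda u\|^p_{L^p(\Omega)} \Big)^{1/p}$$

We also introduce the following seminorm:

$$|u|_{W_\alpha^{m,p}(\Omega)} = \Big( \sum_{|\lambda| = m} \|\rho^\alpha D^\lambda u\|^p_{L^p(\Omega)} \Big)^{1/p}$$

and the space

$$\mathring{W}_\alpha^{m,p}(\Omega) = \{ v \in W_\alpha^{m,p}(\Omega);\ \gamma_0(v) = \cdots = \gamma_{m-1}(v) = 0 \}$$

If $\Omega$ is a Lipschitz domain, then $\mathring{W}_\alpha^{m,p}(\Omega)$ is
the closure of $\mathcal{D}(\Omega)$ in $W_\alpha^{m,p}(\Omega)$, while $\mathcal{D}(\overline{\Omega})$ is
dense in $W_\alpha^{m,p}(\Omega)$. We denote by $W_{-\alpha}^{-m,p'}(\Omega)$ the dual
of $\mathring{W}_\alpha^{m,p}(\Omega)$ ($p' = p/(p - 1)$). We note that these
spaces also contain polynomials,

$$\mathcal{P}_j \subset W_\alpha^{m,p}(\Omega)$$

with

$$j = \Big[ m - \frac{N}{p} - \alpha \Big] \ \text{ if } \frac{N}{p} + \alpha \notin \mathbb{Z}^-, \qquad j = m - 1 - \frac{N}{p} - \alpha \ \text{ elsewhere}$$

where $[s]$ is the integer part of $s$ and $\mathcal{P}_{[s]} = \{0\}$ if
$[s] < 0$. The fundamental property of functions
belonging to these spaces is that they satisfy the
Poincaré weighted inequality. An open $\Omega$ is an
exterior domain if it is the complement of the closure
of a bounded domain in $\mathbb{R}^N$.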


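Theorem 4 Suppose that $\Omega$ is an exterior domain
or $\Omega = \mathbb{R}^N_+$ or $\Omega = \mathbb{R}^N$. Then

(i) the seminorm $|\cdot|_{W_\alpha^{m,p}(\Omega)}$ is a norm on $W_\alpha^{m,p}(\Omega)/\mathcal{P}_{j'}$,
equivalent to the quotient norm, with
$j' = \min(m - 1, j)$;

(ii) the seminorm $|\cdot|_{W_\alpha^{m,p}(\Omega)}$ is equivalent to the full
norm on $\mathring{W}_\alpha^{m,p}(\Omega)$.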


Variational Approach 


Let us first describe the method on the model problem
[1]-[2], supposing $f \in L^2(\Omega)$ and $\Omega$ bounded. We first
suppose that this problem admits a sufficiently smooth
solution $u$. Let $v$ be an arbitrary (smooth) function;
we multiply eqn [1] by $v(x)$ and integrate with respect
to $x$ over $\Omega$; this gives

$$-\int_\Omega (\Delta u \, v)(x) \, dx = \int_\Omega (f v)(x) \, dx$$

Using the following Green's formula ($d\sigma(x)$ denotes
the measure on $\Gamma = \partial\Omega$ and $\partial u(x)/\partial n = \nabla u(x) \cdot n(x)$,
where $n(x)$ is the unit normal at the point $x$ of $\Gamma$
oriented towards the exterior of $\Omega$):

$$\int_\Omega (\Delta u \, v)(x) \, dx = -\int_\Omega (\nabla u \cdot \nabla v)(x) \, dx + \int_\Gamma \Big( \frac{\partial u}{\partial n} v \Big)(\sigma) \, d\sigma \qquad [6]$$

we get, for $v$ with $v|_\Gamma = 0$: $\mathcal{A}(u, v) = L(v)$, where we
have set

$$\mathcal{A}(u, v) = \int_\Omega \nabla u(x) \cdot \nabla v(x) \, dx, \qquad L(v) = \int_\Omega f(x) v(x) \, dx \qquad [7]$$

The idea is to study in fact this new problem
(showing first its equivalence with the boundary-value
problem), noting that it makes sense for far
less regular functions $u$, $v$ (and also $f$), in fact
$u, v \in H_0^1(\Omega)$ (and $f \in H^{-1}(\Omega)$).


The Lax-Milgram Theorem 
The general form of a variational problem is 


to find $u \in V$ such that
$\mathcal{A}(u, v) = L(v)$ for all $v \in V$ [8]


where $V$ is a Hilbert space, $\mathcal{A}$ a bilinear continuous
form defined on $V \times V$ and $L$ a linear continuous form
defined on $V$. We say, moreover, that $\mathcal{A}$ is V-elliptic if
there exists a positive constant $\alpha$ such that

$$\mathcal{A}(u, u) \ge \alpha \|u\|_V^2 \quad \text{for all } u \in V \qquad [9]$$

The following theorem is due to Lax and Milgram.


Theorem 5 Let $V$ be a Hilbert space. We suppose
that $\mathcal{A}$ is a bilinear continuous form on $V \times V$ which
is V-elliptic and that $L$ is a linear continuous form
on $V$. Then the variational problem [8] has a unique
solution $u$ in $V$. Moreover, if $\mathcal{A}$ is symmetric, $u$ is
characterized as the minimizer on $V$ of the
quadratic functional $E$ defined by

$$\text{for all } v \in V, \quad E(v) = \tfrac{1}{2} \mathcal{A}(v, v) - L(v) \qquad [10]$$
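In finite dimension the statement of the Lax-Milgram theorem reduces to linear algebra, which gives a quick way to see its three ingredients (unique solvability, minimization of $E$, and stability with constant $1/\alpha$) at work. The sketch below takes $V = \mathbb{R}^5$ with a symmetric positive-definite matrix standing in for the bilinear form; the matrix, the right-hand side and the random seed are arbitrary choices made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
B = rng.standard_normal((n, n))
A = B.T @ B + np.eye(n)          # symmetric, with A(u,u) >= |u|^2
b = rng.standard_normal(n)       # represents the linear form L(v) = b . v

u = np.linalg.solve(A, b)        # unique solution of A(u, v) = L(v) for all v

def energy(v):
    """Quadratic functional E(v) = 1/2 A(v, v) - L(v)."""
    return 0.5 * v @ A @ v - b @ v

alpha = np.linalg.eigvalsh(A)[0]  # best ellipticity constant of the form
perturbed = [energy(u + 0.1 * rng.standard_normal(n)) for _ in range(100)]

print(min(perturbed) > energy(u))                          # u minimizes E
print(np.linalg.norm(u) <= np.linalg.norm(b) / alpha)      # stability bound
```

Here the Euclidean norm plays the role of both the norm of $V$ and of its dual, so the second check is the finite-dimensional analogue of the bound $\|u\|_V \le \alpha^{-1} \|L\|_{V'}$.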


Remark 6 


(i) We have the following "energy estimate":
$\|u\|_V \le \frac{1}{\alpha} \|L\|_{V'}$, where $V'$ is the dual space to $V$. In
the particular case of our model problem, this
inequality shows the continuity of the solution
$u \in H_0^1(\Omega)$ with respect to the data $f \in L^2(\Omega)$ (that can
be weakened by choosing $f \in H^{-1}(\Omega)$).

(ii) Theorem 5 can be extended to sesquilinear
continuous forms $\mathcal{A}$ defined on $V \times V$; such a form
is called V-elliptic if there exists a positive constant
$\alpha$ such that

$$\mathrm{Re}\, \mathcal{A}(u, u) \ge \alpha \|u\|_V^2 \quad \text{for all } u \in V \qquad [11]$$

(iii) Denoting by $A$ the linear operator defined on
the space $V$ by $\mathcal{A}(u, v) = \langle Au, v \rangle_{V', V}$ for all $v \in V$, the
Lax-Milgram theorem shows that $A$ is an isomorphism
from $V$ onto its dual space $V'$, and the problem [8]
is equivalent to solving the equation $Au = L$.

(iv) Let us make some remarks concerning the
numerical aspects. First, this variational formulation is
the starting point of the well-known finite element
method: the idea is to compute a solution of an
approximate variational problem stated on a finite-dimensional
subspace of $V$ (leading to the resolution of a linear
system), with a precise control of the error with respect to the exact
solution $u$. Second, the equivalence with a minimization
problem allows the use of other numerical algorithms.
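The finite element idea mentioned in the last remark can be sketched in one space dimension for the model problem $-u'' = f$ on $(0,1)$ with $u(0) = u(1) = 0$. The code below is a minimal illustration under stated assumptions, not a reference implementation: it uses piecewise-linear (P1) hat functions on a uniform mesh, approximates the load integrals by $h f(x_i)$, and the function name `fem_poisson_1d` is hypothetical.

```python
import numpy as np

def fem_poisson_1d(f, n):
    """P1 finite elements for -u'' = f on (0,1), u(0) = u(1) = 0, uniform mesh."""
    h = 1.0 / n
    x = np.linspace(0.0, 1.0, n + 1)
    # Stiffness matrix: entries int phi_j' phi_i' dx for the interior hat functions.
    K = (2.0 * np.eye(n - 1) - np.eye(n - 1, k=1) - np.eye(n - 1, k=-1)) / h
    # Load vector: int f phi_i dx, approximated here by h * f(x_i).
    F = h * f(x[1:-1])
    u = np.zeros(n + 1)
    u[1:-1] = np.linalg.solve(K, F)   # the linear system of the discrete problem
    return x, u

# Constant load f = 1: the exact solution is u(x) = x (1 - x) / 2.
x, u = fem_poisson_1d(lambda t: np.ones_like(t), 16)
exact = 0.5 * x * (1.0 - x)
print(np.max(np.abs(u - exact)))
```

The matrix `K` is the discrete counterpart of the bilinear form in [7] restricted to the subspace spanned by the hat functions, and its positive definiteness is the discrete form of V-ellipticity.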


Let us now present some classical examples of
second-order elliptic problems that can be solved
with the help of the variational theory.


The Dirichlet Problem for the Poisson Equation 


We consider the problem on a bounded Lipschitz
open $\Omega \subset \mathbb{R}^N$,

$$-\Delta u = f \ \text{ in } \Omega, \qquad u = u_0 \ \text{ on } \Gamma = \partial\Omega \qquad [12]$$

with $u_0 \in H^{1/2}(\Gamma)$, so that there exists $U_0 \in H^1(\Omega)$
satisfying $\gamma_0(U_0) = u_0$. The variational formulation
of problem [12] is


to find $u \in U_0 + H_0^1(\Omega)$ such that
for all $v \in H_0^1(\Omega)$, $\mathcal{A}(u, v) = L(v)$ [13]


with $\mathcal{A}$ given by [7] and a more general $L$ with
$f \in H^{-1}(\Omega)$, defined by

$$L(v) = \langle f, v \rangle_{H^{-1}(\Omega), H_0^1(\Omega)} \qquad [14]$$

The existence and uniqueness of a solution of [13]
follows from Theorem 5 (and the Poincaré inequality [4]).
Conversely, thanks to the density of $\mathcal{D}(\Omega)$ in $H_0^1(\Omega)$, we
can show that $u$ satisfies [12]. More precisely, we get:


Theorem 7 Let us suppose $f \in H^{-1}(\Omega)$ and
$u_0 \in H^{1/2}(\Gamma)$; let $U_0 \in H^1(\Omega)$ satisfy $\gamma_0(U_0) = u_0$. Then the
boundary-value problem [12] has a unique solution $u$
such that $u - U_0 \in H_0^1(\Omega)$. This is also the unique
solution of the variational problem [13]. Moreover,
there exists a positive constant $C = C(\Omega)$ such that

$$\|u\|_{H^1(\Omega)} \le C \big( \|f\|_{H^{-1}(\Omega)} + \|u_0\|_{H^{1/2}(\Gamma)} \big) \qquad [15]$$

which shows that $u$ depends continuously on the
data $f$ and $u_0$.


Moreover, using techniques of Nirenberg's differential
quotients, we have the following regularity
result (see, e.g., Grisvard (1980)):


Theorem 8 Let us suppose that $\Omega$ is a bounded
open subset of $\mathbb{R}^N$ with a boundary of class $C^{1,1}$ and
let $f \in L^2(\Omega)$, $u_0 \in H^{3/2}(\Gamma)$. Then $u \in H^2(\Omega)$ and
each equation in [12] is satisfied almost everywhere
(on $\Omega$ for the first one and on $\Gamma$ for the boundary
condition). Moreover, there exists a positive constant
$C = C(\Omega)$ such that

$$\|u\|_{H^2(\Omega)} \le C \big[ \|f\|_{L^2(\Omega)} + \|u_0\|_{H^{3/2}(\Gamma)} \big] \qquad [16]$$

By induction, if the data are more regular, that is,
$f \in H^k(\Omega)$ and $u_0 \in H^{k+3/2}(\Gamma)$ (with $k \in \mathbb{N}$), and if $\Gamma$
is of class $C^{k+1,1}$, we get $u \in H^{k+2}(\Omega)$.


Remark 9 Let us point out the importance of the
geometry of the open. For example, if $\Omega$ is a bounded
plane polygon, one can find $u \in H_0^1(\Omega)$ with
$\Delta u \in C^\infty(\overline{\Omega})$, such that $u \notin H^{1 + \pi/\omega}(\Omega)$, where $\omega$ is the
biggest value of the interior angles of the polygon. In
particular, if the polygon is not convex, the solution
of the Dirichlet problem [12] is in general not in $H^2(\Omega)$.


The Neumann Problem for the Poisson Equation 


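We consider the problem ($n$ is the unit outer normal
on $\Gamma$)

$$-\Delta u = f \ \text{ in } \Omega, \qquad \frac{\partial u}{\partial n} = h \ \text{ on } \Gamma \qquad [17]$$

Setting $E(\Delta) = \{v \in H^1(\Omega);\ \Delta v \in L^2(\Omega)\}$, the space
$\mathcal{D}(\overline{\Omega})$ is a dense subspace, and we have the following
Green formula for all $u \in E(\Delta)$ and $v \in H^1(\Omega)$:

$$\int_\Omega \Delta u(x) v(x) \, dx = -\int_\Omega \nabla u(x) \cdot \nabla v(x) \, dx + \Big\langle \frac{\partial u}{\partial n}, \gamma_0 v \Big\rangle_{H^{-1/2}(\Gamma), H^{1/2}(\Gamma)}$$

If $u \in H^1(\Omega)$ satisfies [17] with $f \in L^2(\Omega)$ and
$h \in H^{-1/2}(\Gamma)$, then for any function $v \in H^1(\Omega)$, we have,
by virtue of the above Green formula,

$$\mathcal{A}(u, v) = L(v), \qquad L(v) = \int_\Omega f v \, dx + \langle h, \gamma_0 v \rangle_{H^{-1/2}(\Gamma), H^{1/2}(\Gamma)}$$

But here the form $\mathcal{A}$ is not $H^1(\Omega)$-elliptic; in fact,
one can check that, if problem [17] has a solution,
then we have necessarily (take $v = 1$ above)

$$\int_\Omega f \, dx + \langle h, 1 \rangle_{H^{-1/2}(\Gamma), H^{1/2}(\Gamma)} = 0 \qquad [18]$$

Moreover, we note that if $u$ is a solution, then
$u + C$, where $C$ is an arbitrary constant, is also a
solution. So the variational problem is not well
posed on $H^1(\Omega)$. It can, however, be solved in the
quotient space $H^1(\Omega)/\mathbb{R}$, which is a Hilbert space
for the quotient norm

$$\|\dot{v}\|_{H^1(\Omega)/\mathbb{R}} = \inf_{k \in \mathbb{R}} \|v + k\|_{H^1(\Omega)} \qquad [19]$$

but also for the seminorm $\dot{v} \mapsto |\dot{v}| = \sqrt{\mathcal{A}(v, v)}$,
which is an equivalent norm on this quotient space
(see Nečas (1967)).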
Then, supposing that the data $f$ and $h$ satisfy the
"compatibility condition" [18], we can apply the
Lax-Milgram theorem to the variational problem


to find $\dot{u} \in V$ such that
$\mathcal{A}(\dot{u}, \dot{v}) = L(\dot{v})$ for all $\dot{v} \in V$ [20]


with $V = H^1(\Omega)/\mathbb{R}$. We get the following result (see,
e.g., Nečas (1967)):


Theorem 10 Let us suppose that $\Omega$ is connected
and that the data $f \in L^2(\Omega)$ and $h \in H^{-1/2}(\Gamma)$ satisfy
[18]. Then the variational problem [20] has a unique
solution $\dot{u}$ in the space $H^1(\Omega)/\mathbb{R}$ and this solution is
continuous with respect to the data, that is, there
exists a positive constant $C = C(\Omega)$ such that

$$|u|_{H^1(\Omega)} \le C \big( \|f\|_{L^2(\Omega)} + \|h\|_{H^{-1/2}(\Gamma)} \big) \quad \text{for all } u \in \dot{u}$$

Moreover, if $\Gamma$ is of class $C^{1,1}$ and if the data
satisfy $f \in L^2(\Omega)$, $h \in H^{1/2}(\Gamma)$, then every $u \in \dot{u}$ is
such that $u \in H^2(\Omega)$ and it satisfies each equation in
[17] almost everywhere.
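The role of the compatibility condition [18] and of the quotient by constants can be observed on a discrete one-dimensional analogue: for $-u'' = f$ on $(0,1)$ with prescribed normal derivatives at both ends, the P1 stiffmatrix annihilates constant vectors, so the linear system is solvable only when the entries of the load vector sum to zero. The sketch below is an illustration under these assumptions; the function names and the particular data ($f = 1$, fluxes $-1/2$) are chosen only for the example.

```python
import numpy as np

def neumann_system_1d(n, f_value, flux_left, flux_right):
    """P1 stiffness matrix and load vector for -u'' = f_value on (0,1) with
    prescribed normal derivatives flux_left (at 0) and flux_right (at 1)."""
    dx = 1.0 / n
    K = (2.0 * np.eye(n + 1) - np.eye(n + 1, k=1) - np.eye(n + 1, k=-1)) / dx
    K[0, 0] = K[-1, -1] = 1.0 / dx       # boundary nodes belong to one element only
    F = f_value * dx * np.ones(n + 1)    # int f phi_i dx for interior hat functions
    F[0] = F[-1] = f_value * dx / 2.0    # half hat functions at the two ends
    F[0] += flux_left                    # boundary term <h, gamma_0 v>
    F[-1] += flux_right
    return K, F

def residual(K, F):
    """Norm of K u - F for the least-squares solution u."""
    u, *_ = np.linalg.lstsq(K, F, rcond=None)
    return np.linalg.norm(K @ u - F)

n = 8
K, F_ok = neumann_system_1d(n, 1.0, -0.5, -0.5)   # int f + fluxes = 0: compatible
_, F_bad = neumann_system_1d(n, 1.0, 0.0, 0.0)    # int f + fluxes = 1: incompatible

print(np.allclose(K @ np.ones(n + 1), 0.0))       # constants lie in the kernel
print(residual(K, F_ok), residual(K, F_bad))
```

With compatible data the residual is at rounding level, and the least-squares solver picks one representative of the class of solutions differing by a constant; with incompatible data no solution exists and the residual stays of order one.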


Problem with Mixed Boundary Conditions 


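Here we consider more general boundary conditions:
the Dirichlet conditions on a closed subset $\Gamma_d$
of $\Gamma = \partial\Omega$, and the Neumann, or more generally the
"Robin", conditions on the other part $\Gamma_n = \Gamma - \Gamma_d$.


We seek $u$ such that ($f \in L^2(\Omega)$, $h \in L^2(\Gamma_n)$,
$a \in L^\infty(\Gamma_n)$)

$$-\Delta u = f \ \text{ in } \Omega, \qquad u = 0 \ \text{ on } \Gamma_d, \qquad au + \frac{\partial u}{\partial n} = h \ \text{ on } \Gamma_n \qquad [21]$$

Let $V = \{v \in H^1(\Omega);\ \gamma_0 v = 0 \text{ on } \Gamma_d\}$. Then [8] is the
variational formulation of this problem with

$$\mathcal{A}(u, v) = \int_\Omega \nabla u(x) \cdot \nabla v(x) \, dx + \int_{\Gamma_n} (a \gamma_0 u \, \gamma_0 v)(\sigma) \, d\sigma, \qquad L(v) = \int_\Omega f(x) v(x) \, dx + \int_{\Gamma_n} (h \gamma_0 v)(\sigma) \, d\sigma$$

Then, for $a \ge 0$, we get a unique
solution $u \in V$ for this variational problem by virtue
of the Lax-Milgram theorem. Moreover, if $u \in H^2(\Omega)$,
then $u$ is the unique solution in $H^2(\Omega) \cap V$
of the problem [21].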


The Newton Problem for More General Operators 


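Let $\Omega$ be a bounded open subset of $\mathbb{R}^N$. We now
consider more general second-order operators of the
form $v \mapsto -\nabla \cdot (M \nabla v) + b \cdot \nabla v + cv$, where
$b \in [W^{1,\infty}(\Omega)]^N$, $c \in L^\infty(\Omega)$, $M$ is an $N \times N$ square
matrix with entries $M_{ij}$, and $\nabla \cdot (M \nabla v)$ stands for

$$\sum_{i,j=1}^N \frac{\partial}{\partial x_i} \Big( M_{ij} \frac{\partial v}{\partial x_j} \Big)$$

We also assume that there is a positive constant $\alpha_M$
such that

$$\sum_{i,j=1}^N M_{ij}(x) \xi_i \xi_j \ge \alpha_M \sum_{i=1}^N \xi_i^2 \quad \text{for a.e. } x \in \Omega \text{ and } \xi = (\xi_1, \ldots, \xi_N) \in \mathbb{R}^N$$

For given data $f \in L^2(\Omega)$, $h \in L^2(\Gamma)$, we look for a
solution $u$ of the problem

$$-\nabla \cdot (M \nabla u) + b \cdot \nabla u + cu = f \ \text{ in } \Omega, \qquad au + n \cdot (M \nabla u) = h \ \text{ on } \Gamma \qquad [22]$$

We assume that $a \in L^\infty(\Gamma)$. The variational formulation
of this problem is still [8], with $V = H^1(\Omega)$
and

$$\mathcal{A}(u, v) = \int_\Omega M \nabla u \cdot \nabla v \, dx + \int_\Omega [b \cdot \nabla u + cu] v \, dx + \int_\Gamma a \gamma_0 u \, \gamma_0 v \, d\sigma \qquad [23]$$

$$L(v) = \int_\Omega f v \, dx + \int_\Gamma (h \gamma_0 v)(\sigma) \, d\sigma \qquad [24]$$

If the conditions

$$c - \tfrac{1}{2} \nabla \cdot b \ge C_0 \ge 0 \ \text{ a.e. on } \Omega, \qquad a + \tfrac{1}{2} b \cdot n \ge C_1 \ge 0 \ \text{ a.e. on } \Gamma$$

are fulfilled, with $(C_0, C_1) \ne (0, 0)$, then the bilinear
form $\mathcal{A}$ is V-elliptic and the Lax-Milgram theorem
applies.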


A Biharmonic Problem 


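We consider the Dirichlet problem for the operator
of fourth order ($c \in L^\infty(\Omega)$):

$$\Delta^2 u + cu = f \ \text{ in } \Omega \qquad [25]$$

$$u = u_0, \qquad \frac{\partial u}{\partial n} = h \ \text{ on } \Gamma \qquad [26]$$

Theorem 11 Let us suppose that $\Omega$ has a boundary
of class $C^{1,1}$ and that the data satisfy $f \in H^{-2}(\Omega)$,
$u_0 \in H^{3/2}(\Gamma)$, $h \in H^{1/2}(\Gamma)$. Let $U_0 \in H^2(\Omega)$ be such that
$\gamma_0(U_0) = u_0$, $\gamma_1(U_0) = h$. Then, if $c \ge 0$ a.e. in $\Omega$, the
boundary value problem [25]-[26] has a unique
solution $u$ such that $u - U_0 \in H_0^2(\Omega)$, and $u$ is also the
unique solution of the variational problem


to find $u \in U_0 + H_0^2(\Omega)$ such that
$\mathcal{A}(u, v) = l(v)$ for all $v \in H_0^2(\Omega)$ [27]


where $l(v) = \langle f, v \rangle_{H^{-2}(\Omega), H_0^2(\Omega)}$ and

$$\mathcal{A}(u, v) = \int_\Omega \Delta u(x) \Delta v(x) \, dx + \int_\Omega (cuv)(x) \, dx \qquad [28]$$

Moreover, there exists a positive constant $C = C(\Omega)$
such that

$$\|u\|_{H^2(\Omega)} \le C \big[ \|f\|_{H^{-2}(\Omega)} + \|u_0\|_{H^{3/2}(\Gamma)} + \|h\|_{H^{1/2}(\Gamma)} \big] \qquad [29]$$

which shows that $u$ depends continuously upon the
data $f$, $u_0$, and $h$.


Remark 12 The choice of the Hilbert space $V$ is of crucial
importance for the V-ellipticity. Let us
consider, for example, the problem [25] with

$$\Delta u = 0 \ \text{ on } \Gamma, \qquad \frac{\partial \Delta u}{\partial n} = 0 \ \text{ on } \Gamma \qquad [30]$$

The associated bilinear form is not V-elliptic
for $V = H^2(\Omega)$ but it is V-elliptic for
$V = \{v \in L^2(\Omega);\ \Delta v \in L^2(\Omega)\}$.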


General Elliptic Problems 


Here $\Omega$ will be a bounded and sufficiently regular
open subset of $\mathbb{R}^N$. Let us consider a general linear
differential operator of the form

$$A(x, D) u = \sum_{|\mu| \le l} a_\mu(x) D^\mu u, \quad a_\mu(x) \in \mathbb{C} \qquad [31]$$

Setting $A_0(x, \xi) = \sum_{|\mu| = l} a_\mu(x) \xi^\mu$, we say that the
operator $A$ is elliptic at a point $x$ if $A_0(x, \xi) \ne 0$ for
all $\xi \in \mathbb{R}^N - \{0\}$. One can show that, if $N \ge 3$, $l$ is
even, that is, $l = 2m$; the same result holds for $N = 2$
if the coefficients $a_\mu$ are real. Moreover, for $N \ge 3$,
every elliptic operator is properly elliptic, in the
following sense: for any independent vectors $\xi$, $\xi'$ in
$\mathbb{R}^N$, the polynomial $\tau \mapsto A_0(x, \xi + \tau \xi')$ has $m$ roots
with positive imaginary part.

The aim here is to study boundary-value problems 
of the following type: 


$$Au = f \quad \text{in } \Omega \qquad [32]$$

$$B_j u = g_j \quad \text{on } \Gamma, \quad j = 0, \ldots, m-1 \qquad [33]$$


where $A$ is properly elliptic on $\Omega$, with sufficiently regular coefficients, and the operators $B_j$ are boundary operators, of order $m_j \le 2m - 1$, that must satisfy some compatibility conditions with respect to the operator $A$ (see Renardy and Rogers (1992) for details; these conditions were introduced by Agmon, Douglis, and Nirenberg). For example, $A = (-1)^m\Delta^m$ and $B_j = \partial^j/\partial n^j$ is a convenient choice.

In order to show that problem [32]-[33] has a solution $u \in H^{2m+r}(\Omega)$ ($r \in \mathbb{N}$), the idea is to show that the operator $P$ defined by $u \mapsto P(u) = (Au, B_0u, \ldots, B_{m-1}u)$ is an index operator from $H^{2m+r}(\Omega)$ into $G = H^r(\Omega) \times \prod_{j=0}^{m-1} H^{2m+r-m_j-1/2}(\Gamma)$ and to express the compatibility conditions through the adjoint problem.

We recall that a linear continuous operator P is 
an index operator if 


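(1) $\dim \operatorname{Ker} P < \infty$, and $\operatorname{Im} P$ closed;
(2) $\operatorname{codim} \operatorname{Im} P < \infty$.


Then the index $\chi(P)$ is given by $\chi(P) = \dim \operatorname{Ker} P - \operatorname{codim} \operatorname{Im} P$. We recall the following theorem of Peetre: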


Theorem 13 Let $E$, $F$, and $G$ be three reflexive Banach spaces such that $E$ is compactly embedded in $F$, and $P$ a linear continuous operator from $E$ to $G$. Then condition (1) is equivalent to: "there exists $C > 0$, such that for all $u \in E$, we have $\|u\|_E \le C(\|Pu\|_G + \|u\|_F)$."


Applying this theorem to our problem [32]-[33], 
condition (1) results from a priori estimates of the 
following type: 


$$\|u\|_{H^{2m+r}(\Omega)} \le C\left(\|Pu\|_G + \|u\|_{H^{2m+r-1}(\Omega)}\right)$$


and condition (2) by similar a priori estimates for 
the dual problem. 


Second-Order Elliptic Problems 


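We consider a second-order differential operator of the "divergence form"

$$Au = -\sum_{i,j=1}^N \left(a^{ij}(x)u_{x_i}\right)_{x_j} + \sum_{i=1}^N b^i(x)u_{x_i} + c(x)u \qquad [34]$$

with given coefficient functions $a^{ij}, b^i, c$ ($i,j = 1, \ldots, N$), and where we have used the notation $u_{x_i} = \partial u/\partial x_i$. Such operators are said to be uniformly strongly elliptic in $\Omega$ if there exists $\alpha > 0$ such that

$$\sum_{i,j=1}^N a^{ij}(x)\xi_i\xi_j \ge \alpha|\xi|^2 \quad \text{for all } x \in \Omega,\ \xi \in \mathbb{R}^N$$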


Remark 14 There exist elliptic problems for which the associated variational problem does not necessarily satisfy the ellipticity condition. Let us consider the following example, due to Seeley: let $\Omega = \{(r,\theta) \in (\pi, 2\pi) \times [0, 2\pi]\}$ and

$$A = -\left(e^{i\theta}\frac{\partial}{\partial\theta}\right)^2 - e^{2i\theta}\left(1 + \frac{\partial^2}{\partial r^2}\right)$$

One can check that, for all $\lambda \in \mathbb{C}$, the problem $Au + \lambda u = 0$ in $\Omega$ and $u = 0$ on $\Gamma$ admits nonzero solutions $u$, which are given by (with $\mu$ such that $\mu^2 = \lambda$)
$u = \sin r\cos(\mu e^{-i\theta})$ and $u = \sin r\sin(\mu e^{-i\theta})$ for $\lambda \ne 0$;
$u = \sin r$ and $u = \sin r\,e^{-i\theta}$ for $\lambda = 0$.
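
The claim in Remark 14 can be checked numerically. The sketch below is a minimal illustration under one assumed reading of the operator, namely $A = -(e^{i\theta}\partial_\theta)^2 - e^{2i\theta}(1 + \partial_r^2)$. The function names (`seeley_u`, `apply_operator`, `residual`), the finite-difference step, and the sample points are hypothetical choices made for this illustration. It evaluates $Au + \lambda u$ for $u = \sin r\cos(\mu e^{-i\theta})$ with $\mu^2 = \lambda$, and the residual should sit at the level of the discretization error.

```python
import cmath

def seeley_u(r, theta, mu):
    """Candidate solution u = sin(r) * cos(mu * exp(-i*theta))."""
    return cmath.sin(r) * cmath.cos(mu * cmath.exp(-1j * theta))

def apply_operator(func, r, theta, h=1e-3):
    """Apply the assumed operator A to func at (r, theta) by central differences.

    Uses (e^{i theta} d/dtheta)^2 f = e^{2 i theta} (f_thetatheta + i f_theta).
    """
    e2 = cmath.exp(2j * theta)
    f0 = func(r, theta)
    f_t = (func(r, theta + h) - func(r, theta - h)) / (2 * h)
    f_tt = (func(r, theta + h) - 2 * f0 + func(r, theta - h)) / h**2
    f_rr = (func(r + h, theta) - 2 * f0 + func(r - h, theta)) / h**2
    return -e2 * (f_tt + 1j * f_t) - e2 * (f0 + f_rr)

def residual(lam, r, theta):
    """Return A u + lam * u for the candidate solution with mu^2 = lam."""
    mu = cmath.sqrt(lam)
    u = lambda rr, tt: seeley_u(rr, tt, mu)
    return apply_operator(u, r, theta) + lam * u(r, theta)

if __name__ == "__main__":
    # Arbitrary complex values of lambda: each one behaves like an eigenvalue.
    for lam in (2 + 1j, -3.0, 0.5j):
        print(lam, abs(residual(lam, 4.0, 0.7)))
```

Because every complex $\lambda$ yields a nonzero solution of the homogeneous Dirichlet problem, the resolvent set is empty, which is what rules out a coercive variational formulation for this example.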


Most of the results concerning existence, uniqueness, and regularity for second-order elliptic problems can be established thanks to a maximum principle.
There exist different types of maximum principles, 
which we now present. 


Maximum Principle 


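Theorem 15 (Weak maximum principle). Let $A$ be a uniformly strongly elliptic operator of the form [34] in a bounded open $\Omega \subset \mathbb{R}^N$, with $a^{ij}, b^i, c \in L^\infty(\Omega)$ and $c \ge 0$. Let $u \in C^2(\Omega) \cap C(\overline\Omega)$ and

$$Au \ge 0 \ [\text{resp. } Au \le 0] \quad \text{in } \Omega$$

Then

$$\inf_\Omega u \ge \inf_\Gamma(-u^-) \quad \left[\text{resp. } \sup_\Omega u \le \sup_\Gamma u^+\right]$$

where $u^+ = \max(u,0)$ and $u^- = -\min(u,0)$. If $c = 0$ in $\Omega$, one can replace $-u^-$ [resp. $u^+$] by $u$.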


Theorem 16 (Strong maximum principle). Under the assumptions of the above theorem, if $u \in C^2(\Omega) \cap C(\overline\Omega)$ is not a constant function and satisfies $Au \ge 0$ [resp. $Au \le 0$], then $\inf_\Omega u < u(x)$ [resp. $\sup_\Omega u > u(x)$], for all $x \in \Omega$.


Remark 17 These two maximum principles can be adapted to elliptic operators in nondivergence form, that is,

$$Au = -\sum_{i,j=1}^N a^{ij}(x)u_{x_ix_j} + \sum_{i=1}^N b^i(x)u_{x_i} + c(x)u \qquad [35]$$


Fredholm Alternative 


We now present some existence results which are 
based on the Fredholm alternative rather than on the 
variational method. 

Let us consider two Hilbert spaces $V$ and $H$, where $V$ is a dense subspace of $H$ and the imbedding of $V$ into $H$ is compact. Denoting by $V'$ the dual space of $V$, and identifying $H$ with its dual space, we have the following imbeddings: $V \hookrightarrow H \hookrightarrow V'$. Let $\mathcal{A}$ be a sesquilinear form on $V \times V$, $V$-coercive with respect to $H$, that is, there exist $\lambda_0 \in \mathbb{R}$ and $\alpha > 0$ such that

$$\operatorname{Re}(\mathcal{A}(v,v)) + \lambda_0\|v\|_H^2 \ge \alpha\|v\|_V^2 \quad \text{for all } v \in V$$

Denoting by $A$ the operator associated with the form $\mathcal{A}$ (see Remark 6(ii)), the equation $Au = f$ is equivalent to $u - \lambda_0 Tu = g$, with $T = (A + \lambda_0\,\mathrm{Id})^{-1}$ and $g = Tf$. Note that $T$ is an isomorphism from $H$ onto $D(A) = \{u \in H;\ Au \in H\}$.

The operator $T : H \to H$ is compact and, thanks to the Fredholm alternative, there are two situations:


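1. either $\operatorname{Ker} A = \{0\}$ and $A$ is an isomorphism from $D(A)$ onto $H$;

2. or $\operatorname{Ker} A \ne \{0\}$; then $\operatorname{Ker} A$ is of finite dimension, and the problem $Au = f$ with $f \in H$ admits a solution if and only if $f \in \operatorname{Im} A = [\operatorname{Ker}(A^*)]^\perp$.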


We now give another example in a non-Hilbertian frame. Let us consider the problem (Grisvard 1980): $Au = f$ in $\Omega$ and $Bu = g$ on $\Gamma$, where $\Gamma$ is of class $C^{1,1}$. $A$, which is defined by [34], is uniformly strongly elliptic with $a^{ij} = a^{ji} \in C^{0,1}(\overline\Omega)$, $b^i, c \in L^\infty(\Omega)$, and $Bu = \gamma_0(u)$ or $Bu = \gamma_1(u)$. One can show that the operator $u \mapsto (Au, Bu)$ is a Fredholm operator of index zero from $W^{2,p}(\Omega)$ into $L^p(\Omega) \times W^{2-d-1/p,p}(\Gamma)$ (with $d = 0$ if $Bu = \gamma_0(u)$ and $d = 1$ if $Bu = \gamma_1(u)$).


Regularity 


Assume that $\Omega$ is a bounded open set. Suppose that $u \in H_0^1(\Omega)$ is a weak solution of the equation

$$Au = f \ \text{in } \Omega, \qquad u = 0 \ \text{on } \Gamma \qquad [36]$$

where $A$ has the divergence form [34]. We now address the question whether $u$ is in fact smooth: this is the regularity problem for weak solutions.


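Theorem 18 ($H^2$-regularity). Let $\Omega$ be open, of class $C^{1,1}$, $a^{ij} \in C^1(\overline\Omega)$, $b^i, c \in L^\infty(\Omega)$, $f \in L^2(\Omega)$. Suppose, furthermore, that $u \in H^1(\Omega)$ is a weak solution of [36]. Then $u \in H^2(\Omega)$ and we have the estimate

$$\|u\|_{H^2(\Omega)} \le C\left(\|f\|_{L^2(\Omega)} + \|u\|_{L^2(\Omega)}\right)$$

where the constant $C$ depends only on $\Omega$ and on the coefficients of $A$.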


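Theorem 19 (Higher regularity). Let $m$ be a nonnegative integer, $\Omega$ be open, of class $C^{m+1,1}$ and assume that $a^{ij} \in C^{m+1}(\overline\Omega)$, $b^i, c \in C^m(\overline\Omega)$, $f \in H^m(\Omega)$. Suppose, furthermore, that $u \in H^1(\Omega)$ is a weak solution of [36]. Then $u \in H^{m+2}(\Omega)$ and

$$\|u\|_{H^{m+2}(\Omega)} \le C\left(\|f\|_{H^m(\Omega)} + \|u\|_{L^2(\Omega)}\right)$$

where the constant $C$ depends only on $\Omega$ and on the coefficients of $A$. In particular, if $m > N/2$, then $u \in C^2(\overline\Omega)$. Moreover, if $\Omega$ is of $C^\infty$ class and $f \in C^\infty(\overline\Omega)$, $a^{ij} \in C^\infty(\overline\Omega)$, $b^i, c \in C^\infty(\overline\Omega)$, then $u \in C^\infty(\overline\Omega)$.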


Remark 20 


(i) If $u \in H_0^1(\Omega)$ is the unique solution of [36], one can omit the $L^2$-norm of $u$ in the right-hand side of the above estimate.

(ii) Moreover, let us suppose the coefficients $a^{ij}$, $b^i$ and $c$ are all $C^\infty$ and $f \in C^\infty(\Omega)$; then, if $u \in H^1(\Omega)$ satisfies $Au = f$, $u \in C^\infty(\Omega)$; this is due to the "hypoellipticity" property satisfied by the operator $A$.


We have a similar result in the $L^p$ frame (Grisvard 1980):


Theorem 21 ($W^{2,p}$-regularity). Let $\Omega$ be open, of class $C^{1,1}$, $a^{ij} \in C^1(\overline\Omega)$, $b^i, c \in L^\infty(\Omega)$. Suppose, furthermore, that $b^i = 0$, $1 \le i \le N$ and $c \ge 0$ a.e. Then for every $f \in L^p(\Omega)$ there exists a unique solution $u \in W^{2,p}(\Omega)$ of [36].


Unbounded Open 
The Whole Space 


Note in passing that we shall work with the weighted Sobolev spaces $W_\alpha^{m,p}(\Omega)$ defined in the subsection "Unbounded opens and weighted spaces."


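Theorem 22 The following claims hold true:
(i) Let $f \in W_0^{-1,p}(\mathbb{R}^N)$ satisfy the compatibility condition

$$\langle f, 1\rangle_{W_0^{-1,p}(\mathbb{R}^N)\times W_0^{1,p'}(\mathbb{R}^N)} = 0 \quad \text{if } p' \ge N$$

Then the problem [5] has a solution $u \in W_0^{1,p}(\mathbb{R}^N)$, which is unique up to an element in $\mathcal{P}_{[1-N/p]}$ and satisfies the estimate

$$\|u\|_{W_0^{1,p}(\mathbb{R}^N)/\mathcal{P}_{[1-N/p]}} \le C\|f\|_{W_0^{-1,p}(\mathbb{R}^N)}$$

Moreover, if $1 < p < N$, then $u = E * f$.

(ii) If $f \in L^p(\mathbb{R}^N)$, then the problem [5] has a solution $u \in W_0^{2,p}(\mathbb{R}^N)$, which is unique up to an element in $\mathcal{P}_{[2-N/p]}$ and satisfies the estimate

$$\|u\|_{W_0^{2,p}(\mathbb{R}^N)/\mathcal{P}_{[2-N/p]}} \le C\|f\|_{L^p(\mathbb{R}^N)}$$

and if $1 < p < N/2$, then $u = E * f$.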


The Calderón-Zygmund inequality

$$\left\|\frac{\partial^2 v}{\partial x_i\partial x_j}\right\|_{L^p(\mathbb{R}^N)} \le C(N,p)\|\Delta v\|_{L^p(\mathbb{R}^N)}, \quad v \in \mathcal{D}(\mathbb{R}^N)$$

and Theorem 4 are crucial for establishing Theorem 22. Further, point (i) means that the Riesz potential of second order satisfies
$$I_2 : W_0^{-1,p}(\mathbb{R}^N) \perp \mathcal{P}_{[1-N/p']} \to W_0^{1,p}(\mathbb{R}^N)/\mathcal{P}_{[1-N/p]}$$
(where the initial space is the orthogonal complement of $\mathcal{P}_{[1-N/p']}$ in $W_0^{-1,p}(\mathbb{R}^N)$) and it is an isomorphism.
Note that here
$$W_0^{1,p}(\mathbb{R}^N) = \{v \in L^{p^*}(\mathbb{R}^N);\ \nabla v \in L^p(\mathbb{R}^N)\}$$
for $1 < p < N$ and $1/p^* = 1/p - 1/N$. And for $1 < r < N/2$, we also have the continuity property

$$I_2 : L^r(\mathbb{R}^N) \to L^q(\mathbb{R}^N), \quad \frac1q = \frac1r - \frac2N$$
Remark 23 The problem
$$u - \Delta u = f \quad \text{in } \mathbb{R}^N \qquad [37]$$

is of a completely different nature than the problem [5]. The function spaces appropriate for the problem [37] are the classical Sobolev spaces. With the help of the Calderón-Zygmund theory, one can prove that if $f \in L^p(\mathbb{R}^N)$, then the unique solution of [37] belongs to $W^{2,p}(\mathbb{R}^N)$ and can be represented as the Bessel potential of second order (see Stein (1970)): $u = G * f$, where $G$ is the appropriate Bessel kernel, that is, $G$, for which $\hat G(\xi) \sim (1 + |\xi|^2)^{-1}$. Recall that in particular $G(x) \sim |x|^{-1}e^{-|x|}$ for $N = 3$. In the Hilbert case, $f \in L^2(\mathbb{R}^N)$, we get

$$(1 + |\xi|^2)\hat u \in L^2(\mathbb{R}^N)$$

which, by Plancherel's theorem, implies that $u \in H^2(\mathbb{R}^N)$. For $f \in W^{m,p}(\mathbb{R}^N)$, the problem [37] has a unique solution $u \in W^{m+2,p}(\mathbb{R}^N)$ satisfying the estimate

$$\|u\|_{W^{m+2,p}(\mathbb{R}^N)} \le C(p,m)\|f\|_{W^{m,p}(\mathbb{R}^N)}$$
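
As a quick sanity check of the statement about the Bessel kernel for $N = 3$, the following sketch assumes the normalization $G(x) = e^{-|x|}/(4\pi|x|)$ and the Fourier convention $\hat G(\xi) = \int G(x)e^{-ix\cdot\xi}\,dx$. Both are assumptions, since the text fixes $G$ only up to constants. For a radial function the three-dimensional transform reduces to the one-dimensional integral $4\pi\int_0^\infty G(r)\,\frac{\sin(kr)}{kr}\,r^2\,dr$, approximated here with a midpoint rule. The helper names and the truncation parameters are illustrative.

```python
import math

def bessel_kernel_3d(r):
    """Assumed normalization of the second-order Bessel kernel in R^3."""
    return math.exp(-r) / (4.0 * math.pi * r)

def radial_fourier_transform(kernel, k, r_max=40.0, n=100_000):
    """Fourier transform of a radial function on R^3 at |xi| = k > 0.

    Approximates 4*pi * integral_0^r_max kernel(r) * sin(k r)/(k r) * r^2 dr
    with the midpoint rule on n subintervals.
    """
    dr = r_max / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        total += kernel(r) * math.sin(k * r) / (k * r) * r * r
    return 4.0 * math.pi * total * dr

if __name__ == "__main__":
    for k in (0.5, 1.0, 2.0):
        approx = radial_fourier_transform(bessel_kernel_3d, k)
        print(k, approx, 1.0 / (1.0 + k * k))
```

With these conventions the computed values agree with $(1 + k^2)^{-1}$, which is the symbol of the inverse of $u \mapsto u - \Delta u$.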


Exterior Domain 


We consider the problem in an exterior domain with the Dirichlet boundary condition

$$\Delta u = f \quad \text{in } \Omega, \qquad u = g \quad \text{on } \Gamma = \partial\Omega \qquad [38]$$

where $f \in W_0^{-1,p}(\Omega)$ and $g \in W^{1-1/p,p}(\partial\Omega)$. Invoking the results for $\mathbb{R}^N$ and bounded domains, one can prove the existence of a solution $u \in W_0^{1,p}(\Omega)$ which is unique up to an element of the kernel $A_0^p(\Omega) = \{z \in \mathring{W}_0^{1,p}(\Omega);\ \Delta z = 0\}$ provided that $f$ satisfies the compatibility condition


(f, i) = c y for all y € AP (Q) 


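The kernel can be characterized in the following way: it is reduced to $\{0\}$ if $p = 2$ or $p < N$ and if not, then

$$A_0^p(\Omega) = \{C(\lambda - 1);\ C \in \mathbb{R}\} \quad \text{if } p \ge N \ge 3$$

where $\lambda$ is the (unique) solution in $W_0^{1,2}(\Omega) \cap W_0^{1,p}(\Omega)$ of the problem $\Delta\lambda = 0$ in $\Omega$ and $\lambda = 1$ on $\partial\Omega$, and

$$A_0^p(\Omega) = \{C(\mu - u_0);\ C \in \mathbb{R}\} \quad \text{if } p > N = 2$$

where $u_0(x) = (2\pi|\Gamma|)^{-1}\int_\Gamma \log|y - x|\,d\sigma_y$ and $\mu$ is the only solution in $W_0^{1,2}(\Omega) \cap W_0^{1,p}(\Omega)$ of the problem $\Delta\mu = 0$ in $\Omega$ and $\mu = u_0$ on $\Gamma$.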


Remark 24 Similar results exist for the Neumann problem in an exterior domain (see Amrouche et al. (1997)). The framework of the spaces $W_\alpha^{m,p}(\mathbb{R}^N_+)$ for the Dirichlet problem in $\mathbb{R}^N_+$ was also considered in the literature. For a more general theory see Kozlov and Maz'ya (1999).


Elliptic Systems 
The Stokes System 


The Stokes problem is a classical example in fluid mechanics. This system models the slow motion of a fluid with velocity field $u$ and pressure $\pi$, satisfying

$$(S)\quad \begin{cases} -\nu\Delta u + \nabla\pi = f & \text{in } \Omega \\ \operatorname{div} u = h & \text{in } \Omega \\ u = g & \text{on } \Gamma \end{cases}$$

where $\nu > 0$ denotes the viscosity, $f$ is an exterior force, $g$ is the velocity of the fluid on the domain boundary, and $h$ measures the compressibility of the fluid (if $h = 0$, it is an incompressible fluid). The functions $h$ and $g$ must satisfy the compatibility condition

$$\int_\Omega h(x)\,dx = \int_\Gamma g\cdot n\,d\sigma \qquad [39]$$


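Theorem 25 Let $\Omega$ be a Lipschitz bounded domain in $\mathbb{R}^N$, $N \ge 2$. Let $f \in H^{-1}(\Omega)^N$, $h \in L^2(\Omega)$, and $g \in H^{1/2}(\Gamma)^N$ satisfy [39]. Then the problem (S) has a unique solution $(u,\pi) \in H^1(\Omega)^N \times L^2(\Omega)/\mathbb{R}$ satisfying the a priori estimate

$$\|u\|_{H^1(\Omega)} + \|\pi\|_{L^2(\Omega)/\mathbb{R}} \le C\left(\|f\|_{H^{-1}(\Omega)} + \|h\|_{L^2(\Omega)} + \|g\|_{H^{1/2}(\Gamma)}\right)$$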


In order to prove Theorem 25, one can start with a homogeneous problem. The procedure of finding $u$ is a simple application of the Lax-Milgram theorem. Application of de Rham's theorem gives the pressure $\pi$. We introduce the space

$$\mathcal{V} = \{v \in \mathcal{D}(\Omega)^N;\ \operatorname{div} v = 0\}$$

and let $F \in H^{-1}(\Omega)^N$ satisfy

$$\langle F, v\rangle_{H^{-1}\times H_0^1} = 0 \quad \text{for all } v \in \mathcal{V}$$

Then there exists $\pi \in L^2(\Omega)$, unique up to an additive constant, such that $F = \nabla\pi$. The problem (S), which we transform to the homogeneous case ($h = 0$, $g = 0$), can be formulated on an abstract level. Let $X$ and $M$ be two real Hilbert spaces and consider the following variational problem: Given $L \in X'$ and $\chi \in M'$, find $(u,\pi) \in X \times M$ such that

$$A(u,v) + B[v,\pi] = L(v), \quad v \in X$$
$$B[u,q] = \chi(q), \quad q \in M \qquad [40]$$

where the bilinear forms $A$, $B$ and the linear form $L$ are defined by

$$A(u,v) = \nu\int_\Omega \nabla u : \nabla v\,dx, \qquad B[v,q] = -\int_\Omega q\,\nabla\cdot v\,dx, \qquad L(v) = \int_\Omega f\cdot v\,dx$$


Theorem 26 If the bilinear form $A$ is coercive in the space

$$V = \{v \in X;\ B[v,q] = 0 \ \text{for all } q \in M\}$$

that is, if there exists $\alpha > 0$ such that

$$A(v,v) \ge \alpha\|v\|_X^2, \quad v \in V$$

then the problem [40] has a unique solution $(u,\pi)$ if and only if the bilinear form $B$ satisfies the "inf-sup" condition: there exists $\beta > 0$ such that

$$\inf_{q \in M}\sup_{v \in X}\frac{B[v,q]}{\|v\|_X\|q\|_M} \ge \beta$$


As for the Dirichlet problem, the regularity result is the following:


Theorem 27 Let $\Omega$ be a bounded domain in $\mathbb{R}^N$, of the class $C^{m+1,1}$ if $m \in \mathbb{N}$ and $C^{1,1}$ if $m = -1$. Let $f \in W^{m,p}(\Omega)^N$, $h \in W^{m+1,p}(\Omega)$ and $g \in W^{m+2-1/p,p}(\Gamma)^N$ satisfy condition [39]. Then the problem (S) has a unique solution $(u,\pi) \in W^{m+2,p}(\Omega)^N \times W^{m+1,p}(\Omega)/\mathbb{R}$.


Remark 28 It is possible to solve (S) under weaker assumptions, for instance, if $f \in W^{-1,p}(\Omega)^N$, $h = 0$ and $g \in W^{-1/p,p}(\Gamma)^N$. We can prove that then $(u,\pi) \in L^p(\Omega)^N \times W^{-1,p}(\Omega)$.

The Linearized Elasticity 


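The equations governing the displacement $u = (u_1, u_2, u_3)$ of a three-dimensional structure subjected to an external force field $f$ are written as ($\Omega$ is a bounded open subset of $\mathbb{R}^3$ and $\Gamma = \partial\Omega$)

$$-\mu\Delta u - (\lambda + \mu)\nabla(\nabla\cdot u) = f \quad \text{in } \Omega$$
$$u = 0 \quad \text{on } \Gamma_0$$
$$\sum_{j=1}^3 \sigma_{ij}(u)\nu_j = g_i \quad \text{on } \Gamma_1 = \Gamma - \Gamma_0$$

where $\lambda > 0$ and $\mu > 0$ are two material characteristic constants, called the Lamé coefficients, and ($v = (v_1, v_2, v_3)$)

$$\sigma_{ij}(v) = \sigma_{ji}(v) = \lambda\delta_{ij}\sum_{p=1}^3 \varepsilon_{pp}(v) + 2\mu\varepsilon_{ij}(v) \qquad [41]$$

with $\varepsilon_{ij}(v) = \varepsilon_{ji}(v) = \frac12(\partial_j v_i + \partial_i v_j)$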


where $\delta_{ij}$ denotes the Kronecker symbol, that is, $\delta_{ij} = 1$ for $i = j$ and $\delta_{ij} = 0$ for $i \ne j$. These equations describe the equilibrium of an elastic homogeneous isotropic body that cannot move along $\Gamma_0$; along $\Gamma_1$, surface forces of density $g = (g_1, g_2, g_3)$ are given. The case $\Gamma_1 = \emptyset$ physically corresponds to clamped structures. The matrix with entries $\varepsilon_{ij}(u)$ is the linearized strain tensor while $\sigma_{ij}(u)$ represents the linearized stress tensor; the relationship [41] between these tensors is known as Hooke's law. We refer for example to Ciarlet and Lions (1991) and Nečas and Hlaváček (1981) (and references therein) for most of the results stated in this paragraph. The variational formulation of this problem is


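to find $u \in V$ such that

$$A(u,v) = L(v) \quad \text{for all } v \in V \qquad [42]$$

where the bilinear form $A$ and the linear form $L$ are given by

$$A(u,v) = \int_\Omega \Big[\lambda(\nabla\cdot u)(\nabla\cdot v) + 2\mu\sum_{i,j=1}^3 \varepsilon_{ij}(u)\varepsilon_{ij}(v)\Big](x)\,dx \qquad [43a]$$

$$L(v) = \int_\Omega f(x)\cdot v(x)\,dx + \int_{\Gamma_1} g(\sigma)\cdot v(\sigma)\,d\sigma \qquad [43b]$$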


The functional space V is defined as 




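$$V = \{v = (v_1, v_2, v_3) \in [H^1(\Omega)]^3;\ \gamma_0 v_i = 0 \ \text{on } \Gamma_0,\ 1 \le i \le 3\}$$
To prove the ellipticity of $A$, one needs the following Korn inequality: There exists a positive constant $C(\Omega)$ such that, for all $v = (v_1, v_2, v_3) \in [H^1(\Omega)]^3$, we have

$$\|v\|_{1,\Omega} \le C(\Omega)\Big[\sum_{i,j=1}^3 \|\varepsilon_{ij}(v)\|_{L^2(\Omega)}^2 + \|v\|_{L^2(\Omega)}^2\Big]^{1/2} \qquad [44]$$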


The following result holds true: 


Theorem 29 Let $\Omega$ be a bounded open set in $\mathbb{R}^3$ with a Lipschitz boundary, and let $\Gamma_0$ be a measurable subset of $\Gamma$, whose measure (with respect to the surface measure $d\Gamma(x)$) is positive. Then the mapping

$$v \mapsto \Big[\sum_{i,j=1}^3 \|\varepsilon_{ij}(v)\|_{L^2(\Omega)}^2\Big]^{1/2}$$

is a norm on $V$, equivalent to the usual norm $\|\cdot\|_{1,\Omega}$.


As a consequence, we get: 


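Theorem 30 Under the above assumptions, there exists a unique $u \in V$ solving the variational problem [42]-[43]. This solution is also the unique one which minimizes the energy functional

$$E(v) = \frac12\int_\Omega \Big[\lambda(\nabla\cdot v)^2 + 2\mu\sum_{i,j=1}^3 \varepsilon_{ij}(v)^2\Big](x)\,dx - \Big[\int_\Omega f(x)\cdot v(x)\,dx + \int_{\Gamma_1} g(\sigma)\cdot v(\sigma)\,d\sigma\Big]$$

over the space $V$.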
Introduction


Entanglement is a type of correlation between 
subsystems, which cannot be explained by the action 
of a classical random generator. It is a key notion 
of quantum information theory and corresponds 
closely to the possibility of channels which transmit 
quantum information, and cannot be simulated by 
classical channels. In this article, we consider the 
development of the concept, and its qualitative 
aspects. The quantitative aspects are treated in a 
separate article (see Entanglement Measures). 


Historical Development 


The first realization that quantum mechanics comes 
with new, and perhaps rather strange, correlations 
came in the famous 1935 paper by Einstein, Podolsky, 
and Rosen (EPR) (Einstein et al. 1935), in which they 
set up a paradox showing that the statistics of certain 
quantum states could not be realized by assigning 
wave functions to subsystems. It was in response to 
this paper that Schrödinger (1935), in the same year,
coined the term “entanglement,” as well as its German
equivalent “Verschränkung.” The subject lay dormant
for a long time, since Bohr, in his reply, completely 
ignored the entanglement theme, and there was a 
widespread reluctance in the physics community to 
consider problems of interpretation. The leaf turned
slowly with Bohm’s reduced model of the EPR 
paradox using spins rather than continuous variables, 
and decisively with Bell’s 1964 strengthening of the 
paradox (Bell 1964). He showed that the correlations predicted by quantum mechanics cannot be described by wave functions assigned to the individual systems, nor by any set of classical parameters assigned to the subsystems. This eliminated all reference to a
possibly dubious quantum ontology and all reference 
to the quantum formalism from the argument. Bell 
derived a set of inequalities from the assumption that 
each subsystem could be described in terms of classical 
variables, and that these (possibly hidden) variables 
would not be changed by the mere choice of a 
measurement for the distant correlated system. The 
only relation to quantum mechanics was then the simple quantum calculation showing that, in certain situations, such as the state described by EPR, quantum mechanics predicted a violation of Bell’s inequalities.
This immediately suggested an experiment, and 
although it was difficult at first to find an efficient 
source of suitably quantum-correlated pairs of parti- 
cles, the experiments that have been made since 
then have supported the quantum-mechanical result 
beyond reasonable doubt. This came too late for 
Einstein, whose research program in quantum 
mechanics had been precisely to build a “local 
hidden-variable theory” of the type seen in contra- 
diction with Bell’s inequality. But at least the EPR 
paper had finally received the response it deserved. 

In Schrödinger's work, entanglement was a purely
qualitative term for the strange way the subsystems
seemed to be intertwined as soon as one insisted on
discussing their individual properties. After Bell’s 
work, the favored mathematical definition of entan- 
glement would probably have been the existence of 
measurements on the subsystems, such that Bell’s 
inequality (or some generalization derived on the same 
assumptions) is violated. However, around 1983 
another notion of (the lack of) entanglement was 
independently proposed by Primas (1983) and Werner 
(1983). According to this definition, a quantum state $\rho$ is called unentangled if it can be written as

$$\rho = \sum_\alpha p_\alpha\,\rho_\alpha^{(1)} \otimes \rho_\alpha^{(2)} \qquad [1]$$

where the $\rho_\alpha^{(i)}$ are arbitrary states of the subsystems ($i = 1, 2$), which depend on a “hidden variable” $\alpha$, drawn by a classical random generator with probabilities $p_\alpha$. Such states are now called separable,
which is a bit awkward, since the notion is typically 
applied to systems which are widely separated. 
However, the term is so firmly established that it is 
hopeless to try to improve on it. 

In any case, it was shown by Werner (1989) that 
there are nonseparable states, which nevertheless 
satisfy Bell’s inequalities and all its generalizations. 
The next step was the observation by Popescu 
(1994) that entanglement could be distilled: this is 
a process by which some number of moderately 
entangled pair states is converted to a smaller 
number of highly entangled states, using only local 
quantum operations, and classical communication 
between the parties. For some time it seemed that 
this might close the gap, that is, that the failure of 
separability might be equivalent to “distillability” 
(i.e., the existence of a distillation procedure produ- 
cing arbitrarily highly entangled states from many 
copies of the given one). However, this turned out to 
be false, as shown by the Horodecki family in 1998 
(Horodecki et al. 1998), by explicitly exhibiting 
bound entangled, that is, nonseparable, but also not 
distillable states. In 2003, Oppenheim and the
Horodeckis introduced a further distinction, namely 
whether it is possible to extract a secret key from 
copies of a given quantum state by local quantum 
operations and public classical communication 
(Horodecki et al. 2005). This task had hitherto 
been viewed as an application of entanglement 
distillation, but it turned out that secret key can be 
distilled from some bound entangled (but never from 
separable) states. 

For the entanglement theory of multipartite states, 
that is, states on systems composed of three or more 
parts, between which no quantum interaction takes 
place, one key observation is that new entanglement 
properties must be expected with any increase of the 


number of parties. As shown by Bennett et al. 
(1999), there are states of three parties which cannot 
be written in the three-party analog of [1], but are 
nevertheless separable for all three splits of the 
system into one vs. two subsystems. 

The crucial advance of entanglement theory, 
however, lies not so much in the distinctions 
outlined above, but in the quantitative turn of the 
theory. With the discovery of the teleportation and 
dense coding processes (Bennett and Wiesner 1992, 
Bennett et al. 1993), entanglement changed its role
from a property of counterintuitive contortedness to 
a resource, which is used up in teleportation and 
similar processes. Distillation is then seen as a 
method to upgrade a given source to a new source 
of highly entangled states suitable for this purpose, 
and it is not just the possibility of doing this, but the 
rate of this conversion, which becomes the focus of 
the investigation. All the tasks in which entangle- 
ment appears suggest quantitative measures of 
entanglement. In addition, there are many entangle- 
ment measures, which appear natural from a 
mathematical point of view, or are introduced 
simply because they can be estimated relatively 
easily and in turn give bounds on other entangle- 
ment measures of interest. The current situation is 
that there is no shortage of entanglement measures 
in the literature, but it is not yet clear which ones 
will be of interest in the long run. Some of these 
measures are described in Entanglement Measures. 

The current state of entanglement theory is marked 
firstly by some long-standing open problems in the 
basic bipartite theory on the one hand (additivity of 
the entanglement of formation, the existence of NPT 
bound entangled states, and more recently the 
existence of entangled states with vanishing key 
rate). Secondly, there is significant effort to try to 
compute some of the entanglement measures, at least 
for simple subclasses of states. This is so difficult, 
because many definitions involve an optimization 
over operations on an asymptotically large system. 
Thirdly, there is a new trend in multipartite entanglement theory, namely looking specifically at entanglement in lattice structures such as spin systems or harmonic-oscillator lattices. Here one can expect very fruitful interaction with statistical mechanics and solid-state physics in the near future.


Qualitative Entanglement Theory 
Setup 


Throughout this section, we will consider density operators on a Hilbert space split in some fixed way into a tensor product of a Hilbert space $\mathcal{H}_A$ for Alice’s system and a Hilbert space $\mathcal{H}_B$ for Bob’s system, that is, $\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B$. For simplicity, we will mostly consider finite-dimensional spaces, and if a dimension parameter $d < \infty$ appears, it is understood that $d = \dim\mathcal{H}_A = \dim\mathcal{H}_B$. By $\mathcal{B}(\mathcal{H})$ we will denote the set of bounded operators on a Hilbert space, and by $\mathcal{B}_*(\mathcal{H})$ the set of trace-class operators. We distinguish these even in the finite-dimensional case, because of their different norms. By $\mathcal{S}$ we will denote the state space of the combined system, that is, the set of positive elements of $\mathcal{B}_*(\mathcal{H})$ with trace 1.

For such a density operator $\rho = \rho^{AB}$ we denote by $\rho^A$ and $\rho^B$ the restrictions to the subsystems, defined by the partial trace over the other system, or by $\operatorname{tr}(\rho^A F) = \operatorname{tr}(\rho^{AB}(F \otimes \mathbb{1}))$. We denote by $\Theta$ the operation of matrix transposition, and by $\mathrm{id} \otimes \Theta$ the partial transposition, applied only to the second tensor factor. Since transposition is not completely positive (see Channels in Quantum Information Theory) partial transposition may take positive operators to non-positive operators. The relative entropy (see Entropy and Quantitative Transversality) of two density operators $\rho, \sigma$ will be used with the convention $S(\rho\|\sigma) = \operatorname{tr}\rho(\log\rho - \log\sigma)$.


Witnesses and the Criterion of Positivity 
of Partial Transpose 


A state $\rho$ is called separable iff it is of the form [1], and entangled otherwise. The set of separable states $\mathcal{D}$ is a convex subset of the set $\mathcal{S}$ of all states. Its extreme points are obvious from the representation [1], namely the pure product states $\rho = |\phi_A \otimes \phi_B\rangle\langle\phi_A \otimes \phi_B|$. Since $\mathcal{D}$, like $\mathcal{S}$, is a convex set in $(d^4 - 1)$ dimensions, Carathéodory’s theorem asserts that the sum can be taken to be a decomposition into $d^4$ such terms. For a given $\rho$, deciding whether it is separable or entangled, hence, involves a nonlinear search problem in roughly $4d^5$ real parameters, namely the vector components of the $\phi_A, \phi_B$ appearing in the sum.

Dually, the convex set $\mathcal{D}$ can be described by a set of linear inequalities. Here is a simple way of generating such inequalities: let $T : \mathcal{B}_*(\mathcal{H}_B) \to \mathcal{B}(\mathcal{H}_A)$ be a positive linear map, that is, a map taking positive matrices to positive matrices. Then for $\rho_A, \rho_B \ge 0$ the expression $\operatorname{tr}(\rho_A T(\rho_B))$ is positive. It is also bilinear, so we can find a Hermitian operator $T^* \in \mathcal{B}(\mathcal{H}_A \otimes \mathcal{H}_B)$ such that

$$\operatorname{tr}(\rho_A T(\rho_B)) = \operatorname{tr}((\rho_A \otimes \rho_B)T^*)$$

Since the left-hand side is positive, we see by taking convex combinations that $\operatorname{tr}(\rho T^*) \ge 0$ for all separable states $\rho$. Hence, if we find a state with a negative expectation of $T^*$, we can be sure it is entangled.


Therefore, such operators $T^*$ are called entanglement witnesses. This is often a useful criterion, especially when one has some additional information about the state, allowing for an intelligent choice of witness. It is known from the theory of ordered vector spaces and their tensor products that the set of witnesses constructed above is complete. Hence, in principle, checking all such witnesses provides a necessary and sufficient criterion for entanglement. However, in practice this remains a difficult task, because the extreme points of the set of positive maps are only known for some low dimensions.

By restricting T to completely positive maps, we 
get a useful necessary criterion. It can be seen that it 
is equivalent to 


$$(\mathrm{id} \otimes \Theta)(\rho) \ge 0$$


that is, to the positivity of the partial transpose (PPT). States with this property are called "PPT states" in current jargon.
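
To make the witness and partial-transpose criteria concrete, here is a minimal two-qubit sketch using plain nested lists. The function names and the index convention (row $(i,j) \mapsto i\,d + j$) are choices made for this illustration only. It transposes the second tensor factor of $|\Omega\rangle\langle\Omega|$, with $\Omega = (|00\rangle + |11\rangle)/\sqrt2$, and evaluates the result on the antisymmetric vector $\psi = (|01\rangle - |10\rangle)/\sqrt2$. A negative value shows that the partial transpose is not positive, so the state is entangled; equivalently, the partial transpose of $|\psi\rangle\langle\psi|$ acts as a witness for $\Omega$.

```python
import math

def partial_transpose(rho, d):
    """Transpose the second tensor factor of a (d*d) x (d*d) matrix.

    Index convention: row (i, j) -> i*d + j, column (k, l) -> k*d + l,
    so the result has entries out[(i,j),(k,l)] = rho[(i,l),(k,j)].
    """
    out = [[0j] * (d * d) for _ in range(d * d)]
    for i in range(d):
        for j in range(d):
            for k in range(d):
                for l in range(d):
                    out[i * d + j][k * d + l] = rho[i * d + l][k * d + j]
    return out

def projector(vec):
    """Rank-one operator |vec><vec| as a nested list."""
    return [[a * b.conjugate() for b in vec] for a in vec]

def expectation(op, vec):
    """Compute <vec| op |vec> for a nested-list matrix."""
    n = len(vec)
    return sum(vec[r].conjugate() * op[r][c] * vec[c]
               for r in range(n) for c in range(n))

S = 1 / math.sqrt(2)
omega = [S, 0.0, 0.0, S]        # maximally entangled vector
psi_minus = [0.0, S, -S, 0.0]   # antisymmetric test vector
product = [1.0, 0.0, 0.0, 0.0]  # product state |00>

if __name__ == "__main__":
    print(expectation(partial_transpose(projector(omega), 2), psi_minus))
    print(expectation(partial_transpose(projector(product), 2), psi_minus))
```

For the product state the same expectation is nonnegative, as it must be for every separable state.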


Pure States, Purification 


For pure states, that is, for the extreme points of $\mathcal{S}$, separability is trivial to decide: since for pure states the sum [1] can only be a single term, a pure state is separable iff it factorizes.

A useful observation is that, for pure states $\rho = |\Phi\rangle\langle\Phi|$, all information about entanglement is contained in the spectrum of the reduced states. Consider a vector $\Phi \in \mathcal{H}_A \otimes \mathcal{H}_B$ of the form

$$\Phi = \sum_\alpha \sqrt{r_\alpha}\; e_\alpha^A \otimes e_\alpha^B \qquad [2]$$

where $e_\alpha^A \in \mathcal{H}_A$ and $e_\alpha^B \in \mathcal{H}_B$ are orthonormal systems, $r_\alpha > 0$, and $\sum_\alpha r_\alpha = 1$. Then it is easy to check that $\rho^A = \sum_\alpha r_\alpha|e_\alpha^A\rangle\langle e_\alpha^A|$ is the spectral resolution of the restriction. Conversely, by diagonalizing the restriction of a general unit vector $\Phi$, we find a biorthogonal decomposition of the form [2], also known as the Schmidt decomposition. The Schmidt spectrum $\{r_1, \ldots, r_d\}$ hence classifies vectors up to local basis changes in $\mathcal{H}_A$ and $\mathcal{H}_B$.
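
The role of the reduced state can be illustrated with a short sketch. Writing a vector as $\Phi = \sum_{ij} C_{ij}\,|i\rangle|j\rangle$, the restriction to the first system is $\rho^A = CC^\dagger$, and its purity $\operatorname{tr}((\rho^A)^2) = \sum_\alpha r_\alpha^2$ is one simple function of the Schmidt spectrum. The choice of purity as the quantity to compute, and the helper names, are assumptions made for this illustration. The purity equals 1 exactly for product vectors and $1/d$ for maximally entangled ones.

```python
import math

def reduced_state(coeffs):
    """Return rho^A = C C^dagger for Phi = sum_ij C[i][j] |i>|j>."""
    d_a, d_b = len(coeffs), len(coeffs[0])
    return [[sum(coeffs[i][k] * coeffs[j][k].conjugate() for k in range(d_b))
             for j in range(d_a)] for i in range(d_a)]

def purity(rho):
    """Return tr(rho^2), i.e. the sum of squared Schmidt coefficients."""
    n = len(rho)
    return sum((rho[i][j] * rho[j][i]).real for i in range(n) for j in range(n))

S = 1 / math.sqrt(2)
product_vector = [[1.0, 0.0], [0.0, 0.0]]   # |0>|0>
max_entangled = [[S, 0.0], [0.0, S]]        # (|00> + |11>)/sqrt(2)
partly_entangled = [[math.sqrt(0.9), 0.0], [0.0, math.sqrt(0.1)]]

if __name__ == "__main__":
    for name, c in [("product", product_vector),
                    ("maximally entangled", max_entangled),
                    ("partly entangled", partly_entangled)]:
        print(name, purity(reduced_state(c)))
```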

Since any $\rho^A$ can appear in this construction, we see that any mixed state can be considered as the restriction of a pure state, which is essentially unique, namely up to the choice of basis in the purifying system B, and up to perhaps adding or deleting some irrelevant dimensions in $\mathcal{H}_B$. The resulting vector is known as the purification of $\rho^A$.

The extreme cases of [2] are pure product states on the one hand, and vectors for which $\rho^A = \mathbb{1}/d$ is the totally chaotic state on the other. The latter are known as maximally entangled and embody, in the most extreme way, the observation that in quantum mechanics, as opposed to classical probability, the restriction of a pure state may be mixed.

Let us fix a maximally entangled vector $\Omega$, and the matching Schmidt bases, so that

$$\Omega = \frac{1}{\sqrt d}\sum_{k=1}^d |kk\rangle \qquad [3]$$

where we have used the simplified ket notation, in which only the basis label is written. Then, an arbitrary vector can be written as $\Phi = (X \otimes \mathbb{1})\Omega = (\mathbb{1} \otimes X^T)\Omega$, where $X^T$ denotes the matrix transpose of $X$. Clearly, this vector is again maximally entangled iff $X$ is unitary. Hence, the set of maximally entangled vectors is a single orbit under unilateral unitary transformations, and we even have the choice to which side we apply the unitaries.


Teleportation 


Suppose we have an orthonormal basis of maximally entangled vectors $\Phi_\alpha \in \mathcal{H}_A \otimes \mathcal{H}_B$. By the remarks above, this is equivalent to choosing unitaries $U_\alpha$, $\alpha = 1, \ldots, d^2$ such that $\Phi_\alpha = (U_\alpha \otimes \mathbb{1})\Omega$, and $\operatorname{tr}(U_\alpha^* U_\beta) = d\,\delta_{\alpha\beta}$. For example, a finite Weyl system constitutes such a system of unitaries, which shows that we can find realizations in any dimension $d$.
Suppose that Alice and Bob each own part of a system prepared in the state $\Omega$. Then they can transmit perfectly the state of a $d$-dimensional system, using only classical communication. Classical communication by itself would never suffice to transmit quantum information, and the entangled resource $\Omega$ by itself does not allow the transmission of any signal. But the combination of these resources does the trick: Alice measures the observable associated with the basis $\Phi_\alpha$ on the combined system formed by the unknown input and her part of the entangled pair. The result $\alpha$ is then transmitted to Bob, who performs a $U_\alpha$-rotation on his part of the entangled pair, producing the output state of the teleportation. One can show by direct calculation that this is exactly equal to the input state.
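
The direct calculation can be mimicked for $d = 2$ in a few lines. The sketch below is an illustration under stated assumptions: the unitaries $U_\alpha$ are taken to be the identity and the three Pauli matrices (one admissible choice satisfying $\operatorname{tr}(U_\alpha^* U_\beta) = d\,\delta_{\alpha\beta}$), the input vector is arbitrary, and the function name `teleport` is hypothetical. For each measurement result it projects the input and Alice's half of $\Omega$ onto $\Phi_\alpha = (U_\alpha \otimes \mathbb{1})\Omega$, then applies $U_\alpha$ on Bob's side.

```python
import math

S = 1 / math.sqrt(2)

# Assumed choice of unitaries for d = 2: identity and the Pauli matrices.
PAULIS = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def teleport(phi):
    """Return [(probability, bob_output), ...], one entry per measurement result.

    phi is the input vector (two complex amplitudes, normalized).
    """
    results = []
    for U in PAULIS:
        # Phi_alpha has components U[a][b] / sqrt(2); Omega has components
        # delta(b, c) / sqrt(2). Contracting <Phi_alpha| with phi (x) Omega
        # over the input index a and Alice's index b leaves Bob's vector chi.
        chi = [sum(complex(U[a][b]).conjugate() * S * phi[a] * (S if b == c else 0.0)
                   for a in range(2) for b in range(2))
               for c in range(2)]
        prob = sum(abs(x) ** 2 for x in chi)
        norm = math.sqrt(prob)
        # Bob's correction: apply U_alpha to his normalized conditional state.
        out = [sum(U[r][c] * chi[c] for c in range(2)) / norm for r in range(2)]
        results.append((prob, out))
    return results

if __name__ == "__main__":
    for prob, out in teleport([0.6, 0.8j]):
        print(prob, out)
```

Each of the four results occurs with probability $1/d^2 = 1/4$, independently of the input, so the classical message alone carries no information about the transmitted state.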

Note that the resource $\Omega$ is destroyed in this process, so that for every transmission we need a fresh entangled pair. Less than maximally entangled states instead of $\Omega$ lead to less-than-perfect transmission, which can be extended to quantitative relations between entanglement and channel capacity.


Special Systems 
Qubits 


For qubit pairs, there is a special basis of maximally entangled vectors, which has some amazing properties. It consists of the vectors $\Phi_0 = \Omega$, and $\Phi_k = i(\sigma_k \otimes \mathbb{1})\Omega$, where $\sigma_k$, $k = 1, 2, 3$, denotes the Pauli matrices. Then a vector is maximally entangled iff its components are real in this basis, up to a common phase. A unitary matrix of determinant 1 factorizes into $U_1 \otimes U_2$ iff its matrix elements are real, up to a common phase.

For qubit pairs, and also for dimensions $2 \otimes 3$, the partial transposition criterion for entanglement is necessary and sufficient, as shown by Woronowicz and the Horodecki family.


Orthogonally Invariant States 


A state $\rho$ on $\mathbb{C}^d \otimes \mathbb{C}^d$ is called orthogonally invariant
if, for any orthogonal matrix U (with respect to some
fixed product basis), $[\rho, U \otimes U] = 0$. This leaves a
three-dimensional space of operators, spanned by the
identity, the permutation $F = \sum_{i,j} |ij\rangle\langle ji|$, and its
partial transpose $\hat F = \sum_{i,j} |ii\rangle\langle jj|$, which is d times the
projection onto the maximally entangled vector $\Omega$.
Figure 1 shows the plane of Hermitian operators $\rho$
with the described symmetry and $\mathrm{tr}\,\rho = 1$. Convenient
coordinates are $\mathrm{tr}\,\rho F$ and $\mathrm{tr}\,\rho\hat F$. Note that these are
defined for any density operator, and are also invariant
under the “twirl” operation $\rho \mapsto \int dU\,(U \otimes U)\rho(U \otimes U)^*$,
using the Haar measure dU, which projects onto
the orthogonally invariant states. Hence, the diagram
provides a section as well as a projection of the state
space. The intersection of the positive operators with
those having positive partial transpose is the set of PPT
states, which in this case coincides with the separable
states. The thin lines correspond to states of higher
symmetry, namely on the one hand the “isotropic
states” commuting with $U \otimes \bar U$, with $\bar U$ the complex
conjugate of U, and the “Werner states” commuting
with all unitaries $U \otimes U$. Their intersection point is the
normalized trace.


Figure 1 The plane of orthogonally invariant unit trace
Hermitian operators of a $3 \otimes 3$-system. The upright triangle
gives the positive operators, and the dashed one those with
positive partial transpose. The shaded area gives the PPT
states.


Gaussians


In general, the entanglement in systems with infinite- 
dimensional Hilbert spaces is more difficult to 
analyze. However, if the system is characterized by 
variables satisfying canonical commutation rela- 
tions, like positions and momenta, or the compo- 
nents of the free quantum electromagnetic field, 
there is a special class of states, which is again 
characterized by low-dimensional matrices. This 
allows the discussion of entanglement questions, in 
a way largely parallel to the finite-dimensional 
theory. 

Let $R_1,\ldots,R_{2f}$ denote the canonical operators,
where f is the number of degrees of freedom. The
commutation relations can be summarized as
$i[R_\mu, R_\nu] = \sigma_{\mu\nu}1$, where $\sigma$ is the symplectic matrix.
The operators $R_\mu$ have a common set of analytic
vectors, and generate the unitary Weyl operators
$W(a) = \exp(i a^\mu R_\mu)$, which describe the phase space
displacements. Gaussian states are those making
$a \mapsto \mathrm{tr}\,\rho W(a)$ a Gaussian function or, equivalently,
those with Gaussian Wigner function. Up to a global
displacement, they are completely characterized by
the covariance matrix


$\gamma_{\mu\nu} = \mathrm{tr}\,\rho(R_\mu R_\nu + R_\nu R_\mu)$ [4]


The only constraint for a real symmetric matrix to
be a covariance matrix of a quantum state is that
$\gamma + i\sigma$ is a positive semidefinite matrix, which is a
version of the uncertainty relations.
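
This constraint is a finite-dimensional matrix condition and is simple to test. The sketch below assumes the mode ordering (Q1, P1, ..., Qf, Pf) and units in which the one-mode vacuum has covariance matrix equal to the identity; the function names are illustrative. The sign convention chosen for the symplectic matrix does not affect the test, since replacing it by its negative amounts to a complex conjugation.

```python
import numpy as np

def symplectic_form(f):
    """Block-diagonal symplectic matrix for f modes, ordered (Q1, P1, ..., Qf, Pf)."""
    return np.kron(np.eye(f), np.array([[0.0, -1.0], [1.0, 0.0]]))

def is_covariance_matrix(gamma, tol=1e-12):
    """Check that gamma + i*sigma is positive semidefinite."""
    f = gamma.shape[0] // 2
    eig = np.linalg.eigvalsh(gamma + 1j * symplectic_form(f))
    return bool(eig.min() >= -tol)

vacuum = np.eye(2)                  # minimal uncertainty in both variables
too_sharp = 0.5 * np.eye(2)         # would violate the uncertainty relation
squeezed = np.diag([4.0, 0.25])     # one variable sharp, the conjugate one broad
```

Under these assumptions `vacuum` and `squeezed` pass the test, while `too_sharp` fails it.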

Now for entanglement theory, we take some of 
the degrees of freedom as Alice’s and some as Bob’s. 
Separability can be characterized in terms of $\gamma$,
namely by the condition that $\gamma \ge \gamma'$, where $\gamma'$ is the
covariance matrix of a Gaussian product state.
Similarly, partial transposition can be implemented 
as an operation on covariance matrices, which 
allows a simple verification of the PPT condition. 
It turns out that as long as one partner has only a 
single degree of freedom, the PPT condition is 
necessary and sufficient for separability, but this 
fails for larger systems. 

The pure Gaussian states allow a normal form 
with respect to local symplectic transformations 
analogous to the Schmidt decomposition. For the 
minimal case of one degree of freedom on either 
side, one obtains a one-parameter family of 
“two-mode squeezed states.” Its limit for infinite
squeezing parameter is the state used by EPR 
(Einstein et al. 1935), which, however, makes 
rigorous mathematical sense only as a singular 
state, that is, a linear functional on B(H), which 
can no longer be represented as the trace with a 
density operator. 


Multipartite Stars 


A key feature of entanglement in a multipartite 
system is usually referred to as “monogamy”: when 
Alice shares a highly entangled state with Bob, her 
system cannot also be highly entangled with Bill. 
More formally, suppose that a multipartite state for
systems $A, B_1,\ldots,B_n$ is given, such that the restriction
to each pair $AB_i$ is the same bipartite state $\rho$.
Then as n becomes larger, the existence of such a
star-shaped extension constrains $\rho$ to become less
and less entangled. In fact, the existence of such
extensions for all n (that is, as $n \to \infty$)
is equivalent to the separability of $\rho$.


Open Problems 


Recall from the introduction the following chain of 
inclusions: 


separable states $\subset$ states with vanishing key rate
$\subset$ PPT states
$\subset$ undistillable states
$\subset$ all states


The second and fourth inclusions are strict, but for 
the first and third one might have equality, for all 
we know. Especially for the third inclusion, this is a 
long-standing problem. 

Finally, we would like to point out that qualita- 
tive and conceptual aspects of entanglement are 
surveyed by Bub (2001), Popescu and Rohrlich 
(1998), and Horodecki et al. (2001). For quantita- 
tive aspects see Entanglement Measures. 


See also: Capacities Enhanced by Entanglement;
Capacity for Quantum Information; Channels in Quantum
Information Theory; Entanglement Measures; Entropy
and Quantitative Transversality.


Introduction


Entanglement, or quantum correlation, is one of 
the central concepts in quantum information 
theory. Its theory can be roughly separated into 
three parts. The first is qualitative, that is, it
addresses the question “Is this state entangled or
not?” The second, comparative part asks “Is this
state more entangled than that state?”, and finally
the quantitative theory asks “How entangled is this
state?” and gives its answers in the form of
entanglement measures assigning a number to
every state. Quantitative questions come up natu- 
rally whenever entanglement is used as a resource 
for tasks of quantum information processing. For 
example, entangled states are in a way the fuel for 
the processes of teleportation and dense coding: in 
each transmission step a maximally entangled pair 
system is required, and cannot be used for a further 
transmission. The process also works with less than 
maximally entangled states, but then it also 
becomes less efficient. Since entangled states 
created in the laboratory typically have imperfec- 
tions, it becomes important to understand the rates 
at which imperfectly entangled states may be 
distilled to maximally entangled ones, and this 
rate is a direct measure of the usefulness of the 
given state for many purposes. The quantitative,
task-related turn is a new development in the study
of the foundations of quantum mechanics. It has
been imported from classical information theory,
where this way of thinking has been standard for a
long time. This combination gives quantum information
theory its particular flavor.

In this article we consider the comparative and
quantitative aspects of entanglement. The historical
aspects and qualitative theory are treated in a separate
article (see Entanglement), to which we refer for basic
notions and notations. The example of teleportation
suggests close links between quantitative entanglement
theory and the theory of capacity (Bennett et al. 1996),
which is the transfer rate of quantum information
through a given channel. These connections are
described in Quantum Channels: Classical Capacity.

We follow the notations of the basic article on
entanglement (see Entanglement). In particular, $\Theta$
denotes the transpose operation, and $(\mathrm{id} \otimes \Theta)$ the
partial transpose. A state is called “PPT” if its
partial transpose is positive. The two physicists
operating the laboratories in which the two parts
of a bipartite system are kept are called Alice and
Bob, as usual. The restriction of a state $\rho$ to Alice's
subsystem is denoted by $\rho^A$.


Comparative Entanglement and Protocols


Protocols 


In this section we introduce relations of the kind “state
$\rho_1$ is more entangled than $\rho_2$.” We take this to mean
that $\rho_2$ can be obtained by applying to $\rho_1$ some
operations which “cannot create entanglement.” The
definition of a class of operations of which this can be
claimed then defines the comparison. It turns out that
there are different choices for the class of such
operations, depending on the resources available for
the transformation steps. The class of operations is
usually referred to as a protocol.

Certainly local operations performed separately
by Alice and Bob cannot increase entanglement.
Alice and Bob might have to make some choices,
and even if they make these according to a
prearranged scheme, by using a shared table of
random numbers, entanglement will not be generated.
In this restrictive protocol, which we abbreviate
by LO, for “local operations,” no
communication is allowed. It is clear that by
discarding the initial state and preparing a new one
based on the random instruction, Alice and
Bob can make any separable state, so these states
come out as the “least entangled” ones for this and
any richer protocol.

Next we might allow classical communication 
from Alice to Bob. That is, Bob’s decision to 
perform some operation in his laboratory is allowed 
to depend on measuring results obtained by Alice in 
an earlier stage. Of course, Alice is not allowed to 
send quantum systems, since in this case she might 
just send a particle entangled to one of her own, and 
any state could be generated. This protocol is 
referred to as “local operations and one-way 
classical communication” (LOWC). Obviously, we 
might also allow Bob to talk back, arriving at “local 
operations and classical communication” (LOCC). 
This is the protocol underlying most of the work in 
entanglement theory. 

The drawback of the LOCC protocol is that its 
operations are extremely difficult to characterize: an 
LOCC operation can take many rounds, and there is 
no way to simplify a general operation to some kind 
of standard form. This is the main reason why other 
protocols have been considered. For example, it is 
obvious that an LOCC operation can be written as a 
sum of tensor products of local operations, in a form 
reminiscent of the definition of separability. How- 
ever, such “separable superoperators” may fall 
outside LOCC. Another property easily checked for 
all LOCC operations is that PPT states go into PPT 
states. The protocol “PPT-preserving operations”
(PPTP) can also be characterized as the set of
channels T for which $(\mathrm{id} \otimes \Theta)T(\mathrm{id} \otimes \Theta)$ is positive
(although not necessarily completely positive). This
condition is relatively easy to handle mathematically,
so that the best way to show that some $\rho_1$
cannot be converted to $\rho_2$ by LOCC is often to show
that this transition is impossible under PPTP. The
drawback of the PPTP protocol is that it may create 
some entanglement after all, namely arbitrary PPT 
states. So it properly belongs to a modified 
entanglement theory in which separability is 
replaced by the PPT condition. 


Converting Pure States and Majorization 


The entanglement ordering is exactly known for
pure states due to a famous theorem by Nielsen
(1999): a pure state $\rho_1$ is more entangled than a pure
state $\rho_2$ under the LOCC protocol iff the restriction
$\rho_1^A$ is more mixed than the restriction $\rho_2^A$ in the sense
of majorization of spectra (i.e., for every k the sum
of the k largest eigenvalues of $\rho_1^A$ is at most the
corresponding sum for $\rho_2^A$). Equivalently, there is a
doubly stochastic channel (completely positive linear
map preserving both the identity and the trace
functional) taking $\rho_2^A$ to $\rho_1^A$.
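
The majorization test is straightforward to implement once the spectra of the restricted states are known. The following sketch takes the two spectra as plain lists of eigenvalues; the function names and the two sample spectra are illustrative assumptions.

```python
import numpy as np

def is_more_mixed(spec1, spec2, tol=1e-12):
    """True if spec1 is majorized by spec2: every sum of the k largest
    entries of spec1 is at most the corresponding sum for spec2."""
    n = max(len(spec1), len(spec2))
    a = np.sort(np.pad(np.asarray(spec1, float), (0, n - len(spec1))))[::-1]
    b = np.sort(np.pad(np.asarray(spec2, float), (0, n - len(spec2))))[::-1]
    return bool(np.all(np.cumsum(a) <= np.cumsum(b) + tol))

def locc_convertible(spec1, spec2):
    """Majorization criterion for converting a pure state with restricted
    spectrum spec1 into one with restricted spectrum spec2."""
    return is_more_mixed(spec1, spec2)

singlet = [0.5, 0.5]     # spectrum of the restriction of a maximally entangled qubit pair
partial = [0.8, 0.2]     # a less entangled pure state
```

With these spectra the singlet can be converted into the less entangled state, but not conversely.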

An interesting aspect of this theory is the
phenomenon of catalysis: It may happen that
although $\rho_1$ cannot be converted by LOCC to
$\rho_2$, $\rho_1 \otimes \sigma$ can be converted to $\rho_2 \otimes \sigma$. The “catalyst”
$\sigma$ is a resource borrowed at the beginning of the
transformation, and is returned unchanged afterwards.
The order relation allowing such catalysts is
yet to be fully characterized.
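
Catalysis can be exhibited with small spectra. The sketch below uses a well-known example from the catalysis literature, usually attributed to Jonathan and Plenio; the spectra are those of the restricted states, and the helper `majorized_by` is an illustrative name.

```python
import numpy as np

def majorized_by(a, b, tol=1e-12):
    """True if every sum of the k largest entries of a is at most that of b."""
    a = np.sort(np.asarray(a, float))[::-1]
    b = np.sort(np.asarray(b, float))[::-1]
    return bool(np.all(np.cumsum(a) <= np.cumsum(b) + tol))

spec1 = np.array([0.4, 0.4, 0.1, 0.1])      # restricted spectrum of the source state
spec2 = np.array([0.5, 0.25, 0.25, 0.0])    # restricted spectrum of the target state
catalyst = np.array([0.6, 0.4])             # restricted spectrum of the catalyst

# Spectra of tensor products are the pairwise products of the eigenvalues.
direct = majorized_by(spec1, spec2)
assisted = majorized_by(np.kron(spec1, catalyst), np.kron(spec2, catalyst))
```

Here `direct` is false because the two largest eigenvalues of the source sum to 0.8, which exceeds 0.75, while `assisted` is true.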


Asymptotic Conversion 


In many applications we are not interested in exact 
conversion of one state to another, but are quite 
satisfied if the transformation can be done with a 
small controlled error. In particular, when we ask 
for the achievable conversion rate between many 
copies of the states involved, we allow small errors, 
but require the errors to go to zero. Given any
protocol, and states $\rho_1, \rho_2$, we say that $\rho_1$ can be
converted to $\rho_2$ with rate r if, for all sufficiently
large n, there is a channel of the protocol, which
takes n copies of $\rho_1$, that is, the state $\rho_1^{\otimes n}$, to a
state $\rho'$ which approximates roughly $m \approx rn$ copies
of $\rho_2$, in the sense that $m \ge rn$, and the trace norm
$\|\rho' - \rho_2^{\otimes m}\|$ goes to zero.

Of course, one is usually interested in the 
supremum of the achievable conversion rates, which 
we call simply the maximal conversion rate. In 
particular, when $\rho_2$ is the maximally entangled pure
state of a qubit pair (usually called the “singlet”), the
maximal rate is called the distillable entanglement
$E_D(\rho_1)$. In the other direction, when $\rho_1$ is the singlet,
we call the inverse of the maximal conversion rate the
entanglement cost $E_C(\rho_2)$. These are two of the key
entanglement measures to be discussed below.

In general, $E_D(\rho) \le E_C(\rho)$, so the asymptotic
conversion between different states is usually not
reversible. For pure states, however, the conversion
is reversible, and one finds


$E_D(\rho) = E_C(\rho) = S(\rho^A)$ [1]


where $S(\rho) = -\mathrm{tr}\,\rho \log_2 \rho$ denotes the von Neumann
entropy (see Entropy and Quantitative Transversality)
based on the binary logarithm.
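
For a pure state given as a vector, the right-hand side of eqn [1] can be computed from the Schmidt coefficients. The sketch below assumes a normalized state vector on $\mathbb{C}^{d_A} \otimes \mathbb{C}^{d_B}$ stored in the standard product basis; the function name and the two sample vectors are illustrative.

```python
import numpy as np

def entanglement_entropy(psi, dA, dB):
    """S(rho^A) in bits for a normalized pure state vector psi on C^dA (x) C^dB."""
    coeffs = np.asarray(psi, dtype=complex).reshape(dA, dB)
    schmidt = np.linalg.svd(coeffs, compute_uv=False)   # Schmidt coefficients
    p = schmidt ** 2                                    # spectrum of rho^A
    p = p[p > 1e-15]                                    # drop zeros before taking logs
    return float(-np.sum(p * np.log2(p)))

singlet = np.array([0, 1, -1, 0]) / np.sqrt(2)   # (|01> - |10>)/sqrt(2)
product = np.array([1, 0, 0, 0])                 # |00>

s_singlet = entanglement_entropy(singlet, 2, 2)
s_product = entanglement_entropy(product, 2, 2)
```

The singlet gives one bit, in line with the normalization of distillable entanglement and entanglement cost, and the product vector gives zero.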

Since one can do the conversion between different
pure states via singlets, it is clear that the maximal
conversion rate from a pure state $\rho_1$ to a pure state $\rho_2$
equals $S(\rho_1^A)/S(\rho_2^A)$. Hence, in contrast to the ordering
given by Nielsen’s theorem, all pure states are
interconvertible, and the ordering is described by a
single number. For this simplification, the allowance
of small errors is crucial. Without asymptotically
small but nonzero errors, it would also be impossible
to obtain singlets from any generic mixed state.


Entanglement Measures 
Properties of Interest 


We now consider more systematically functions
$E: \mathcal{S} \to \mathbb{R}$ defined on the state spaces of arbitrary
bipartite quantum systems. When can we regard such a
function as a measure of entanglement? The minimal requirements
are that $E(\rho) \ge 0$ for all $\rho$, and $E(\rho) = 0$ for
separable states. Since the choice of local bases
should be irrelevant, we will require
$E((U_A \otimes U_B)\rho(U_A \otimes U_B)^*) = E(\rho)$ for unitaries $U_A, U_B$. We
also normalize all entanglement measures so that
$E(\sigma) = 1$, when $\sigma$ is the maximally entangled state of
a pair of qubits. Beyond that, consider the following:


1. V (Convexity $E(\sum_\alpha p_\alpha \rho_\alpha) \le \sum_\alpha p_\alpha E(\rho_\alpha)$) Starting
from any E, possibly defined only on a subset
containing the pure states, we can enforce this
property by taking the convex hull (or “roof”)
co E, defined as the largest convex function
which is $\le E$ wherever E is defined.

2. M (Monotonicity) Suppose that some LOCC
protocol applied to $\rho$ returns some classical
parameter $\alpha$ with probability $p_\alpha$, and in that
case a bipartite state $\rho_\alpha$. Then $\sum_\alpha p_\alpha E(\rho_\alpha) \le E(\rho)$.

3. A. (Subadditivity $E(\rho_1 \otimes \rho_2) \le E(\rho_1) + E(\rho_2)$)
In this and the following, the tensor products of
bipartite states are to be reordered from
$A_1B_1A_2B_2$ to $(A_1A_2)(B_1B_2)$, so the separation
into Alice’s and Bob’s subsystems is respected.

4. A’ (Superadditivity $E(\rho_1 \otimes \rho_2) \ge E(\rho_1) + E(\rho_2)$)

5. A^ (Strong superadditivity $E(\rho_{12}) \ge E(\rho_1) + E(\rho_2)$)
Here $\rho_i$ denotes the restriction of a general
state $\rho_{12}$ to the ith subsystem.

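6. A* (Weak additivity $E(\rho^{\otimes n}) = nE(\rho)$) This can
be enforced by regularization, going from E to


$E^\infty(\rho) = \lim_{n \to \infty} \frac{1}{n} E(\rho^{\otimes n})$

Note that weak additivity is implied by additivity, which is
the conjunction of subadditivity and superadditivity.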

7. C (Continuity) Here it is crucial to postulate the
right kind of dimensional dependence. A good
choice is to demand that $|E(\rho_1) - E(\rho_2)| \le
\log d\, f(\|\rho_1 - \rho_2\|)$, where f is some function with
$\lim_{t \to 0} f(t) = 0$.

8. L (Lockability) A property related to, but not
equal to, discontinuity: a measure is called
lockable, if the loss (i.e., the tracing out) of a
single qubit by Alice or Bob can make $E(\rho)$ drop
by an arbitrarily large amount.


The Collection of Entanglement Measures 


The following are the main entanglement measures 
discussed in the literature. Note that all measures 
defined by conversion rates in principle depend on 
the protocol used. Unless otherwise stated, we will 
only consider LOCC. For every function we list in 
brackets the properties which are known. 


1. $E_F$ (Entanglement of formation [V, M, A ,C,L])
This is defined as the convex hull of the
entanglement of pure states given by eqn [1].
Closed formulas are known for qubit pairs
(Wootters 1998), for orthogonally invariant states
(Vollbrecht and Werner 2001) (see 00510), and for
permutation-symmetric two-mode Gaussians. One
of the big open questions is whether $E_F$ is
additive. This is equivalent to $E_F$ satisfying strong
superadditivity (property 5 above),
and also to the additivity of Holevo's $\chi$-capacity
of quantum channels (see Quantum Channels:
Classical Capacity).

2. $E_C$ (Entanglement cost [V, M,A ,A*,C,L]) This
was already defined in the section “Asymptotic
Conversion.” It has been shown to be equal to the
regularization of $E_F$, that is, $E_C = E_F^\infty$. If $E_F$ turned
out to be additive, we would thus have $E_C = E_F$.

3. $E_D$ (Distillable entanglement [M,A^^,A*])
Again, see the section “Asymptotic Conversion.”
This is one of the important measures from the
practical point of view, but notoriously difficult
to compute explicitly. Convexity of $E_D$ is an open
problem related to the existence of bound
entangled states that are not PPT.

4. $E_\to$ (One-way distillable entanglement [M, A^",
A*]) Same as $E_D$, but restricting to the LOWC
protocol. Obviously, $E_\to(\rho) \le E_D(\rho)$. There are
examples of strict inequality “<” (Bennett et al.
1996). $E_\to$ is more directly linked to quantum
capacity than $E_D$, which in turn corresponds to the
quantum capacity, allowing classical backwards
communication as a resource.

5. $E_N$ (Logarithmic negativity [M, A ,A",L]) This
is a quantitative companion of the PPT criterion: one
sets $E_N(\rho) = \log_2 \|(\mathrm{id} \otimes \Theta)(\rho)\|$, where the norm is
the trace norm. For PPT states $\rho$ this norm is equal to the
trace, and $E_N(\rho) = 0$. If the partial transpose has
negative eigenvalues, the sum of the absolute
values of all its eigenvalues is > 1, and $E_N(\rho) > 0$. $E_N$ is an easily
computed upper bound to $E_D$, but gives the wrong
value for nonmaximally entangled pure states.

6. $E_R$ (Relative entropy of entanglement
[V, M,A ]) This measure (Vedral et al. 1997)
is motivated geometrically: it is simply the
relative entropy distance of $\rho$ to the separable
subset: $E_R(\rho) = \inf_\sigma S(\rho\|\sigma)$, where $\sigma$ ranges over
all separable states. $E_R$ is an upper bound to $E_D$.
However, it can be improved by taking the
distance to the PPT states rather than the
separable states, and by combining with $E_N$, in
the following way:

7. $E_B$ (The Rains bound [V, M, A ,C]) Following
Rains (2001), we set


$E_B(\rho) = \inf_\sigma \left(S(\rho\|\sigma) + E_N(\sigma)\right)$


where the infimum is over all states $\sigma$. This is
still an upper bound to $E_D$, although clearly
smaller than both $E_R$ (take only separable $\sigma$)
and $E_N$ (take $\sigma = \rho$). No example of $E_D(\rho) <
E_B(\rho)$ is known, but any bound entangled non-PPT
state would be such an example.

8. $E_S$ (Squashed entanglement [V, M, A , A’,
C, L]) This measure, introduced by Christandl
and Winter (2004), amazingly has all the good
properties, but is as difficult to compute as any
of the other measures. $E_S(\rho^{AB})$ is the infimum
over the entropy combination


$S(\rho^{AC}) + S(\rho^{BC}) - S(\rho^{ABC}) - S(\rho^{C})$


over all extensions $\rho^{ABC}$ of the given state $\rho^{AB}$ to
a system enlarged by a part C, where the density
operators in the above expression are the
restrictions of $\rho^{ABC}$ to the subsystems indicated.

9. $E_K$ (Key rate [V, M, A ]) The bit rate at which
secret key can be generated is certainly at least
$E_D$, since distillation is one way to do it. It
is, in general, strictly larger, since there are
undistillable states with positive key rate.

10. $E_c$ (Concurrence [V]) This measure was
originally only defined for qubit pairs, as a
step in Wootters’ (1998) formula for $E_F$ in this
case. It has an extension to arbitrary dimensions
(Rungta et al. 2001), namely the convex hull of
the function


$c(|\psi\rangle\langle\psi|) = \sqrt{2(1 - \mathrm{tr}(\rho^2))}$, where
$\rho = (|\psi\rangle\langle\psi|)^A$ is the reduced density operator.


Both upper and lower bounds exist in the
literature. The main interest in this measure
stems from the fact that it has interesting
extensions to the multipartite case.
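
Of the measures listed, the logarithmic negativity is the one that can be evaluated directly from the density matrix. The following sketch assumes the density matrix is stored in the standard product basis of $\mathbb{C}^{d_A} \otimes \mathbb{C}^{d_B}$; the function names and the sample states are illustrative.

```python
import numpy as np

def partial_transpose(rho, dA, dB):
    """Transpose the second tensor factor of a density matrix on C^dA (x) C^dB."""
    r = rho.reshape(dA, dB, dA, dB)
    return r.transpose(0, 3, 2, 1).reshape(dA * dB, dA * dB)

def log_negativity(rho, dA, dB):
    """log2 of the trace norm of the partial transpose."""
    eig = np.linalg.eigvalsh(partial_transpose(rho, dA, dB))
    return float(np.log2(np.sum(np.abs(eig))))

def pure(v):
    v = np.asarray(v, dtype=complex)
    return np.outer(v, v.conj())

bell = pure(np.array([1, 0, 0, 1]) / np.sqrt(2))
mixed = np.eye(4) / 4                                    # separable, hence PPT
tilted = pure([np.sqrt(0.8), 0, 0, np.sqrt(0.2)])        # nonmaximally entangled

en_bell = log_negativity(bell, 2, 2)
en_mixed = log_negativity(mixed, 2, 2)
en_tilted = log_negativity(tilted, 2, 2)
entropy_tilted = -(0.8 * np.log2(0.8) + 0.2 * np.log2(0.2))
```

For the third state the value equals $\log_2 1.8$, which is larger than the entropy of the restriction given by eqn [1]; this is the sense in which the logarithmic negativity gives the wrong value for nonmaximally entangled pure states.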


To conclude, we would like to point out that 
many of the themes discussed in this article were set 
by Bennett et al. (1996); their article is worth 
reading even today. Good review articles covering 
entanglement measures, with more complete references,
are Plenio and Virmani (2005), Bruß (2002),
and Donald et al. (2002).


See also: Entanglement; Entropy and Quantitative
Transversality; Quantum Channels: Classical Capacity.


Introduction


A mathematical law for a physical phenomenon,
describing the variation of a value $y \in \mathbb{R}$ in terms
of parameters $x_i \in \mathbb{R}$, $i \in \{1,\ldots,n\}$, is usually
given:


1. in the simplest cases (and hence in exceptional
cases), by an explicit functional equation
$y = F(x_1,\ldots,x_n)$, or

2. by an implicit equation $G(y, x_1,\ldots,x_n) = 0$, or

3. more generally, by a partial differential
equation,


$H\left(y, \frac{\partial y}{\partial x_i}, \ldots, \frac{\partial^k y}{\partial x_{i_1} \cdots \partial x_{i_k}}, x_1,\ldots,x_n\right) = 0$ + initial values

In the first case, the exact equation $y = F(x_1,\ldots,x_n)$
fully describes the behavior of y as $(x_1,\ldots,x_n)$
vary, but in practice this is more information than
one needs: using the Taylor formula, knowledge
of the value $y^0$ at some point $(x_1^0,\ldots,x_n^0)$ and of the
value of


$\nabla F_{(x_1^0,\ldots,x_n^0)} = \left(\frac{\partial F}{\partial x_1}(x_1^0,\ldots,x_n^0), \ldots, \frac{\partial F}{\partial x_n}(x_1^0,\ldots,x_n^0)\right)$


is enough to predict, with controlled accuracy, by
linear approximation, the behavior of y for parameters
$(x_1,\ldots,x_n)$ close to $(x_1^0,\ldots,x_n^0)$.

In the case (2), both the parameters $(x_1,\ldots,x_n)$ and
the value y belong to the set $M = \{(y,x_1,\ldots,x_n) \in
\mathbb{R}^{n+1} : G(y,x_1,\ldots,x_n) = 0\}$, and we would like to
know whether or not this set may be (at least locally
around one of its points $(y^0,x_1^0,\ldots,x_n^0)$) the graph of
some function $(x_1,\ldots,x_n) \mapsto y = F(x_1,\ldots,x_n)$, as in
the case (1). Using the implicit function theorem, we
may try to reduce our equation to the explicit
equation of (1), and then perform a linear approximation
involving $\nabla F_{(x_1^0,\ldots,x_n^0)}$. Assuming that a priori
we know a value $y^0$ such that
$(y^0,x_1^0,\ldots,x_n^0) \in M$, this reduction is
possible, locally around $(y^0,x_1^0,\ldots,x_n^0)$, under the
condition that


$\frac{\partial G}{\partial y}(y^0,x_1^0,\ldots,x_n^0) \ne 0$


In this situation


$\nabla F_{(x_1^0,\ldots,x_n^0)} = -\left(\frac{\partial G}{\partial x_1}(y^0,x_1^0,\ldots,x_n^0), \ldots, \frac{\partial G}{\partial x_n}(y^0,x_1^0,\ldots,x_n^0)\right) \Big/ \frac{\partial G}{\partial y}(y^0,x_1^0,\ldots,x_n^0)$

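Now, as is normally the case when they come
from observation, the variables $x_1,\ldots,x_n$ are known
only up to some error, and one sees that the larger


$\left\|\left(\frac{\partial G}{\partial x_1}, \ldots, \frac{\partial G}{\partial x_n}\right)(y^0,x_1^0,\ldots,x_n^0)\right\| \Big/ \left|\frac{\partial G}{\partial y}(y^0,x_1^0,\ldots,x_n^0)\right|$


is, the worse the estimate on y near $y^0$.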

Furthermore, assuming that M is locally a graph
of a function $(x_1,\ldots,x_n) \mapsto y = F(x_1,\ldots,x_n)$, for a
given $(x_1,\ldots,x_n)$, the exact expression of
$y = F(x_1,\ldots,x_n)$ and consequently the exact value
of $\nabla F_{(x_1,\ldots,x_n)}$ is not possible to obtain; we have to
approach it using an algorithm (classically the
Newton algorithm), and the closer


$\frac{\partial G}{\partial y}(y^0,x_1^0,\ldots,x_n^0)$


is to 0, the more such an algorithm is unstable.
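
The loss of control near a vanishing partial derivative can be seen on a toy example. The sketch below assumes the hypothetical implicit equation $G(y,x) = y^2 - x = 0$ with a single parameter, solves it for y by Newton's method, and evaluates the ratio of partial derivatives that governs the error on y; all function names are illustrative.

```python
def newton_solve(G, dG_dy, x, y_start, steps=50):
    """Solve G(y, x) = 0 for y by Newton's method at a fixed parameter x."""
    y = y_start
    for _ in range(steps):
        y = y - G(y, x) / dG_dy(y, x)
    return y

# Hypothetical example: G(y, x) = y**2 - x, whose zero set is a parabola.
G = lambda y, x: y * y - x
dG_dy = lambda y, x: 2.0 * y
dG_dx = lambda y, x: -1.0

def sensitivity(x):
    """|dy/dx| = |dG/dx| / |dG/dy| on the branch y > 0."""
    y = newton_solve(G, dG_dy, x, y_start=1.0)
    return abs(dG_dx(y, x)) / abs(dG_dy(y, x))

far = sensitivity(1.0)       # dG/dy = 2 at the solution
near = sensitivity(1e-6)     # dG/dy = 0.002 at the solution
```

In this example an error on x is halved far from the fold of the parabola, but amplified by a factor of about 500 close to it.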

Finally, in the case (3), skipping technical details,
we encounter the same type of difficulties: we have
to avoid small values for some gradient functions at
a given point, in order to obtain, locally at some
point $(x_1^0,\ldots,x_n^0)$, in a stable way, reliable information
on y in terms of $(x_1,\ldots,x_n)$.

To sum up, the prediction of a physical phenomenon
by a mathematical law greatly depends not
only on the nonvanishing of some gradient
functions, but, as we deal with approximations and
algorithms, on how different those gradient functions
are from zero.

This principle, of course, extends directly to 
applied problems (see the last of our examples in 
the final section): being close to singular values 
essentially means that the control (e.g., of the 
positions of some device by a manipulator) is poor. 

The geometric counterpart of this analytic phenomenon
is called “transversality”: the condition
for some function G to have a nonzero partial
derivative


$\frac{\partial G}{\partial y}(y^0,x_1^0,\ldots,x_n^0)$


is equivalent to the condition


$\nabla G_{(y^0,x_1^0,\ldots,x_n^0)} \notin Ox_1 \cdots x_n = \mathbb{R}^n$


or to the condition


$T_{(y^0,x_1^0,\ldots,x_n^0)}M + Oy = \mathbb{R}^{n+1}$


where $Ox_1 \cdots x_n$ is the space of parameters; one then says
that M is transverse to Oy at $(y^0,x_1^0,\ldots,x_n^0)$.
For some quantity $\epsilon > 0$, the condition that


$\left|\frac{\partial G}{\partial y}(y^0,x_1^0,\ldots,x_n^0)\right| \Big/ \left\|\left(\frac{\partial G}{\partial x_1}, \ldots, \frac{\partial G}{\partial x_n}\right)(y^0,x_1^0,\ldots,x_n^0)\right\| \le \epsilon$


means that the angle $\alpha = (\nabla G_{(y^0,x_1^0,\ldots,x_n^0)}, Ox_1 \cdots x_n)$
or the angle $\beta = (T_{(y^0,x_1^0,\ldots,x_n^0)}M, Oy)$ is smaller than $\epsilon$
(see Figure 1).


Figure 1 Transversality of the manifold M and Oy.


Our purpose in the sequel is to indicate how we
can quantify the situations described above (the
defect of transversality), in order to generically or
almost generically avoid them with quantified
accuracy.




Quantifying Transversality 


Given two submanifolds M and N of the Euclidean
space $\mathbb{R}^n$, we can measure the transversality defect
of (M,N) at $x \in \mathbb{R}^n$ with a differential criterion,
both analytical and geometric.

Let us first introduce some notations. For a given
linear map $L: \mathbb{R}^n \to \mathbb{R}^p$, the image by L of the unit
ball of $\mathbb{R}^n$ is an r-dimensional ellipsoid in $\mathbb{R}^p$ with
semi-axes denoted as $l_1(L) \ge \cdots \ge l_r(L)$, where r is
the rank of L. For $r < p$, we denote $l_{r+1}(L) = \cdots = l_p(L) = 0$.

Now, let $x \in M \cap N$; let $\pi: \mathbb{R}^n \to T_xN^\perp$ be the
projection onto the orthogonal space of $T_xN$, $p = n -
\dim(N)$ and $\pi_M$ the restriction of $\pi$ to M.
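
The semi-axes of the image ellipsoid are the singular values of the matrix of L, so they can be computed with a singular value decomposition. The sketch below pads the list with zeros up to length p, following the convention just introduced; the function name is illustrative.

```python
import numpy as np

def semi_axes(L):
    """Semi-axes l_1 >= ... >= l_p of the image of the unit ball under
    L: R^n -> R^p (a p x n matrix), padded with zeros beyond the rank."""
    L = np.atleast_2d(np.asarray(L, dtype=float))
    p = L.shape[0]
    s = np.linalg.svd(L, compute_uv=False)   # min(n, p) singular values, descending
    out = np.zeros(p)
    out[:len(s)] = s
    return out

flat = semi_axes([[3.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0]])          # rank 1 map R^3 -> R^2
thin = semi_axes([[1.0], [1.0]])             # map R^1 -> R^2
```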


Definitions 


We say that (M, N) is transverse at x, and we denote
it by $M \pitchfork_x N$, if and only if $\pi_M$ is a submersion at x,
that is, $D\pi_{M(x)}: T_xM \to T_xN^\perp$ is onto.

For a given $\Lambda = (\epsilon_1,\ldots,\epsilon_p)$, $\epsilon_1 \ge \cdots \ge \epsilon_p$, we say
that (M,N) is $\Lambda$-nontransverse at x, and we denote it
by $M \not\pitchfork_x^\Lambda N$, if and only if $l_i(D\pi_{M(x)}) \le \epsilon_i$,
$i \in \{1,\ldots,p\}$.

With these notations, we have: $M \not\pitchfork_x N$ (i.e., (M, N)
nontransverse at x) if and only if $x \in M \cap N$ and $M \not\pitchfork_x^\Lambda N$
for some $\Lambda$ with $\epsilon_p = 0$, and the more (M,N) is
$\Lambda$-nontransverse, with $\Lambda$ close to $(\epsilon_1,\ldots,\epsilon_{p-1},0)$, the less
the manifolds M and N seem transverse at $x \in M \cap N$
(see Figure 2).

The final step in our formalism to give a convenient
quantitative approach to transversality is the following:
let X, Y be two (real) Riemannian manifolds, $f: X \to Y$
a (smooth) mapping, $N \subset Y$ a submanifold of Y with
codimension p in Y, $x \in X$ a point with $y = f(x) \in N$, and
$\Phi: O \to \mathbb{R}^p$ a submersion, where O is an open neighborhood of y in
Y, such that $\Phi^{-1}(\{0\}) = N \cap O$. Then we say that (f, N)
is transverse at x, and we denote it by $f \pitchfork_x N$, if and only
if $\Phi \circ f$ is submersive at x.

For a given $\Lambda = (\epsilon_1,\ldots,\epsilon_p)$, $\epsilon_1 \ge \cdots \ge \epsilon_p$, we say
that (f, N) is $(\Phi,\Lambda)$-nontransverse at x, and we
denote it by $f \not\pitchfork_x^{\Phi,\Lambda} N$, if and only if $l_i(D[\Phi \circ f]_{(x)}) \le \epsilon_i$,
$i \in \{1,\ldots,p\}$.

Clearly, we recognize the definition of transversality
and of $\Lambda$-nontransversality of two submanifolds
M, N of $\mathbb{R}^n$ by letting $f: M \to \mathbb{R}^n$ be the
inclusion and $\Phi = \pi$ (for more details on transversality
and stability, see, e.g., Golubitsky and Guillemin
(1973)).

With the definitions and notations above, our 
general problem may be posed as follows: 


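For a $C^k$-regular ($k \in \mathbb{N} \cup \{\infty\}$) mapping $f: \mathbb{R}^n \to \mathbb{R}^p$
and a given $\Lambda = (\epsilon_1,\ldots,\epsilon_p)$, how large is the set
$\Delta(f,B_r,\Lambda) = f(\Sigma(f,B_r,\Lambda))$, where $\Sigma(f,B_r,\Lambda) = \{x \in
B_r \subset \mathbb{R}^n;\ l_i(Df_{(x)}) \le \epsilon_i\ \forall i \in \{1,\ldots,p\}\}$ and $B_r$ is a ball
of radius r in $\mathbb{R}^n$?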
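
The set in question can be explored numerically by sampling. The sketch below assumes the hypothetical test map $f(x_1,x_2) = x_1^2 + x_2^2$ from the plane to the line (so p = 1), for which the single semi-axis of the differential is the norm of the gradient; the grid size and function names are illustrative.

```python
import numpy as np

def f(x):
    """Hypothetical test map R^2 -> R."""
    return x[0] ** 2 + x[1] ** 2

def jacobian(x):
    """Df at x, as a 1 x 2 matrix."""
    return 2.0 * x.reshape(1, 2)

def almost_critical_values(eps, r=1.0, grid=101):
    """Sample the ball B_r on a grid and return f(x) at the points
    where the semi-axis l_1(Df_x) is at most eps."""
    t = np.linspace(-r, r, grid)
    values = []
    for a in t:
        for b in t:
            if a * a + b * b > r * r:
                continue                      # outside the ball
            x = np.array([a, b])
            l = np.linalg.svd(jacobian(x), compute_uv=False)
            if l[0] <= eps:
                values.append(f(x))
    return np.array(values)

eps = 0.2
vals = almost_critical_values(eps)
```

For this map the almost-critical points form a disk of radius eps/2, so the sampled almost-critical values lie in the interval from 0 to eps^2/4, whose length shrinks quadratically with eps.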


Figure 2 Almost-nontransversality of M and N.


The “bad” set $\Delta(f,B_r,\Lambda)$ is called the set of
$\Lambda$-almost critical values of f (restricted to $B_r$). Our
purpose is to show that one can control its size in
terms of k and $\Lambda$. However, before explicitly stating
quantitative results, let us make precise what we mean
by “big set” or by “size of a set.”


Measure and Dimensions 


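We have a very natural way to measure a subset A
of a metric space. To do this, we consider a real
number $\alpha > 0$ and we denote


$\mathcal{A}_\epsilon = \left\{(D_i)_{i \in \mathbb{N}};\ A \subset \bigcup_{i \in \mathbb{N}} D_i \text{ and } |D_i| \le \epsilon\right\}$


where $|D_i|$ is the diameter of $D_i$,


$\mathcal{H}^\alpha_\epsilon(A) = \inf\left\{\sum_{i \in \mathbb{N}} |D_i|^\alpha;\ (D_i)_{i \in \mathbb{N}} \in \mathcal{A}_\epsilon\right\}$


and


$\mathcal{H}^\alpha(A) = \lim_{\epsilon \to 0} \mathcal{H}^\alpha_\epsilon(A) \in \mathbb{R}_+ \cup \{\infty\}$


$\mathcal{H}^\alpha(A)$ is called the $\alpha$-dimensional Hausdorff
measure of A. It appears that when $\mathcal{H}^\alpha(A) \ne \infty$,
$\mathcal{H}^{\alpha'}(A) = 0$ for $\alpha' > \alpha$, and when $\mathcal{H}^\alpha(A) \ne 0$,
$\mathcal{H}^{\alpha'}(A) = \infty$ for $\alpha' < \alpha$. This gives rise to the
following definition of the Hausdorff dimension
of A:


$\dim_H(A) = \inf\{\alpha;\ \mathcal{H}^\alpha(A) = 0\}
= \sup\{\alpha;\ \mathcal{H}^\alpha(A) = \infty\}$


The Hausdorff dimension generalizes the classical
notions of dimension: for instance, when A is a
subset of $\mathbb{R}^n$, $\dim_H(A) \le n$, a d-dimensional manifold
has Hausdorff dimension d, and $\mathcal{H}^n(A)$ is the
same, up to a constant factor depending only on n, as
the Lebesgue measure $\mathcal{L}_n$ of A (for a very
large class of subsets A, which we do not describe
here). For more details on geometric measure theory,
see Falconer (1986) and Federer (1969).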

Another convenient notion of dimension is the
(metric) entropy dimension. Let us briefly define it.
For a bounded subset A in some metric space and a
real number $\alpha > 0$, we denote by $M(\alpha, A)$ the minimal
number of closed balls of radius $\le \alpha$ covering A.
$H_\alpha(A) = \log_2(M(\alpha, A))$ is called the $\alpha$-entropy of the
set A. This terminology was introduced in
Kolmogorov and Tihomirov (1961) and reflects the
fact that $H_\alpha(A)$ is the amount of information needed
to digitally memorize A with accuracy $\alpha$. The
entropy dimension of A, $\dim_e(A)$, is the order of
$M(\alpha, A)$ as $\alpha \to 0$. Precisely,


$\dim_e(A) = \limsup_{\alpha \to 0} \frac{\log(M(\alpha, A))}{\log(1/\alpha)}
= \inf\{\beta;\ M(\alpha, A) \le (1/\alpha)^\beta \text{ for sufficiently small } \alpha\}$


We clearly have


$\dim_H(A) \le \dim_e(A)$

For any bounded set A in $\mathbb{R}^n$, we can bound $M(\alpha, A)$
from above by a polynomial in $1/\alpha$ (see Ivanov
(1975) and Yomdin and Comte (2004)):


$M(\alpha, A) \le c(n) \sum_{i=0}^{n} V_i(A) \left(\frac{1}{\alpha}\right)^i$


where c(n) only depends on n and $V_i(A)$ (the ith
variation of the set A) is the mean value, with
respect to P (for a suitable measure), of the number
of connected components of $A \cap P$, with P an affine
(n − i)-dimensional space of $\mathbb{R}^n$.

Since for A contained in a d-dimensional manifold,
$V_i(A) = 0$ for $i > d$, we deduce from this
inequality that in this case $M(\alpha, A)$ is bounded
from above by a polynomial of degree $\le d$ in $1/\alpha$.

Our goal is to explain that we can be more precise
than this general inequality when A is a set of
critical or almost-critical values of a $C^k$ mapping.


Transversality Is a Generic Situation 


The results in this section concern critical values,
and not almost-critical values. They show that a
“generic” point of the target space is not a critical
value, and that the more regular the mapping, the
smaller the set of critical values. Such theorems
relating the regularity of a mapping and the size of
its set of critical values are called Morse-Sard type
theorems (see Sard (1942, 1958, 1965)). The
simplest theorem in this direction is the following:


Theorem 1 ($C^\infty$ Morse-Sard theorem) (Morse
1939, Sard 1942, Holm 1987). Let $f: \mathbb{R}^n \to \mathbb{R}^p$
be a $C^\infty$-regular mapping. Then $\mathcal{H}^p(\Delta(f,B_r)) = 0$,
where $\Delta(f,B_r) = f(\Sigma(f,B_r))$ and $\Sigma(f,B_r)$ is the set of
points $x \in B_r$ where $\mathrm{rank}(Df_{(x)}) < p$.


The set $\Delta(f, B_r)$ is the image, under $f$, of the points
of the ball $B_r$ in the source space at which $f$ is not
submersive, that is, the set of critical values of $f$.
Consequently, the Morse-Sard theorem ensures that
for almost all points $y$ in the target space, $f^{-1}(\{y\})$ is
either empty or a smooth submanifold of the source
space of dimension $n - p$.

Note that $\Delta(f, B_r) = \Delta(f, B_r, \Lambda)$ for some convenient
$\Lambda = (\varepsilon_1, \ldots, \varepsilon_p)$ with $\varepsilon_p = 0$, because
$x \mapsto \lambda_i(Df_{(x)})$ is bounded on $B_r$, for all $i \in \{1, \ldots, p\}$.

Now, we can concentrate our attention on more
singular points than the critical ones, those at which
the rank of $f$ is at most a prescribed $\nu$. Let us denote the
set of corresponding values by $\Delta^\nu(f, B_r)$, for $\nu < p$. By definition,
$\Delta^\nu(f, B_r) = f(\Sigma^\nu(f, B_r))$, where
$\Sigma^\nu(f, B_r) = \{x \in B_r \subset \mathbb{R}^n;\ \operatorname{rank}(Df_{(x)}) \le \nu\}$.
With these notations, the result
for rank-$\nu$ critical values is the following:


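Theorem 2 ($C^k$ Morse-Sard theorem for rank-$\nu$
critical values) (Federer 1969). Let $f: \mathbb{R}^n \to \mathbb{R}^p$ be
a $C^k$-regular mapping. Then
$\mathcal{H}^{\nu + (n-\nu)/k}(\Delta^\nu(f, B_r)) = 0$. In particular,

\[ \dim_H(\Delta^\nu(f, B_r)) \le \nu + \frac{n - \nu}{k} \]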


One can produce examples showing that the
bound of Theorem 2 is sharp (see
Comte (1996), Whitney (1935), Grinberg (1985),
and Yomdin and Comte (2004)).

We note that Theorem 1 is a corollary of Theorem 2
(just replace $k$ by $\infty$ and $\nu$ by $p - 1$ in Theorem 2).
This result tells nothing about the entropy dimension
of $\Delta^\nu(f, B_r)$; in the next section, we will
bound the growth of entropy of almost-critical
values.


Almost-Transversality Is Almost Generic 


In this section, $f: \mathbb{R}^n \to \mathbb{R}^p$ is a $C^k$ mapping. We
denote by $K$ a Lipschitz constant of $D^{k-1}f$ on $B_r$ and
by $R_k(f)$ the quantity $(K/(k - 1)!) \cdot r^k$. We have:


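Theorem 3 ($C^k$ quantitative Morse-Sard theorem)
(Yomdin 1983, Yomdin and Comte 2004). Let
$f: \mathbb{R}^n \to \mathbb{R}^p$ be a $C^k$ mapping,
$\Lambda = (\varepsilon_1, \ldots, \varepsilon_p)$,
$\varepsilon_1 \ge \cdots \ge \varepsilon_p$, and let us denote $\varepsilon_0 = 1$. We have
(for $\alpha \le R_k(f)$):

\[ M(\alpha, \Delta(f, B_r, \Lambda)) \le C \sum_{i=0}^{p} \varepsilon_0 \cdots \varepsilon_i
   \left(\frac{r}{\alpha}\right)^{i} \left(\frac{R_k(f)}{\alpha}\right)^{(n-i)/k} \]

where $C$ is a constant depending only on $n$, $p$, and $k$.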


As a corollary, one can bound the entropy
dimension of $\Delta^\nu(f, B_r)$ by $\nu + (n - \nu)/k$, and hence
its Hausdorff dimension, again finding Theorem 2:
we just have to put $\varepsilon_{\nu+1} = 0$ and
$\varepsilon_1, \ldots, \varepsilon_\nu$ large
enough, that is, $\varepsilon_j \ge \lambda_j(Df_{(x)})$ for all $x \in B_r$, in
Theorem 3, to obtain:


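Theorem 4 ($C^k$ entropy Morse-Sard theorem)
(Yomdin 1983, Yomdin and Comte 2004). Let
$f: \mathbb{R}^n \to \mathbb{R}^p$ be a $C^k$ mapping, and let us denote $\varepsilon_0 = 1$
and $\varepsilon_i = \sup\{\lambda_i(Df_{(x)});\ x \in B_r\}$, for $i \in \{1, \ldots, \nu\}$. We
have (for $\alpha \le R_k(f)$):

\[ M(\alpha, \Delta^\nu(f, B_r)) \le C \sum_{i=0}^{\nu} \varepsilon_0 \cdots \varepsilon_i
   \left(\frac{r}{\alpha}\right)^{i} \left(\frac{R_k(f)}{\alpha}\right)^{(n-i)/k} \]

where $C$ is a constant depending only on $n$, $p$, and $k$.
In particular,

\[ \dim_H(\Delta^\nu(f, B_r)) \le \dim_e(\Delta^\nu(f, B_r)) \le \nu + \frac{n - \nu}{k} \]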


Again we have examples showing that this bound 
is sharp (see Yomdin and Comte 2004). 

Furthermore, the mapping $f$ in Theorems 2-4 may
be of real differentiability class (Hölder smoothness
class $C^k$), with the same conclusions in these
theorems. That is, $k$ may be a real number written
as $k = p + \beta$ with $\beta \in [0, 1]$, $p \in \mathbb{N} \setminus \{0\}$, and $f$ is $C^k$
means that $f$ is $p$ times differentiable and there exists
a constant $C > 0$ such that for all $x, y \in B_r$,
$\|D^p f_{(x)} - D^p f_{(y)}\| \le C \cdot \|x - y\|^{\beta}$ (see Yomdin and Comte
(2004)).


Examples 


Let us denote by $A$ the set of real polynomial
mappings of degree $d$ and of the following type:

\[ x \mapsto Q(a, x) = 1 + \sum_{j=1}^{d} a_j x^j \]

with $a = (a_1, \ldots, a_d)$ and $\|a\| \le 1$ (where $\|\cdot\|$ is the
Euclidean norm of $\mathbb{R}^d$). We identify the set $A$ with
$B^d(0, 1) = \{a \in \mathbb{R}^d;\ \|a\| \le 1\}$.

We want to bound the $\alpha$-entropy of the set of
such polynomials for which the real roots are
multiple or almost multiple.

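We denote by $V$ the set $V = \{(a, x) \in \mathbb{R}^{d+1};\ Q(a, x) = 0\}$.
At points $(a, x)$ of $V$ with $\nabla Q_{(a,x)} \ne 0$,
$V$ is a $C^\infty$ manifold of codimension 1 of $\mathbb{R}^{d+1}$.
We denote by $V^{\mathrm{reg}} = \{(a, x) \in V;\ \nabla Q_{(a,x)} \ne 0\}$
and by $V^{\mathrm{sing}} = \{(a, x) \in V;\ \nabla Q_{(a,x)} = 0\} = V \setminus V^{\mathrm{reg}}$.
By Whitney (1957), $V^{\mathrm{sing}}$ is a union of smooth
manifolds of dimension $\le d - 1$.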

A root $x$ of a polynomial $Q(a, \cdot)$ is multiple if and
only if

\[ Q(a, x) = \frac{\partial Q}{\partial x}(a, x) = 0 \]


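Consequently, the set $A^{\mu}$ of polynomials of $A$
with multiple roots is $\pi(V^{\mathrm{sing}}) \cup \Delta(\pi|_{V^{\mathrm{reg}}})$, where
$\pi: \mathbb{R}^{d+1} \to \mathbb{R}^d$ is the standard projection $\pi(a, x) = a$,
and $\Delta(\pi|_{V^{\mathrm{reg}}})$ is the set
$\{\pi(a, x);\ (a, x) \in V^{\mathrm{reg}},\ Ox \subset T_{(a,x)}V^{\mathrm{reg}}\}$
of critical values of $\pi|_{V^{\mathrm{reg}}}$. By Sard's theorem
(Theorem 2), $\dim_H(\Delta(\pi|_{V^{\mathrm{reg}}})) \le d - 1$. Since
$\dim_H(\pi(V^{\mathrm{sing}})) \le d - 1$, we obtain $\dim_H(A^{\mu}) \le d - 1$:
thus, having distinct roots is a generic property.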
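
The multiplicity criterion used here (a common zero of the polynomial and of its derivative in $x$) can be tested exactly with rational arithmetic. The sketch below is illustrative only: the helper names are hypothetical, and it detects a common root over the complex numbers through a polynomial gcd, which is a slightly broader condition than the real multiple roots considered in the text.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients (coefficients are stored low degree first)."""
    while p and p[-1] == 0:
        p = p[:-1]
    return p

def poly_rem(p, q):
    """Remainder of p divided by q, for nonzero q."""
    p, q = trim(list(p)), trim(list(q))
    while len(p) >= len(q) and p:
        factor = p[-1] / q[-1]
        shift = len(p) - len(q)
        for i, c in enumerate(q):
            p[shift + i] -= factor * c
        p = trim(p)
    return p

def poly_gcd(p, q):
    """Euclidean algorithm; the result is a gcd up to a nonzero constant factor."""
    p, q = trim(list(p)), trim(list(q))
    while q:
        p, q = q, poly_rem(p, q)
    return p

def derivative(p):
    return [k * c for k, c in enumerate(p)][1:]

def has_multiple_root(coeffs):
    """True if the polynomial and its derivative share a (possibly complex) root."""
    p = [Fraction(c) for c in coeffs]
    return len(poly_gcd(p, derivative(p))) > 1
```

For instance, $(1 + \tfrac{2}{5}x)^2 = 1 + \tfrac{4}{5}x + \tfrac{4}{25}x^2$ lies in the unit ball of coefficients and is flagged by `has_multiple_root(["1", "4/5", "4/25"])`, while `has_multiple_root(["1", "1/2", "1/4"])` returns `False`.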
Let, as above, A=(e1,...,€y) with ei > =- > €, 
and «9 — 1. A root x of a polynomial O(a,-) € A is 
said to be A-almost multiple if and only if 
O(a,x)=0 and V gf^Ox, that is, (a,x) € V9"8 or 
sin(Ti;,4, V*8$,Ox) € ej. This condition only con- 
cerns e; and we can take & = --- —e4 4,—1. We 
denote A™ to be the set of polynomials of A with 
(at least) a A-almost multiple root. By Theorem 3, 


d—1 B) 51 
x e 6 « i 

ied We 

But z(V?"&) being a finite union of manifolds of 
dimension at most d — 1, we finally obtain 


M(o, A^) < C'. Y BEZ oj 


Thus, having no $\Lambda$-almost multiple root is $\Lambda$-almost
a generic property. In Figure 3, we represent $V$ for
$d = 3$ and $a_3 = 1$,


M(a, AP^N«(V*8)) < C- 


W= flax) € pas, Pig, = T 


Ox 


Ox 


Figure 3 The space of polynomials of type $1 + a_1 x + a_2 x^2 + x^3$
with almost-multiple roots.

Figure 4 Almost-critical points of the distance function of $P$ to
the origin.


The next example comes from robotics: let us
consider a planar robotic manipulator consisting of
two jointed bars of length $a$ and $b$, as presented in
Figure 4. We may parametrize the positions of the
endpoint $P$ of this device by the angles $\theta$ and $\psi$ (see
Figure 4). Now the distance $r$ from the origin to $P$ satisfies
$r^2 = |P|^2 = a^2 + b^2 + 2ab\cos(\psi)$. The critical points
of $r$ are given by

\[ \frac{\partial r^2}{\partial \psi}(\psi) = -2ab\sin(\psi) = 0 \]


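and correspond to the circle $\psi = 0$. The critical value
of $r$ is $a + b$. Near these critical positions, the control
of $r$ with respect to $\psi$ is poor; we would like to avoid
those near-critical values. Given $\varepsilon > 0$, the condition

\[ \left| \frac{\partial r^2}{\partial \psi}(\psi) \right| \le \varepsilon \]

implies $|\psi| \le \arcsin(\varepsilon/2ab)$, and the $\varepsilon$-near-critical
values of $r$ satisfy

\[ r_{\max}^2 - r^2 \le 2ab\,[1 - \cos(\arcsin(\varepsilon/2ab))] \]

where $r_{\max}$ is $a + b$; thus, they are contained in an
interval of length $\le c \cdot \varepsilon^2/(4ab \cdot r_{\max})$, and
$M(\alpha, \Delta(r, \varepsilon)) \le c \cdot \varepsilon^2/(4ab \cdot r_{\max} \cdot \alpha)$ (Theorem 3
gives $M(\alpha, \Delta(r, \varepsilon)) \le C(1 + \varepsilon/\alpha)$).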
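
The estimate for the manipulator can be checked numerically. The following Python sketch samples the angle, keeps the positions where the derivative of $r^2$ is at most $\varepsilon$ in absolute value, and compares the spread of the resulting values of $r$ with the explicit bound $\varepsilon^2/(2ab \cdot r_{\max})$, which corresponds to the constant $c = 2$ in the estimate above. The function names, the sampling scheme, and the restriction to angles in $[-\pi/2, \pi/2]$ (the neighbourhood of the critical circle discussed in the text) are assumptions made for illustration.

```python
import math

def near_critical_values(a, b, eps, samples=20001):
    """Values r(psi), psi in [-pi/2, pi/2], at which |d(r^2)/dpsi| <= eps."""
    values = []
    for k in range(samples):
        psi = -math.pi / 2 + math.pi * k / (samples - 1)
        if abs(-2 * a * b * math.sin(psi)) <= eps:
            values.append(math.sqrt(a * a + b * b + 2 * a * b * math.cos(psi)))
    return values

def interval_length_bound(a, b, eps):
    """Bound eps^2 / (2ab r_max) on r_max - r over the near-critical set (eps <= 2ab)."""
    return eps ** 2 / (2 * a * b * (a + b))
```

The bound follows from $1 - \sqrt{1 - s^2} \le s^2$ with $s = \varepsilon/2ab$, together with $r_{\max} - r = (r_{\max}^2 - r^2)/(r_{\max} + r)$.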


See also: Entanglement; Entanglement Measures; 
Quantum Entropy; Singularity and Bifurcation Theory. 
Introduction


If a compact Lie group $G$ acts on a manifold $M$, the
space $M/G$ of orbits of the action is usually a singular
space. Nonetheless, it is often possible to develop a
"differential geometry" of the orbit space in terms of
appropriately defined equivariant objects on $M$. This
article is mostly concerned with "differential forms
on $M/G$." A first idea would be to work with the
complex of "basic" forms on $M$, but for many
purposes this complex turns out to be too small.
A much more useful complex of equivariant differential
forms on $M$ was introduced by Cartan (1950).
In retrospect, Cartan's approach presented a differential
form model for the equivariant cohomology of
$M$, as defined by A. Borel (1960). Borel's construction
replaces the quotient $M/G$ by a better-behaved (but
usually infinite-dimensional) homotopy quotient $M_G$,
and Cartan's complex should be viewed as a model
for forms on $M_G$.

Among the features of equivariant cohomology are
the localization formulas for the integrals of equivariant
cocycles. The first instance of such an integration
formula was the "exact stationary phase formula,"
discovered by Duistermaat and Heckman. This
formula was quickly recognized by Berline and
Vergne (1983) and Atiyah and Bott (1984) as a
localization principle in equivariant cohomology.
Today, equivariant localization is a basic tool in
mathematical physics, with numerous applications.
Equivariant Cohomology and the Cartan Model


This article begins with Borel's topological definition
of equivariant cohomology, then proceeds to
describe H. Cartan's more algebraic approach, and
concludes with a discussion of localization principles.

As additional references for the material covered 
here, we particularly recommend books by Berline, 
Getzler, and Vergne (1992) and Guillemin and 
Sternberg (1999). 


Borel's Model of $H_G(M)$


Let $G$ be a topological group. A $G$-space is a
topological space $M$ on which $G$ acts by transformations
$g \mapsto a_g$, in such a way that the action map

\[ a: G \times M \to M \tag*{[1]} \]

is continuous. An important special case of $G$-spaces
are principal $G$-bundles $E \to B$, that is, $G$-spaces
locally isomorphic to products $U \times G$.


Definition 1 A classifying bundle for $G$ is a
principal $G$-bundle $EG \to BG$, with the following
universal property: for any principal $G$-bundle
$E \to B$, there is a map $f: B \to BG$, unique up to
homotopy, such that $E$ is isomorphic to the pullback
bundle $f^* EG$. The map $f$ is known as a "classifying
map" of the principal bundle.


To be precise, the base spaces of the principal 
bundles considered here must satisfy some technical 
condition. For a careful discussion, see Husemoller 
(1994). Classifying bundles exist for all G (by a 
construction due to Milnor (1956)), and are unique 
up to G-homotopy equivalence. 

It is a basic fact that principal G-bundles with 
contractible total space are classifying bundles. 


Examples 2

(i) The bundle $\mathbb{R} \to \mathbb{R}/\mathbb{Z} = S^1$ is a classifying
bundle for $G = \mathbb{Z}$.

(ii) Let $H$ be a separable complex Hilbert space,
$\dim H = \infty$. It is known that the unit sphere $S(H)$ is
contractible. It is thus a classifying $U(1)$-bundle,
with the projective space $P(H)$ as base. More
generally, the Stiefel manifold $\operatorname{St}(k, H)$ of unitary
$k$-frames is a classifying $U(k)$-bundle, with base
the Grassmann manifold $\operatorname{Gr}(k, H)$ of $k$-planes.

(iii) Any compact Lie group $G$ arises as a closed
subgroup of $U(k)$, for $k$ sufficiently large.
Hence, the Stiefel manifold $\operatorname{St}(k, H)$ also serves
as a model for $EG$.

(iv) The based loop group $G = L_0K$ of a connected Lie
group $K$ acts by gauge transformations on the
space of connections $\mathcal{A}(S^1) = \Omega^1(S^1, \mathfrak{k})$. This is a
classifying bundle for $L_0K$, with base $K$. The
quotient map takes a connection to its holonomy.


For any commutative ring $R$ (e.g., $\mathbb{Z}$, $\mathbb{R}$, $\mathbb{Z}_2$), let
$H(\cdot\,; R)$ denote the (singular) cohomology with
coefficients in $R$. Recall that $H(\cdot\,; R)$ is a graded
commutative ring under cup product.


Definition 3 The equivariant cohomology $H_G(M) = H_G(M; R)$
of a $G$-space $M$ is the cohomology ring of
its homotopy quotient $M_G = EG \times_G M$:

\[ H_G(M; R) = H(M_G; R) \tag*{[2]} \]


Equivariant cohomology is a contravariant functor
from the category of $G$-spaces to the category of
$R$-modules. The $G$-map $M \to \mathrm{pt}$ induces an algebra
homomorphism from $H_G(\mathrm{pt}) = H(BG)$ to $H_G(M)$. In
this way, $H_G(M)$ is a module over the ring $H(BG)$.


Example 4 (Principal $G$-bundles). Suppose $E \to B$ is
a principal $G$-bundle. The homotopy quotient $E_G$ may
be viewed as a bundle $E \times_G EG$ over $B$. Since the fiber
is contractible, there is a homotopy equivalence

\[ E_G \simeq B \tag*{[3]} \]

and therefore $H_G(E) = H(B)$.


Example 5 (Homogeneous spaces). If $K$ is a closed
subgroup of a Lie group $G$, the space $EG$ may be
viewed as a model for $EK$, with $BK = EG/K = EG \times_G (G/K)$.
Hence,

\[ H_G(G/K) = H(BK) \tag*{[4]} \]

Let us briefly describe two of the main techniques
for computing $H_G(M)$.


1. Leray spectral sequences. If $R$ is a field, the
equivariant cohomology may be computed as the
$E_\infty$ term of the spectral sequence for the fibration
$M_G \to BG$. If $BG$ is simply connected (as is the
case for all compact connected Lie groups), the
$E_2$-term of the spectral sequence reads

\[ E_2^{p,q} = H^p(BG) \otimes H^q(M) \tag*{[5]} \]


2. Mayer-Vietoris sequences. If $M = U_1 \cup U_2$ is a
union of two $G$-invariant open subsets, there is a
long exact sequence

\[ \cdots \to H^k_G(M) \to H^k_G(U_1) \oplus H^k_G(U_2) \to H^k_G(U_1 \cap U_2) \to H^{k+1}_G(M) \to \cdots \]

More generally, associated to any $G$-invariant open
cover, there is a spectral sequence converging to
$H_G(M)$.


Example 6 Consider the standard $U(1)$-action on
$S^2$ by rotations. Cover $S^2$ by two open sets $U_\pm$, given
as the complement of the south pole and north pole,
respectively. Since $U_+ \cap U_-$ retracts onto the equatorial
circle, on which $U(1)$ acts freely, its equivariant
cohomology vanishes except in degree 0. On the
other hand, $U_\pm$ retract onto the poles $p_\pm$. Hence, by
the Mayer-Vietoris sequence the map
$H^k_{U(1)}(S^2) \to H^k_{U(1)}(p_+) \oplus H^k_{U(1)}(p_-)$ given by pullback to the fixed
points is an isomorphism for $k > 0$. Since the
pullback map is a ring homomorphism, we conclude
that $H_{U(1)}(S^2; R)$ is the commutative ring generated
by two elements $x_\pm$ of degree 2, subject to a single
relation $x_+ x_- = 0$.
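
As a small consistency check on Example 6, one can compare the rank in each degree of the ring generated by $x_\pm$ with relation $x_+ x_- = 0$ against the ranks of $H(BU(1)) \otimes H(S^2)$ appearing in the $E_2$-term [5]. The Python sketch below does this bookkeeping; it uses the standard facts that $H(BU(1))$ is a polynomial ring on one generator of degree 2 and that $S^2$ has Betti numbers 1, 0, 1. The function names are hypothetical, and agreement of ranks says nothing about the ring structures.

```python
def ring_rank(degree):
    """Rank in the given degree of R[x+, x-]/(x+ x-), with x+ and x- of degree 2."""
    if degree % 2:
        return 0
    k = degree // 2
    # Surviving monomials are x+^i x-^j with i + j = k and i * j = 0.
    return sum(1 for i in range(k + 1) if i * (k - i) == 0)

def tensor_rank(degree):
    """Rank in the given degree of H(BU(1)) tensor H(S^2)."""
    def bu1(n):
        # Polynomial ring on a single class of degree 2.
        return 1 if n >= 0 and n % 2 == 0 else 0
    betti_s2 = {0: 1, 2: 1}
    return sum(bu1(degree - q) * b for q, b in betti_s2.items())
```

Both functions return 1 in degree 0, 2 in every positive even degree, and 0 in odd degrees.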


$\mathfrak{g}$-Differential Algebras


Let $G$ be a Lie group, with Lie algebra $\mathfrak{g}$. A
$G$-manifold is a manifold $M$ together with a
$G$-action such that the action map [1] is smooth.
We would like to introduce the concept of equivariant
differential forms on $M$. This complex should
play the role of differential forms on the infinite-dimensional
space $M_G$. In Cartan's approach, the
starting point is an algebraic model for the differential
forms on the classifying bundle $EG$.

The algebraic machinery will only depend on the
infinitesimal action of $G$. It is therefore convenient
to introduce the following concept.


Definition 7 Let $\mathfrak{g}$ be a finite-dimensional Lie
algebra. A $\mathfrak{g}$-manifold is a manifold $M$, together with
a Lie algebra homomorphism $a: \mathfrak{g} \to \mathfrak{X}(M)$, $\xi \mapsto a_\xi$
into the Lie algebra of vector fields on $M$, such that
the map $\mathfrak{g} \times M \to TM$, $(\xi, m) \mapsto a_\xi(m)$ is smooth.


Any $G$-manifold $M$ becomes a $\mathfrak{g}$-manifold by
taking $a_\xi$ to be the generating vector field

\[ a_\xi(m) := \left.\frac{d}{dt}\right|_{t=0} a(\exp(-t\xi), m) \tag*{[6]} \]

Conversely, if $G$ is simply connected, and $M$ is a
$\mathfrak{g}$-manifold for which all of the vector fields $a_\xi$ are
complete, the $\mathfrak{g}$-action integrates uniquely to an
action of the group $G$.

The de Rham algebra $(\Omega(M), d)$ of differential
forms on a $\mathfrak{g}$-manifold $M$ carries graded derivations
$L_\xi = L(a_\xi)$ (Lie derivatives, degree 0) and $\iota_\xi = \iota(a_\xi)$
(contractions, degree $-1$). One has the following
graded commutation relations:

\[ [d, d] = 0, \quad [L_\xi, d] = 0, \quad [\iota_\xi, d] = L_\xi \tag*{[7]} \]

\[ [\iota_\xi, \iota_\eta] = 0, \quad [L_\xi, L_\eta] = L_{[\xi,\eta]}, \quad [L_\xi, \iota_\eta] = \iota_{[\xi,\eta]} \tag*{[8]} \]

More generally, the following definitions are
introduced.


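Definition 8 A $\mathfrak{g}$-differential algebra ($\mathfrak{g}$-da) is a
commutative graded algebra $\mathcal{A} = \bigoplus_{n=0}^{\infty} \mathcal{A}^n$, equipped
with graded derivations $d$, $L_\xi$, $\iota_\xi$ of degrees $1, 0, -1$
(where $L_\xi$, $\iota_\xi$ depend linearly on $\xi \in \mathfrak{g}$), satisfying the
graded commutation relations [7] and [8].


Definition 9 For any $\mathfrak{g}$-da $\mathcal{A}$, one defines the
horizontal subalgebra $\mathcal{A}_{\mathrm{hor}} = \bigcap_\xi \ker(\iota_\xi)$, the invariant
subalgebra $\mathcal{A}^{\mathfrak{g}} = \bigcap_\xi \ker(L_\xi)$, and the basic
subalgebra $\mathcal{A}_{\mathrm{basic}} = \mathcal{A}_{\mathrm{hor}} \cap \mathcal{A}^{\mathfrak{g}}$.


Note that the basic subalgebra is a differential
subcomplex of $\mathcal{A}$.


Definition 10 A connection on a $\mathfrak{g}$-da is an
invariant element $\theta \in \mathcal{A}^1 \otimes \mathfrak{g}$, with the property
$\iota_\xi\theta = \xi$. The curvature of a connection is the element
$F^\theta \in \mathcal{A}^2 \otimes \mathfrak{g}$ given as $F^\theta = d\theta + (1/2)[\theta, \theta]$.


$\mathfrak{g}$-da's $\mathcal{A}$ admitting connections are the algebraic
counterparts of (smooth) principal bundles, with
$\mathcal{A}_{\mathrm{basic}}$ playing the role of the base of the principal
bundle.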


Weil Algebra 


The Weil algebra $W\mathfrak{g}$ is the algebraic analog to the
classifying bundle $EG$. Similar to $EG$, it may be
characterized by a universal property:


Theorem 11 There exists a $\mathfrak{g}$-da $W\mathfrak{g}$ with
connection $\theta_W$, having the following universal
property: if $\mathcal{A}$ is a $\mathfrak{g}$-da with connection $\theta$, there
is a unique algebra homomorphism $c: W\mathfrak{g} \to \mathcal{A}$
taking $\theta_W$ to $\theta$.


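Clearly, the universal property characterizes $W\mathfrak{g}$
up to a unique isomorphism. To get an explicit
construction, choose a basis $\{e_a\}$ of $\mathfrak{g}$, with dual
basis $\{e^a\}$ of $\mathfrak{g}^*$. Let $y^a \in \wedge^1\mathfrak{g}^*$ be the corresponding
generators of the exterior algebra, and $v^a \in S^1\mathfrak{g}^*$ the
generators of the symmetric algebra. Let

\[ W^n\mathfrak{g} = \bigoplus_{2i+j=n} S^i\mathfrak{g}^* \otimes \wedge^j\mathfrak{g}^* \tag*{[9]} \]

carry the differential

\[ dy^a = v^a - \frac{1}{2} f^a_{bc}\, y^b y^c \tag*{[10]} \]

\[ dv^a = -f^a_{bc}\, y^b v^c \tag*{[11]} \]

where $f^a_{bc} = \langle e^a, [e_b, e_c] \rangle$ are the structure constants
of $\mathfrak{g}$. Define the contractions $\iota_a = \iota_{e_a}$ by

\[ \iota_a y^b = \delta^b_a, \quad \iota_a v^b = 0 \tag*{[12]} \]

and let $L_a = [d, \iota_a]$. Then $L_a$ are the generators for
the adjoint action on $W\mathfrak{g}$. The element
$\theta_W = y^a \otimes e_a \in W^1\mathfrak{g} \otimes \mathfrak{g}$ is a connection on $W\mathfrak{g}$. Notice that
we could also use $y^a$ and $dy^a$ as generators of $W\mathfrak{g}$.
This identifies $W\mathfrak{g}$ with the Koszul algebra, and
implies: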


Theorem 12 $W\mathfrak{g}$ is acyclic, that is, the inclusion
$\mathbb{R} \to W\mathfrak{g}$ is a homotopy equivalence.


Acyclicity of $W\mathfrak{g}$ corresponds to the contractibility
of the total space of $EG$.
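
The Weil differential can be checked mechanically in a small case. The Python sketch below builds the Weil algebra of $\mathfrak{so}(3)$, taking the structure constants to be the Levi-Civita symbol, and verifies that the differential squares to zero on the generators. It assumes that eqns [10] and [11] read $dy^a = v^a - \tfrac{1}{2} f^a_{bc} y^b y^c$ and $dv^a = -f^a_{bc} y^b v^c$; the choice of Lie algebra, the data layout, and all function names are illustrative and not taken from the text.

```python
from fractions import Fraction
from itertools import product

N = 3  # dimension of g = so(3); f^a_bc is taken to be the Levi-Civita symbol

def f(a, b, c):
    return (a - b) * (b - c) * (c - a) // 2

# An element is a dict {(ys, vs): coeff}: ys is an increasing tuple of odd
# generators y^a, vs holds the exponents of the even generators v^a.
def add(A, B, s=1):
    C = dict(A)
    for m, c in B.items():
        C[m] = C.get(m, 0) + s * c
        if C[m] == 0:
            del C[m]
    return C

def mul(A, B):
    C = {}
    for ((y1, v1), c1), ((y2, v2), c2) in product(A.items(), B.items()):
        if set(y1) & set(y2):
            continue  # y^a y^a = 0
        seq = y1 + y2
        inv = sum(seq[i] > seq[j] for i in range(len(seq)) for j in range(i + 1, len(seq)))
        m = (tuple(sorted(seq)), tuple(p + q for p, q in zip(v1, v2)))
        C = add(C, {m: (-1) ** inv * c1 * c2})
    return C

def y(a): return {((a,), (0,) * N): Fraction(1)}
def v(a): return {((), tuple(int(i == a) for i in range(N))): Fraction(1)}
def scale(A, s): return {m: s * c for m, c in A.items() if s * c != 0}

def d_generator(kind, a):
    out = v(a) if kind == "y" else {}
    for b, c in product(range(N), repeat=2):
        if kind == "y":   # d y^a = v^a - (1/2) f^a_bc y^b y^c
            out = add(out, scale(mul(y(b), y(c)), Fraction(f(a, b, c), 2)), -1)
        else:             # d v^a = - f^a_bc y^b v^c
            out = add(out, scale(mul(y(b), v(c)), f(a, b, c)), -1)
    return out

def d(A):
    """Extend d from the generators by the graded Leibniz rule."""
    out = {}
    for (ys, vs), coeff in A.items():
        gens = [("y", a) for a in ys] + [("v", a) for a in range(N) for _ in range(vs[a])]
        for k, (kind, a) in enumerate(gens):
            # Sign from moving d past the odd generators standing to its left.
            term = {((), (0,) * N): coeff * (-1) ** min(k, len(ys))}
            for j, (kj, aj) in enumerate(gens):
                factor = d_generator(kind, a) if j == k else (y(aj) if kj == "y" else v(aj))
                term = mul(term, factor)
            out = add(out, term)
    return out
```

Under these assumptions, `d(d(y(a)))` and `d(d(v(a)))` are the empty dictionary (the zero element) for each index `a`, which reflects the Jacobi identity for the structure constants.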

The basic subalgebra of $W\mathfrak{g}$ is equal to $(S\mathfrak{g}^*)^{\mathfrak{g}}$, and
the differential restricts to zero on this subalgebra,
since $d$ changes parity. Hence, if $\mathcal{A}$ is a $\mathfrak{g}$-da with
connection, the characteristic homomorphism
$c: W\mathfrak{g} \to \mathcal{A}$ induces an algebra homomorphism
$(S\mathfrak{g}^*)^{\mathfrak{g}} \to H(\mathcal{A}_{\mathrm{basic}})$. This homomorphism is independent
of $\theta$:


Theorem 13 Suppose $\theta_0, \theta_1$ are two connections on
a $\mathfrak{g}$-da $\mathcal{A}$. Then their characteristic homomorphisms
$c_0, c_1: W\mathfrak{g} \to \mathcal{A}$ are $\mathfrak{g}$-homotopic. That is, there is a
chain homotopy intertwining contractions and Lie
derivatives.


Remark 14 One obtains other interesting examples
of $\mathfrak{g}$-da's if one drops the commutativity
assumption from the definition. For instance,
suppose $\mathfrak{g}$ carries an invariant scalar product. Let
$\operatorname{Cl}(\mathfrak{g})$ be the corresponding Clifford algebra, and
$U(\mathfrak{g})$ the enveloping algebra. The noncommutative
Weil algebra (introduced by Alekseev and
Meinrenken 2002)

\[ \mathcal{W}\mathfrak{g} = U\mathfrak{g} \otimes \operatorname{Cl}(\mathfrak{g}) \tag*{[13]} \]

is a (noncommutative) $\mathfrak{g}$-da, with the derivations $d$,
$L_a$, $\iota_a$ defined on generators by the same formulas as
for $W\mathfrak{g}$.


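Equivariant Cohomology of $\mathfrak{g}$-da's

In analogy to $H_G(M) := H(M_G)$, we now declare:


Definition 15 The equivariant cohomology algebra
of a $\mathfrak{g}$-da $\mathcal{A}$ is the cohomology of the differential
algebra $\mathcal{A}_{\mathfrak{g}} := (W\mathfrak{g} \otimes \mathcal{A})_{\mathrm{basic}}$:

\[ H_{\mathfrak{g}}(\mathcal{A}) := H(\mathcal{A}_{\mathfrak{g}}) \tag*{[14]} \]

The equivariant cohomology $H_{\mathfrak{g}}(\mathcal{A})$ has functorial
properties parallel to those of $H_G(M)$. In particular,
$H_{\mathfrak{g}}(\mathcal{A})$ is a module over

\[ H_{\mathfrak{g}}(\{0\}) = H((W\mathfrak{g})_{\mathrm{basic}}) = (S\mathfrak{g}^*)^{\mathfrak{g}} \tag*{[15]} \]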


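Theorem 16 Suppose $\mathcal{A}$ is a $\mathfrak{g}$-da with connection
$\theta$, and let $c: W\mathfrak{g} \to \mathcal{A}$ be the characteristic
homomorphism. Then

\[ W\mathfrak{g} \otimes \mathcal{A} \to \mathcal{A}, \quad w \otimes x \mapsto c(w)x \tag*{[16]} \]

is a $\mathfrak{g}$-homotopy equivalence, with $\mathfrak{g}$-homotopy
inverse the inclusion

\[ \mathcal{A} \to W\mathfrak{g} \otimes \mathcal{A}, \quad x \mapsto 1 \otimes x \tag*{[17]} \]

In particular, there is a canonical isomorphism

\[ H(\mathcal{A}_{\mathrm{basic}}) \cong H_{\mathfrak{g}}(\mathcal{A}) \tag*{[18]} \]

Proof By Theorem 13, the automorphism
$w \otimes x \mapsto 1 \otimes c(w)x$ of $W\mathfrak{g} \otimes \mathcal{A}$ is $\mathfrak{g}$-homotopic to the
identity map. □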


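The above definition of the complex $\mathcal{A}_{\mathfrak{g}}$ is often
referred to as the Weil model of equivariant
cohomology, while the term Cartan model is reserved
for a slightly different description of $\mathcal{A}_{\mathfrak{g}}$. Identify
the space $(S\mathfrak{g}^* \otimes \mathcal{A})^{\mathfrak{g}}$ with the algebra of equivariant
$\mathcal{A}$-valued polynomial functions $\alpha: \mathfrak{g} \to \mathcal{A}$. Define a
differential $d_{\mathfrak{g}}$ on this space by setting

\[ (d_{\mathfrak{g}}\alpha)(\xi) = d(\alpha(\xi)) - \iota_\xi\alpha(\xi) \tag*{[19]} \]

Theorem 17 (H. Cartan). The natural projection
$W\mathfrak{g} \otimes \mathcal{A} \to S\mathfrak{g}^* \otimes \mathcal{A}$ restricts to an isomorphism of
differential algebras, $\mathcal{A}_{\mathfrak{g}} \cong (S\mathfrak{g}^* \otimes \mathcal{A})^{\mathfrak{g}}$.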


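Suppose $\mathcal{A}$ carries a connection $\theta$. The $\mathfrak{g}$-homotopy
equivalence [16] induces a homotopy equivalence
$\mathcal{A}_{\mathfrak{g}} \to \mathcal{A}_{\mathrm{basic}}$ of the basic subcomplexes. By explicit
calculation, the corresponding map for the Cartan
model is given by

\[ (S\mathfrak{g}^* \otimes \mathcal{A})^{\mathfrak{g}} \to \mathcal{A}_{\mathrm{basic}}, \quad \alpha \mapsto P_{\mathrm{hor}}(\alpha(F^\theta)) \tag*{[20]} \]

Here $\alpha(F^\theta) \in \mathcal{A}^{\mathfrak{g}}$ is the result of substituting the
curvature of $\theta$, and $P_{\mathrm{hor}}: \mathcal{A} \to \mathcal{A}_{\mathrm{hor}}$ is horizontal
projection. On elements of $(S\mathfrak{g}^*)^{\mathfrak{g}} \subset (S\mathfrak{g}^* \otimes \mathcal{A})^{\mathfrak{g}}$,
the map [20] specializes to the Chern-Weil
homomorphism.

There is an algebraic counterpart of the Leray
spectral sequence: introduce a filtration

\[ F^p \mathcal{A}_{\mathfrak{g}}^{p+q} := \bigoplus_{2i \ge p} (S^i\mathfrak{g}^* \otimes \mathcal{A}^{p+q-2i})^{\mathfrak{g}} \tag*{[21]} \]

Since the second term in the equivariant differential
[19] raises the filtration degree by 2, it follows that

\[ E_2^{p,q} = (S^{p/2}\mathfrak{g}^*)^{\mathfrak{g}} \otimes H^q(\mathcal{A}) \tag*{[22]} \]

for $p$ even, and $E_2^{p,q} = 0$ for $p$ odd. In fortunate cases, the
spectral sequence collapses at the $E_2$-stage (see
below).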


Equivariant de Rham Theory 


We will now restrict ourselves to the case that
$\mathcal{A} = \Omega(M)$ is the algebra of differential forms on a
$G$-manifold, where $G$ is compact and connected.


Theorem 18 (Equivariant de Rham theorem). Suppose
$G$ is a compact, connected Lie group, and
that $M$ is a $G$-manifold. Then there is a canonical
isomorphism

\[ H_G(M; \mathbb{R}) \cong H_{\mathfrak{g}}(\Omega(M)) \tag*{[23]} \]

where the left-hand side is the equivariant cohomology
as defined by the Borel construction.


Motivated by this result, the notation can be
changed slightly; write

\[ \Omega_G(M) = (S\mathfrak{g}^* \otimes \Omega(M))^G \tag*{[24]} \]

for the Cartan complex of equivariant differential
forms, and $d_G$ for the equivariant differential [19].


Remark 19 Theorem 18 fails, in general, for 
noncompact Lie groups G. A differential form 
model for the noncompact case was developed by 
Getzler (1990). 


Example 20 Let $(M, \omega)$ be a symplectic manifold,
and $a: G \to \operatorname{Diff}(M)$ a Hamiltonian group action.
That is, $a$ preserves the symplectic form, $a_g^*\omega = \omega$,
and there exists an equivariant moment map
$\Phi: M \to \mathfrak{g}^*$ such that $\iota_\xi\omega + d\langle\Phi, \xi\rangle = 0$. Then the
equivariant symplectic form $\omega_G(\xi) := \omega - \langle\Phi, \xi\rangle$ is
equivariantly closed.


Example 21 Let $G$ be a Lie group, and denote,
respectively, by

\[ \theta^L = g^{-1}\,dg \quad \text{and} \quad \theta^R = dg\,g^{-1} \tag*{[25]} \]

the left- and right-invariant Maurer-Cartan forms.
Suppose $\mathfrak{g} = \operatorname{Lie}(G)$ carries an invariant scalar
product "$\cdot$", and consider the closed 3-form

\[ \phi = \frac{1}{12}\,\theta^L \cdot [\theta^L, \theta^L] \tag*{[26]} \]

Then

\[ \phi_G(\xi) = \phi + \frac{1}{2}(\theta^L + \theta^R) \cdot \xi \tag*{[27]} \]

is a closed equivariant extension for the conjugation
action of $G$. More generally, transgression gives
explicit differential forms $\phi_j$ generating the cohomology
ring $H(G) = (\wedge\mathfrak{g}^*)^G$. Closed equivariant
extensions of these forms were obtained by Jeffrey
(1995), using a construction of Bott-Shulman.


A $G$-manifold is called equivariantly formal if

\[ H_G(M) \cong (S\mathfrak{g}^*)^G \otimes H(M) \tag*{[28]} \]

as an $(S\mathfrak{g}^*)^G$-module. Equivalently, this is the
condition that the spectral sequence [22] for
$H_G(M)$ collapses at the $E_2$-term. $M$ is equivariantly
formal under any of the following conditions: (1)
$H^q(M) = 0$ for $q$ odd, (2) the map $H_G(M) \to H(M)$ is
onto, (3) $M$ admits a $G$-invariant Morse function
with only even indices, and (4) $M$ is a symplectic
manifold and the $G$-action is Hamiltonian. (The last
fact is a theorem due to Ginzburg and Kirwan.)


Example 22 The conjugation action of a compact
Lie group is equivariantly formal, by criterion (2). In
this case, eqn [28] is an isomorphism of algebras.


It is important to note that eqn [28] is not an
algebra isomorphism, in general. Already the rotation
action of $G = U(1)$ on $M = S^2$, discussed in
Example 6, provides a counter-example.


Theorem 23 (Injectivity). Suppose $T$ is a compact
torus, and $M$ is $T$-equivariantly formal. Then the
pullback map $H_T(M) \to H_T(M^T)$ to the fixed point
set is injective.


Since the pullback map to the fixed point set is an
algebra homomorphism, one can sometimes use this
result to determine the algebra structure on $H_T(M)$:
let $\alpha_r \in H(M)$ be generators of the ordinary cohomology
algebra, and let $(\alpha_r)_T$ be equivariant extensions.
Denote by $x_r \in H_T(M^T)$ the pullbacks of $(\alpha_r)_T$
to the fixed point set, and let $y_i$ be a basis of $\mathfrak{t}^*$,
viewed as elements of $S\mathfrak{t}^* \subset H_T(M^T)$. Then $H_T(M)$
is isomorphic to the subalgebra of $H_T(M^T)$ generated
by the $x_r$ and $y_i$.

The case of nonabelian compact groups $G$ may be
reduced to the maximal torus $T$ using the following result.
Observe that for any $G$-manifold $M$, there is a natural
action of the Weyl group $W = N(T)/T$ on $H_T(M)$.


Theorem 24 The natural restriction map

\[ H_G(M; \mathbb{R}) \to H_T(M; \mathbb{R})^W \tag*{[29]} \]

onto the Weyl group invariants is an algebra
isomorphism.


Remark 25 The Cartan complex [24] may be viewed
as a small model for the differential forms on the
infinite-dimensional space $M_G$. In the noncommutative
case, there exists an even "smaller" Cartan model,
with underlying complex $(S\mathfrak{g}^*)^G \otimes \Omega(M)^G$, involving
only invariant differential forms on $M$ (see Alekseev
and Meinrenken (2005) and Goresky, Kottwitz, and
MacPherson (1998)).


Equivariant Characteristic Forms 


Let $G$ be a compact Lie group, and $E \to B$ a
principal $G$-bundle with connection $\theta \in \Omega^1(E) \otimes \mathfrak{g}$.
Suppose the principal $G$-action commutes with the
action of a compact Lie group $K$ on $E$, and that $\theta$ is
$K$-invariant. The $K$-equivariant curvature of $\theta$ is
defined as follows:

\[ F_K^\theta = d_K\theta + \frac{1}{2}[\theta, \theta] \in \Omega_K^2(E) \otimes \mathfrak{g} \]

By the equivariant version of eqn [20], there is a
canonical chain map

\[ \Omega_{K \times G}(E) \to \Omega_K(B) \tag*{[30]} \]

defined by substituting the $K$-equivariant curvature
for the $\mathfrak{g}$-variable, followed by horizontal projection
with respect to $\theta$. The Cartan map [30] is homotopy
inverse to the pullback map from $\Omega_K(B)$ to $\Omega_{K \times G}(E)$.


Example 26 The complex $\Omega_{K \times G}(E)$ contains a
subcomplex $(S\mathfrak{g}^*)^G$. The restriction of eqn [30] is
the equivariant Chern-Weil map

\[ (S\mathfrak{g}^*)^G \to \Omega_K(B) \tag*{[31]} \]

Forms in the image of eqn [31] are equivariantly
closed; they are called the $K$-equivariant characteristic
forms of $E$.


Example 27 Similarly, if $V \to B$ is a $K$-equivariant
vector bundle with structure group $G \subset GL(k)$, one
defines the $K$-equivariant characteristic forms of $V$
to be those of the corresponding bundle of $G$-frames
in $V$.


For instance, suppose $V$ is an oriented $K$-equivariant
vector bundle of even rank $k$, with an invariant
metric and compatible connection. The Pfaffian
defines an invariant polynomial on $\mathfrak{so}(k)$:

\[ \zeta \mapsto \det{}^{1/2}(\zeta/2\pi) \tag*{[32]} \]

(equal to 0 if $k$ is odd). The $K$-equivariant
characteristic form of degree $k$ on $B$ determined by
eqn [32] is known as the equivariant Euler form

\[ \operatorname{Eul}_K(V) \in \Omega_K^k(B) \tag*{[33]} \]

Similarly, one defines equivariant Pontrjagin forms
of $V$, and (for Hermitian vector bundles) equivariant
Chern forms.


Example 28 Suppose $G$ is a maximal rank subgroup
of the compact Lie group $K$. The bundle $K \to K/G$
admits a unique $K$-invariant connection.
Hence, one obtains a canonical chain map
$(S\mathfrak{g}^*)^G \to \Omega_K(K/G)$, realizing the isomorphism
$H_K(K/G) \cong (S\mathfrak{g}^*)^G$. In particular, any $G$-invariant element of $\mathfrak{g}^*$
defines a closed $K$-equivariant 2-form on $K/G$. For
instance, symplectic forms on coadjoint orbits are
obtained in this way.


Suppose $M$ is a $G$-manifold, and let $Q = E \times_G M$
be the associated bundle. For any $K$-invariant
connection on $E$, one obtains a chain map

\[ \Omega_G(M) \to \Omega_{K \times G}(E \times M) \to \Omega_K(Q) \tag*{[34]} \]

by composing the pullback to $E \times M$ with the
Cartan map for the principal bundle $E \times M \to Q$.


Example 29 Suppose $(M, \omega)$ is a Hamiltonian
$G$-manifold, with moment map $\Phi: M \to \mathfrak{g}^*$. The
image of $\omega_G = \omega - \Phi$ under the map [34] defines a
closed $K$-equivariant 2-form on $Q$. This construction
is of importance in symplectic geometry, where it
arises in the context of Sternberg's minimal
coupling.


Equivariant Thom Forms 


Let $\pi: V \to B$ be a $G$-equivariant oriented real vector
bundle of rank $k$ over a compact base $B$. There is a
canonical chain map, called fiber integration

\[ \pi_*: \Omega^{\bullet}(V)_{\mathrm{cp}} \to \Omega^{\bullet - k}(B) \tag*{[35]} \]

where the subscript indicates "compact support." It
is characterized by the following properties:


(1) for a form of degree $k$, the value of its fiber
integral at $x \in B$ is equal to the integral over the
fiber $V_x$, and

(2)
\[ \pi_*(\alpha \wedge \pi^*\beta) = \pi_*\alpha \wedge \beta \tag*{[36]} \]
for all $\alpha \in \Omega(V)_{\mathrm{cp}}$ and $\beta \in \Omega(B)$.

Fiber integration
extends to $G$-equivariant differential forms, and
commutes with the equivariant differential.

Theorem 30 (Equivariant Thom isomorphism). Fiber
integration defines an isomorphism,

\[ H_G^{\bullet + k}(V)_{\mathrm{cp}} \cong H_G^{\bullet}(B) \tag*{[37]} \]

An equivariant Thom form for a $G$-vector bundle
is a cocycle $\operatorname{Th}_G(V) \in \Omega_G^k(V)_{\mathrm{cp}}$, with the property,

\[ \pi_* \operatorname{Th}_G(V) = 1 \tag*{[38]} \]

Given $\operatorname{Th}_G(V)$, the inverse to eqn [37] is realized on
the level of differential forms as

\[ \Omega_G^{\bullet}(B) \to \Omega_G^{\bullet + k}(V)_{\mathrm{cp}}, \quad \alpha \mapsto \operatorname{Th}_G(V) \wedge \pi^*\alpha \tag*{[39]} \]

A beautiful "universal" construction of Thom
forms was obtained by Mathai and Quillen (1986).
Using eqn [34], it suffices to describe an $SO(k)$-equivariant
Thom form for the trivial bundle $\mathbb{R}^k \to \{0\}$.
Using multi-index notation for ordered subsets
$I \subset \{1, \ldots, k\}$,

\[ \operatorname{Th}_{SO(k)}(\mathbb{R}^k)(\zeta) = \frac{e^{-\|x\|^2}}{\pi^{k/2}} \sum_I \epsilon_I \det{}^{1/2}\!\left(\frac{\zeta_I}{2}\right) (dx)^{I^c} \tag*{[40]} \]

Here the sum is over all subsets $I$ with $|I|$ even, and
$I^c$ is the complement of $I$. The matrix $\zeta_I$ is obtained
from $\zeta$ by deleting all rows and columns that are not
in $I$, and $\det^{1/2}$ is defined as a Pfaffian. Finally, $\epsilon_I$ is
the sign of the shuffle permutation defined by $I$, that
is, $(dx)^I (dx)^{I^c} = \epsilon_I\, dx_1 \cdots dx_k$. As shown by Mathai
and Quillen, the form [40] is equivariantly closed,
and clearly eqn [38] holds since the top degree part
is just a Gaussian. If $k$ is even, the Mathai-Quillen
formula can also be written, on the open dense subset
where $\zeta \in \mathfrak{so}(k)$ is invertible, as


Thsow(R9(O = der! (S) e-isf- tne qr 


The form $\operatorname{Th}_{SO(k)}(\mathbb{R}^k)$ given by these formulas does
not have compact support, but is rapidly decreasing
at infinity. One obtains a compactly supported
Thom form by applying an $SO(k)$-equivariant
diffeomorphism from $\mathbb{R}^k$ onto some open ball of
finite radius.

Note that the pullback of eqn [40] to the origin is
equal to $\det^{1/2}(\zeta/2\pi)$ (equal to 0 if $k$ is odd). This
implies:


Theorem 31 Let $\iota: B \to V$ denote the inclusion of
the zero section. Then

\[ \iota^* \operatorname{Th}_G(V) = \operatorname{Eul}_G(V) \tag*{[42]} \]

where $\operatorname{Eul}_G(V) \in \Omega_G^k(B)$ is the equivariant Euler
form.
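
Since the Euler form and the Mathai-Quillen formula are built from the Pfaffian $\det^{1/2}$, it may help to see it computed. The Python sketch below evaluates the Pfaffian of a small antisymmetric matrix by expansion along the first row and allows a check of two standard facts: its square is the determinant, and it is homogeneous of degree $k/2$ in the matrix entries. The function names and the sample matrix are illustrative choices, not part of the article.

```python
def pfaffian(m):
    """Pfaffian of an antisymmetric matrix (list of lists), expanded along the first row."""
    n = len(m)
    if n == 0:
        return 1
    if n % 2:
        return 0
    total = 0
    for j in range(1, n):
        keep = [i for i in range(n) if i not in (0, j)]
        minor = [[m[r][c] for c in keep] for r in keep]
        total += (-1) ** (j + 1) * m[0][j] * pfaffian(minor)
    return total

def determinant(m):
    """Determinant by Laplace expansion (adequate for the small matrices used here)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * determinant([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def antisymmetric(upper):
    """Antisymmetric matrix from its strictly upper-triangular entries {(i, j): value}."""
    n = 1 + max(j for _, j in upper)
    m = [[0] * n for _ in range(n)]
    for (i, j), value in upper.items():
        m[i][j], m[j][i] = value, -value
    return m
```

For a $4 \times 4$ matrix with upper entries $z_{12}, z_{13}, z_{14}, z_{23}, z_{24}, z_{34}$ this returns $z_{12}z_{34} - z_{13}z_{24} + z_{14}z_{23}$.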


Suppose $M$ is a $G$-manifold, and $S$ a closed
$G$-invariant submanifold with oriented normal
bundle $\nu_S$. Choose a $G$-equivariant tubular neighborhood
embedding

\[ \nu_S \to U \subset M \tag*{[43]} \]

and let $\operatorname{PD}_G(S) \in \Omega_G(M)_{\mathrm{cp}}$ be the image of $\operatorname{Th}_G(\nu_S)$
under this embedding. The form $\operatorname{PD}_G(S)$ has the
property

\[ \int_M \operatorname{PD}_G(S) \wedge \alpha = \int_S \iota_S^*\alpha \tag*{[44]} \]

for all closed equivariant forms $\alpha \in \Omega_G(M)$. It is called
an "equivariant Poincaré dual" of $S$. By construction,
the pullback to $S$ is the equivariant Euler form:

\[ \iota_S^* \operatorname{PD}_G(S) = \operatorname{Eul}_G(\nu_S) \tag*{[45]} \]

Equivariant Poincaré duality takes transversal intersections
of $G$-manifolds to wedge products, similar
to the nonequivariant case.


Remark 32 In general, the $(S\mathfrak{g}^*)^G$-submodule generated
by Poincaré duals of $G$-invariant submanifolds
is strictly smaller than $H_G(M)$. In this sense,
the terminology "duality" is misleading.


Localization Theorem 


In this section, $T$ will denote a torus. Suppose $M$ is a
compact oriented $T$-manifold. For any component $F$
of the fixed point set of $T$, the action of $T$ on $\nu_F$
fixes only the zero section $F$. This implies that the
normal bundle $\nu_F$ has even rank and is orientable.
Fix an orientation, and give $F$ the induced
orientation.

Since $T$ is compact, the list of stabilizer groups of
points in $M$ is finite. Call $\xi \in \mathfrak{t}$ generic if it is not in the
Lie algebra of any of these stabilizers, other than $T$
itself. In this case, the value $\operatorname{Eul}_T(\nu_F, \xi)$ of the equivariant
Euler form is invertible as an element of $\Omega(F)$.


Theorem 33 (Integration formula). Suppose $M$ is
a compact oriented $T$-manifold, where $T$ is a torus.
Let $\alpha \in \Omega_T(M)$ be a closed equivariant form, and let
$\xi \in \mathfrak{t}$ be generic. Then

\[ \int_M \alpha(\xi) = \sum_F \int_F \frac{\iota_F^*\alpha(\xi)}{\operatorname{Eul}_T(\nu_F, \xi)} \tag*{[46]} \]

where the sum is over the connected components of
the fixed point set.


Rather than fixing $\xi$, one can also view eqn [46]
as an equality of rational functions of $\xi \in \mathfrak{t}$.
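
A classical instance of such an integration formula is the rotation action on the unit sphere $S^2$, where the integral of $e^{\xi z}$ against the area form ($z$ being the height function) equals a sum of two contributions from the poles. The Python sketch below compares a direct numerical integration with that two-term sum. The signs and the factors of $2\pi$ attached to each pole depend on orientation and normalization conventions; the choices made here are assumptions picked so that the identity takes its classical form, and the function names are hypothetical.

```python
import math

def sphere_integral(xi, steps=20000):
    """Midpoint rule for the integral of exp(xi * z) over the unit sphere, z = cos(theta)."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * h
        total += math.exp(xi * math.cos(theta)) * math.sin(theta) * h
    return 2 * math.pi * total  # the integrand does not depend on the longitude

def fixed_point_sum(xi):
    """Pole contributions: exp(xi * z(p)) divided by +xi/(2 pi) or -xi/(2 pi)."""
    north = math.exp(xi) / (xi / (2 * math.pi))
    south = math.exp(-xi) / (-xi / (2 * math.pi))
    return north + south
```

Both sides equal $2\pi(e^{\xi} - e^{-\xi})/\xi$, and each pole term is a rational function of $\xi$ times an exponential, in line with the remark above.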


Remark 34 The integration formula was obtained 
by Berline and Vergne (1983), based on ideas of Bott 
(1967). The topological counterpart, as a “localiza- 
tion principle," was proved independently by Atiyah 


and Bott (1984). More abstract versions of the 
localization theorem in equivariant cohomology had 
been proved earlier by Borel, Chiang-Skjelbred and 
others. 


Remark 35 If $\alpha = \operatorname{PD}_T(F) \wedge \beta$, where $\beta$ is equivariantly
closed, the integration formula is immediate
from the property [44] of Poincaré duals. The
essence of the proof is to reduce to this case.


Remark 36 The localization contributions are
particularly nice if $F = \{p\}$ is isolated (which can
only happen if $\dim M$ is even). In this case, $\iota_F^*\alpha(\xi)$ is
simply the value of the function $\alpha_{[0]}(\xi)$ at $p$. For the
Euler form, one has


Eul(v;,£) = (-1)"" V^ TT(uj(p),£) — [47] 


where $\mu_j(p) \in \mathfrak{t}^*$ are the (real) weights of the action
on the tangent space $T_pM$. (Here we have chosen an
isomorphism $T_pM \cong \mathbb{C}^n$ compatible with the orientation.)
Hence, if all fixed points are isolated,


人 


Example 37 Let $M$ be a compact oriented manifold,
and $\chi(M) = \int_M \operatorname{Eul}(TM)$ its Euler characteristic.
Suppose a torus $T$ acts on $M$. Then

\[ \chi(M) = \sum_F \chi(F) \tag*{[49]} \]

where the sum is over the connected components of the fixed point set of $T$.
This follows from the integral of the equivariant
Euler form $\alpha(\xi) = \operatorname{Eul}_T(TM, \xi)$, by letting $\xi \to 0$ in
the localization formula. In particular, if $M$ admits
a circle action with isolated fixed points, the
number of fixed points is equal to the Euler
characteristic.


In a similar fashion, the localization formula gives 
interesting expressions for other characteristic num- 
bers of manifolds and vector bundles, in the 
presence of a circle action. Some of these formulas 
were discovered prior to the localization formula, 
see in particular Bott (1967). 


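Example 38 In this example, we show that for a
simply connected, simple Lie group G the 3-form
$\phi \in \Omega^3(G)$ defined in eqn [26] is integral, provided
the inner product "$\cdot$" is taken to be the basic inner product (for which
the length squared of the short coroots equals 2).
Since any such G is known to contain an SU(2)
subgroup, it suffices to prove this for G = SU(2).
Consider the conjugation action of the maximal
torus $T \cong U(1)$, consisting of diagonal matrices. The
fixed point set for this action is T itself. The normal
bundle $\nu_T$ is trivial, with T acting on the fiber $\mathfrak{g}/\mathfrak{t}$ by
the negative root $-\alpha$. Hence, $\mathrm{Eul}(\nu_T,\xi) = \langle\alpha,\xi\rangle$.
Let $\check\alpha \in \mathfrak{t}$ be the coroot, defined by $\langle\alpha,\check\alpha\rangle = 2$.
For the basic inner product, $\check\alpha\cdot\check\alpha = 2$, so that
$\check\alpha\cdot\xi = \langle\alpha,\xi\rangle$. Let us integrate the T-equivariant
extension $\phi_T(\xi)$ (cf. [27]). Its pullback to
T is $\theta^T \cdot \xi$, where $\theta^T \in \Omega^1(T,\mathfrak{t})$ is the Maurer-Cartan
form. The integral of $\theta^T$ is a generator of the integral
lattice, that is, it equals $\check\alpha$. Thus,

$$\int_G \phi = \int_T \frac{\theta^T\cdot\xi}{\langle\alpha,\xi\rangle} = \frac{\check\alpha\cdot\xi}{\langle\alpha,\xi\rangle} = 1 \qquad [50]$$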

Duistermaat-Heckman Formulas 


In this section, we discuss the Duistermaat-Heckman
formula, for the case of isolated fixed points. Let T
be a torus, and $(M,\omega)$ a compact Hamiltonian
T-space, with moment map $\Phi: M \to \mathfrak{t}^*$. Denote by
$\omega_T = \omega + \Phi$ the equivariant extension of $\omega$. Assuming
isolated fixed points, the localization formula
gives, for all integers $k \geq 0$,

$$\int_M (\omega + \langle\Phi,\xi\rangle)^k = (-1)^n \sum_p \frac{\langle\Phi(p),\xi\rangle^k}{\prod_j \langle\mu_j(p),\xi\rangle} \qquad [51]$$

where $n = (1/2)\dim M$. Note that both sides are
homogeneous of degree $k - n$ in $\xi$, but the terms on
the right-hand side are only rational functions while
the left-hand side is a polynomial. For $k = n$, both
sides are independent of $\xi$, and compute the integral
$\int_M \omega^n$. For $k < n$, the integral [51] is zero, and the
cancellation of the terms on the right-hand side gives
identities among the weights $\mu_j(p)$. Equation [51]
also implies

$$\int_M e^{\omega + \langle\Phi,\xi\rangle} = (-1)^n \sum_p \frac{e^{\langle\Phi(p),\xi\rangle}}{\prod_j \langle\mu_j(p),\xi\rangle} \qquad [52]$$


Assume, in particular, that T = U(1), and let $\xi = t\xi_0$,
where $\xi_0$ is the generator of the integral lattice in $\mathfrak{t}$.
Identify $\mathfrak{t} \cong \mathbb{R}$ in such a way that $\xi_0$ corresponds to
$1 \in \mathbb{R}$. Then $H = \langle\Phi,\xi_0\rangle$ is a Hamiltonian function
with periodic flow. Write $a_j(p) = \langle\mu_j(p),\xi_0\rangle \in \mathbb{Z}$.
Then eqn [52] reads

$$\int_M e^{tH}\,\frac{\omega^n}{n!} = (-1)^n \sum_p \frac{e^{tH(p)}}{t^n \prod_j a_j(p)} \qquad [53]$$

The right-hand side of eqn [53] is the leading term
for the stationary phase approximation of the
integral on the left. For this reason, eqn [52] is
known as the Duistermaat-Heckman exact stationary
phase theorem.
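As a sanity check of the exact stationary phase statement, the following minimal sketch compares a numerical integral with a two-term fixed-point sum on the unit sphere. It assumes the standard example (not taken from the text) of the height function H = z on $S^2$ with area form $d\varphi \wedge dz$, whose flow is rotation about the z-axis with the two poles as fixed points. The normalization differs from eqn [53] by a factor $2\pi$ and by the sign convention for the weights, because the rotation angle here has period $2\pi$ rather than 1. The function names are illustrative only.

```python
import math


def sphere_integral(t, n_z=2000):
    """Midpoint-rule value of the integral of exp(t*z) over the unit sphere.

    In cylindrical coordinates (z, phi) the area form is dphi ^ dz, so the
    phi-integration contributes a factor 2*pi and only z in [-1, 1] remains.
    """
    h = 2.0 / n_z
    total = 0.0
    for i in range(n_z):
        z = -1.0 + (i + 0.5) * h
        total += math.exp(t * z) * h
    return 2.0 * math.pi * total


def fixed_point_sum(t):
    """Sum of the contributions of the two fixed points z = +1 and z = -1."""
    north = math.exp(t) / t      # pole with H = +1
    south = -math.exp(-t) / t    # pole with H = -1, opposite weight
    return 2.0 * math.pi * (north + south)


if __name__ == "__main__":
    for t in (0.5, 1.0, 3.0):
        print(t, sphere_integral(t), fixed_point_sum(t))
```

The two columns agree up to quadrature error for every nonzero t, which is the content of the exactness claim: the stationary phase expansion has no correction terms in this setting.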

Formula [52] has the following consequence for
the push-forward of the Liouville measure under the
moment map, the so-called Duistermaat-Heckman
measure $H_*(\omega^n/n!)$. Let $\Theta$ be the Heaviside measure
(i.e., the characteristic measure of the positive real axis).


Theorem 39 (Duistermaat-Heckman). The push-forward
$H_*(\omega^n/n!)$ is a piecewise polynomial measure
of degree $n - 1$, with singularities at the set of all H(p)
for fixed points p of the action. One has the formula

$$H_*\!\left(\frac{\omega^n}{n!}\right) = \sum_p \frac{(\lambda - H(p))^{n-1}}{(n-1)!\,\prod_j a_j(p)}\,\Theta(\lambda - H(p)) \qquad [54]$$


Proof It is enough to show that the Laplace
transforms of the two sides are equal. Multiplying
by $e^{t\lambda}$ and integrating over $\lambda$ (take $t < 0$ to ensure
convergence of the integral), the resulting identity is
just eqn [53]. □


Remark 40 The theorem generalizes to Hamiltonian 
actions of higher-rank tori, and also to nonisolated 
fixed points. See the paper by Guillemin, Lerman, and 
Sternberg (1988) for a detailed discussion of this 
formula and of its “quantum analog.” 


Equivariant Index Theory 


By definition, the Cartan model consists of equivariant
forms $\alpha(\xi)$ with polynomial dependence on the
equivariant parameter $\xi$. However, the integration
formula holds in much greater generality. For
instance, one may consider generalized Cartan
complexes (Kumar and Vergne 1993). Here the
parameter $\xi$ varies in some invariant open subset of
$\mathfrak{g}$, and the polynomial dependence is replaced by
smooth dependence. The use of these more general
complexes in equivariant index theory was pioneered
by Berline and Vergne (1992).

Assume that M is an even-dimensional, compact
oriented Riemannian manifold, equipped with a
Spin-c structure. According to the Atiyah-Singer
theorem, the index of the corresponding Dirac
operator D is given by the formula

$$\mathrm{ind}(D) = \int_M \hat{A}(M)\, e^{c/2} \qquad [55]$$

Here c is the curvature 2-form of the complex line
bundle associated to the Spin-c structure, and $\hat{A}(M)$
is the $\hat{A}$-form. Recall that $\hat{A}(M)$ is obtained by
substituting the curvature form in the formal power
series expansion of the function $\hat{A}(x) = \det{}^{1/2}\big((x/2)/\sinh(x/2)\big)$
on $\mathfrak{so}(n)$.

Suppose now that a compact, connected Lie group
G acts on M by isometries, and that the action lifts to
the Spin-c bundle. Replacing curvatures with equivariant
curvatures, one defines the equivariant form
$\hat{A}(M)(\xi)$ and the form $c(\xi)$. Note that $\hat{A}(M)(\xi)$ is only
defined for $\xi$ in a sufficiently small neighborhood of
0, since the function $\hat{A}(x)$ is not analytic for all x.
The G-index of the equivariant Spin-c Dirac operator
is a virtual character $g \mapsto \mathrm{ind}(D)(g)$ of the group G. For
$g = \exp\xi$ sufficiently small, it is given by the formula

$$\mathrm{ind}(D)(\exp\xi) = \int_M \hat{A}(M)(\xi)\, e^{c(\xi)/2} \qquad [56]$$


For $\xi$ sufficiently small, the fixed point set of g
coincides with the set of zeroes of the vector field $\xi_M$.
The localization formula reproduces the Atiyah-Segal
formula for ind(D)(g), as an integral over $M^g$.

Berline and Vergne (1996) gave similar formulas 
for the equivariant index of any G-equivariant 
elliptic operator, and more generally for operators 
that are transversally elliptic in the sense of Atiyah. 


See also: Cohomology Theories; Compact Groups and
Their Representations; Hamiltonian Group Actions;
K-theory; Lie Groups: General Theory; Mathai-Quillen
Formalism; Path-Integrals in Noncommutative Geometry;
Stationary Phase Approximation.

The main theme of ergodic theory is to know
whether averages of quantities generated in a
stationary manner converge. In the classical situation
the stationarity is described by a measure-preserving
transformation T, and one considers averages taken
along a sequence $f, f\circ T, f\circ T^2, \ldots$ for integrable f. This
corresponds to the probabilistic concept of stationarity.
Hence, traditionally, ergodic theory is the
qualitative study of iterates of an individual transformation,
or of a one-parameter flow of transformations
(such as that obtained from the solution of an
autonomous ordinary differential equation). We
should note that an important purpose behind this
theory is to verify significant facts from a statistical
point of view (e.g., the law of large numbers,
convergence to limit distributions). The oldest branch
of this theory is the study of ergodic theorems. It was
started by Birkhoff (1931) and von Neumann
(1932), having its origins in statistical mechanics.

More specifically, the central notion is that of
ergodicity, which is intended to capture the idea
that a flow is "random" or "chaotic." In dealing with
the motion of molecules, Boltzmann and Gibbs made
such hypotheses from the beginning. One of the
earliest precise definitions of randomness of a
dynamical system was "minimality": the orbit of
almost every point is dense. In order to describe such
phenomena in a measure-theoretical setting, von Neumann
and Birkhoff required the stronger assumption
of ergodicity as follows. Let $(X, \mathcal{B}, \mu)$ be a measure
space and $F_t$ a measurable flow on X. We call $F_t$
ergodic if the only invariant measurable sets are $\emptyset$ or
all of X. Here, the invariance of the set A means that
$F_t(A) = A$ for all $t \in \mathbb{R}$ and we agree to write $A = B$ if
A and B differ by a null set with respect to $\mu$. Note
that ergodicity implies minimality if we are on a
second countable Borel space. A function $f: X \to \mathbb{R}$
will be called a "constant of the motion" iff $f \circ F_t = f$
a.e. for each $t \in \mathbb{R}$. Then we see that a flow $F_t$ on X is
ergodic iff the only constants of the motion are
constant a.e. In case of a measurable transformation
T on X, the invariance of the set A means that
$T^{-1}A = A$, and the measurable function f is called
invariant if $f \circ T = f$ a.e. Then we call T ergodic
provided that if A is invariant then either $\mu(A) = 0$ or
$\mu(A) = 1$; equivalently, any invariant function is
constant a.e. (Cornfeld et al. 1982). The most basic
example where ergodicity can be verified is the
following: if M is a compact Riemannian manifold and has
negative sectional curvatures at each point, then the
geodesic flow on each sphere bundle is ergodic
(Hopf-Hadamard). In general, verifying ergodicity
can still be very difficult. In the Hamiltonian case, the
first step is to pass to an energy surface. For example,
Sinai (1970) shows that one has ergodicity on an
energy surface of a classical model for molecular
motion, that is, a collection of hard spheres in a box.


Ergodic Theorems 


Koopman (1931) published the following significant
observation: if T is an invertible measure-preserving
transformation of a measure space $(X, \mathcal{B}, \mu)$, then
the operator U, defined on $L^2(X,\mathcal{B},\mu)$ by
$Uf(x) := f(Tx)$, is unitary. Thus, the association of
U with T replaces a nonlinear finite-dimensional
problem with a linear infinite-dimensional one.
Then von Neumann (1932) showed an intimate
connection between measure-preserving transformations
and unitary operators (the mean ergodic
theorem): let U be a unitary operator on a Hilbert
space H. Denote by P the orthogonal projection
onto the subspace $H_0 := \{f \in H \mid Uf = f\}$. For any
$f \in H$, one has

$$\lim_{N\to\infty} \left\| \frac{1}{N}\sum_{n=0}^{N-1} U^n f - Pf \right\|_H = 0$$

As a corollary, one can show that if $T: X \to X$ is
an ergodic measure-preserving transformation on a
probability space $(X,\mathcal{B},\mu)$ then, for any
$f \in L^1(X, \mathcal{B}, \mu)$,

$$\lim_{N\to\infty} \frac{1}{N}\sum_{n=0}^{N-1} f(T^n x) = \int_X f\,d\mu$$

in $L^1$-norm. We also know that T is ergodic if and
only if U has 1 as a simple eigenvalue. In the case of
a continuous invertible process, the setting is the
following. Let M be a manifold and $\Omega$ a volume on
M, with $\mu_\Omega$ the corresponding measure. If $F_t$ is a
volume-preserving flow on M, then $F_t$ induces a linear
one-parameter group of isometries on $H = L^2(M, \mu_\Omega)$
by $U_t(f) = f \circ F_{-t}$. Then $U_t$ has 1 as a simple
eigenvalue for all t if and only if $F_t$ is ergodic.

On the other hand, Birkhoff (1931) proved the
following almost everywhere statement (the pointwise
ergodic theorem): for any $f \in L^1(X, \mathcal{B}, \mu)$, there
exists a function $f^* \in L^1(X, \mathcal{B}, \mu)$ such that for $\mu$-a.e.
x, $f^*(Tx) = f^*(x)$ and

$$\lim_{N\to\infty} \frac{1}{N}\sum_{n=0}^{N-1} f(T^n x) = f^*(x)$$

In particular, if T is ergodic then for $\mu$-a.e.
x, $f^*(x) = \int_X f\,d\mu$. Thus, the Birkhoff theorem allows
one to prove the ergodic hypothesis of Boltzmann-Gibbs,
that is, the space average of an observable
function coincides with its time averages almost
everywhere, and guarantees the existence, almost
everywhere, of the mean number of occurrences in
any measurable set. On the other hand, physical
meanings of the mean ergodic theorem can be
explained as follows. We now turn to a one-parameter
flow of transformations. In order to study continuous
averages

$$\frac{1}{t}\int_0^t f(F_s x)\,ds$$

fix some $s_0 \in \mathbb{R}$ and consider the averages of the
form

$$\frac{1}{N}\sum_{n=0}^{N-1} f(T^n x)$$


252 Ergodic Theory 


where $T = F_{s_0}$. In reality, the measurements can be
done only approximately at times $t = 0, 1, \ldots, N - 1$,
and it is natural to consider the perturbed averages

$$\frac{1}{N}\sum_{n=0}^{N-1} f(T^{n+\delta_n} x)$$

where $\{\delta_n\}_{n<N}$ is an independent random sequence in
a small interval $(-\epsilon,\epsilon)$. Assuming that $T = F_{s_0}$ is
ergodic, we would like to know whether for large N,
the averages

$$\frac{1}{N}\sum_{n=0}^{N-1} f(T^{n+\delta_n} x)$$

are close to

$$\int_X f(x)\,d\mu(x)$$

The answer to this question is satisfactory if one
is concerned with norm convergence (see, e.g.,
Bergelson et al. (1994)).
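The time averages discussed above can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it takes the flow $F_s(x) = x + s\alpha \pmod 1$ on the circle with an irrational $\alpha$ (so that $T = F_1$ is an ergodic rotation), the observable $f(x) = \cos^2(2\pi x)$ with space average 1/2, and uniformly distributed perturbations $\delta_n \in (-\epsilon, \epsilon)$. None of these choices comes from the text, and the names `ALPHA`, `observable` and `birkhoff_average` are hypothetical.

```python
import math
import random

# Assumed example: rotation number of the circle flow (golden mean, irrational).
ALPHA = (math.sqrt(5.0) - 1.0) / 2.0


def observable(x):
    """Test observable on the circle R/Z; its space average is 1/2."""
    return math.cos(2.0 * math.pi * x) ** 2


def birkhoff_average(x0, n_steps, eps=0.0, seed=0):
    """Average of f(T^(n + delta_n) x0) over n = 0, ..., n_steps - 1.

    With eps = 0 this is the ordinary Birkhoff average along the orbit of x0.
    With eps > 0 each sampling time n is shifted by an independent random
    delta_n drawn uniformly from (-eps, eps).
    """
    rng = random.Random(seed)
    total = 0.0
    for n in range(n_steps):
        delta = rng.uniform(-eps, eps) if eps > 0.0 else 0.0
        total += observable((x0 + (n + delta) * ALPHA) % 1.0)
    return total / n_steps


if __name__ == "__main__":
    print("exact times    :", birkhoff_average(0.3, 20000))
    print("perturbed times:", birkhoff_average(0.3, 20000, eps=0.05))
```

Both averages come out close to the space average 1/2; the perturbed one fluctuates at the scale $N^{-1/2}$ expected from the independent sampling errors.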


Induced Transformations and 
Tower Constructions 


Suppose T is a measure-preserving transformation
on a probability space $(X,\mathcal{B},\mu)$ and $A \in \mathcal{B}$ with
$\mu(A) > 0$. Let us transform A into a space with
normalized measure by choosing the $\sigma$-algebra $\mathcal{B}_A$
consisting of all subsets $E \subset A$, $E \in \mathcal{B}$ and setting
$\mu_A(E) = \mu(A \cap E)/\mu(A)$. Let $R_A: A \to \mathbb{N} \cup \{\infty\}$ be the
"first return function," that is, $R_A(x) := \inf\{n \in \mathbb{N} \mid T^n x \in A\}$.
Then it follows from the Poincaré
recurrence theorem that $\mu_A(\{x \in A \mid R_A(x) < \infty\}) = 1$.
Define $T_A: \{x \in A \mid R_A(x) < \infty\} \to A$ by
$T_A x := T^{R_A(x)} x$, which is called the "induced transformation"
over A (constructed from T). For each
$n \in \mathbb{N}$ we define $A_n := \{x \in A \mid R_A(x) = n\}$. Then for
every $E \in \mathcal{B}_A$ we see that $T_A^{-1}E = \bigcup_{n=1}^{\infty} (A_n \cap T^{-n}E)$.
Hence, if T is invertible, then we have
immediately $\mu_A(T_A^{-1}E) = \mu_A(E)$; thus, $\mu_A$ is invariant
under $T_A$. Even if T is noninvertible, since for
every $k \geq 1$ the equality

$$\mu\Big(\bigcap_{j=0}^{k-1} T^{-j}A^c \cap T^{-k}(A \cap E)\Big) = \mu\big(A_{k+1} \cap T^{-(k+1)}E\big) + \mu\Big(\bigcap_{j=0}^{k} T^{-j}A^c \cap T^{-(k+1)}(A \cap E)\Big)$$

holds, we have $\mu(E) = \sum_{k=1}^{\infty} \mu(A_k \cap T^{-k}E) = \mu(T_A^{-1}E)$,
which allows us to see that $T_A$ preserves
$\mu_A$. We note that for every $E \in \mathcal{B}_A$ with $\mu(E) > 0$,
$\#\{n \in \mathbb{N} \mid T^n x \in E\} = \#\{n \in \mathbb{N} \mid T_A^n x \in E\}$. Therefore,
for a.e. $x \in A$, $\sum_{n=1}^{\infty} 1_E \circ T^n(x) = \sum_{n=1}^{\infty} 1_E \circ T_A^n(x)$.
This equality allows us to see that if $(T, \mu)$ is ergodic
then $(T_A, \mu_A)$ is ergodic. Indeed, suppose $T_A^{-1}E = E$
and $\mu(A \cap E^c) > 0$. Then for $x \in A \cap E^c$, we have
$\sum_{n=1}^{\infty} 1_E \circ T_A^n(x) = 0$. On the other hand, as
$E \subset \bigcup_{n=1}^{\infty} T^{-n}E$ (mod $\mu$), the set $\bigcup_{n=0}^{\infty} T^{-n}E$ is a
T-invariant set. Hence, ergodicity of $(T, \mu)$ allows us
to see that $\bigcup_{n=0}^{\infty} T^{-n}E = X$ ($\mu$ mod 0), which implies
$\sum_{n=1}^{\infty} 1_E \circ T^n(x) = \infty$. In the case when T is invertible,
we can write $\int_A R_A\,d\mu = \mu\big(\bigcup_{n\geq 0} T^n A\big)$, so that Kac's
formula (Darling and Kac 1957):

$$\int_A R_A\,d\mu_A = \mu(A)^{-1}$$

is valid when $\bigcup_{n\geq 0} T^n A = X$ (mod $\mu$). In particular,
$\mu\big(\bigcup_{n\geq 0} T^n A\big) = 1$ if T is ergodic. The key to
establishing the Kac formula is to show that $T^i A_k$
($0 \leq i \leq k - 1$, $k \geq 1$) are pairwise disjoint. This property
holds when T is invertible. On the other hand,
in the case when T is noninvertible, if
$\bigcup_{n\geq 0} T^{-n}A = X$ ($\mu$ mod 0) then we can establish,
for every $E \in \mathcal{B}$,


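$$\mu(E) = \mu(A) \int_A \sum_{i=0}^{R_A(x)-1} 1_E \circ T^i(x)\,d\mu_A(x) \qquad [1]$$

by noting that the following equality holds for all
$n \geq 1$:

$$\mu(E) = \sum_{k=1}^{n} \mu\big(A \cap \{R_A \geq k\} \cap T^{-(k-1)}E\big) + \mu\Big(\bigcap_{j=0}^{n-1} T^{-j}A^c \cap T^{-(n-1)}E\Big)$$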
Then choosing E = X allows one to establish the Kac
formula. As we have observed in the above, the
assumption that $\bigcup_{n\geq 0} T^{-n}A = X$ ($\mu$ mod 0) is automatically
satisfied if $(T, \mu)$ is ergodic. Conversely, if
$(T_A, \mu_A)$ is ergodic and $\bigcup_{n\geq 0} T^{-n}A = X$ (mod $\mu$) holds,
then $(T,\mu)$ is ergodic. We should remark that the
formula [1] allows one to obtain a T-invariant
measure $\mu$ when a $T_A$-invariant measure $\mu_A$ is
obtained previously. Even if $R_A$ is nonintegrable,
we may have a $\sigma$-finite infinite invariant measure.
Then if $\mu_A$ is ergodic, $\mu$ obtained by [1] is still
ergodic (i.e., $T^{-1}E = E$ implies that $\mu(E) = 0$ or
$\mu(E^c) = 0$) under the assumption that
$\bigcup_{n\geq 1} T^{-n}A = X$ (mod $\mu$) (cf. Aaronson (1997)). In
particular, the recent progress in the study of
nonhyperbolic systems strongly depends on such
constructions of induced maps over hyperbolic
regions. More specifically, if one can find a subset
A over which the induced map possesses an
invariant measure satisfying nice statistical properties,
then the formula [1] may give a $\sigma$-finite
invariant measure $\mu$ for the original map T which
reflects the statistical properties of the induced
system. The fundamental problem in the study of
nonhyperbolic phenomena arising from complex
systems is to clarify how to predict statistical
properties of nonhyperbolic systems $(T,\mu)$ by
using those of induced systems $(T_A,\mu_A)$ over
hyperbolic regions. We should note that induced
maps are well defined over positive-measure sets
with respect to a reference measure $\nu$ that is
"conservative." Here conservativity of $(T,\nu)$
means that there are no wandering sets of positive
measure with respect to $\nu$. In many cases, the
reference measures are physical measures (e.g.,
Lebesgue measures, conformal measures) which
satisfy nonsingularity with respect to T. Here
nonsingularity of $\nu$ means that $\nu \circ T^{-1} \sim \nu$. Then as
long as we obtain a $T_A$-invariant measure $\mu_A$ which
is equivalent to $\nu|_A$, the formula [1] may give us
a T-invariant $\sigma$-finite measure which is equivalent
to $\nu$.
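Kac's formula can be checked on a simple example. The sketch below assumes (as an illustration, not from the text) the irrational rotation $Tx = x + \alpha \pmod 1$ with Lebesgue measure and the set $A = [0, a)$, for which the mean return time should equal $1/\mu(A) = 1/a$. The average of $R_A$ along a $T_A$-orbit is computed as the mean gap between consecutive visits to A; the function name `mean_return_time` is hypothetical.

```python
import math


def mean_return_time(alpha, a_len, n_steps, x0=0.0):
    """Mean gap between consecutive visits of the orbit of x0 to A = [0, a_len).

    The orbit is that of the rotation T x = x + alpha (mod 1). Consecutive
    visit times differ by the first return time R_A evaluated along the
    orbit of the induced map T_A, so the mean gap is an ergodic average of R_A.
    """
    x = x0
    visits = []
    for n in range(n_steps):
        if x < a_len:
            visits.append(n)
        x = (x + alpha) % 1.0
    if len(visits) < 2:
        raise ValueError("orbit did not return to A; increase n_steps")
    return (visits[-1] - visits[0]) / (len(visits) - 1)


if __name__ == "__main__":
    golden = (math.sqrt(5.0) - 1.0) / 2.0
    # Kac's formula predicts a mean return time of 1 / 0.1 = 10.
    print(mean_return_time(golden, 0.1, 100000))
```

Since the rotation is ergodic, so is the induced map, and the orbit average of $R_A$ converges to $\int_A R_A\,d\mu_A$, which is the quantity appearing in Kac's formula.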

At the end of this section, we will explain that the
formula [1] can be obtained via a Rohlin tower
(Kakutani's skyscraper) in the case when T is
invertible. This tower construction is a dual construction
to the construction of induced transformations.
Assuming that we are given an invertible
transformation T of the measure space $(X,\mathcal{B},\mu)$,
consider a measurable integer-valued positive
function $f \in L^1(X,\mathcal{B},\mu)$. By using this function,
construct a new measure space $X^f$, whose points
are of the form (x, i), where $x \in X$, $1 \leq i \leq f(x)$ and i
is an integer. The $\sigma$-algebra $\mathcal{B}^f$ of measurable sets in
$X^f$ is constructed in an obvious way. The measure
$\mu^f$ is defined as follows: for any subset of the form
(A, i), $A \in \mathcal{B}$ we put

$$\mu^f((A,i)) := \frac{\mu(A)}{\int_X f\,d\mu}$$

Let

$$T^f(x,i) = \begin{cases} (x, i+1) & \text{if } i+1 \leq f(x) \\ (Tx, 1) & \text{if } i+1 > f(x) \end{cases}$$

It is easy to see that $T^f$ preserves $\mu^f$. The space can
naturally be visualized as a tower whose foundation
is the space X and which has f(x) floors over the
point $x \in X$. The space X is identified with the set of
points (x, 1). We see that $T = (T^f)_X$ and the
construction of $(X^f, T^f)$ is called the Rohlin tower
over X. Let T be an invertible measure-preserving
transformation on a probability space $(X,\mathcal{B},\mu)$
and $A \in \mathcal{B}$ with $\mu(A) > 0$. Suppose that
$X = \bigcup_{n=0}^{\infty} T^n A$ (mod $\mu$). Then T is represented as
the Rohlin tower $(A^{R_A}, \mathcal{B}^{R_A}, (\mu_A)^{R_A})$ over A as
follows. We define $p: (A^{R_A}, \mathcal{B}^{R_A}, (\mu_A)^{R_A}) \to (X, \mathcal{B}, \mu)$
by $p(x,i) := T^i x$. Then p is an isomorphism satisfying
$p \circ (T_A)^{R_A} = T \circ p$ (almost everywhere). Moreover,
we can verify that $(\mu_A)^{R_A} \circ p^{-1} = \mu$, by assuming
ergodicity of $\mu$. This is because for all $E \in \mathcal{B}$ we have

$$(\mu_A)^{R_A}(p^{-1}E) = \left(\int_A R_A\,d\mu_A\right)^{-1} \sum_{n=1}^{\infty} \sum_{i=1}^{n} \mu_A\big(A_n \cap T^{-i}E\big)$$

On the other hand, in the case when T is
noninvertible, the formula [1] is not necessarily
obtained by any tower construction, except in very
special cases. For example, even if T is not
invertible, the tower construction is valid if $T|_A$
and $T|_{A^c}$ are one-to-one and $TA = X$.


Convergence to Equilibrium States 
and Mixing Properties 
Let $T: X \to X$ be a measure-preserving transformation
on a probability space $(X, \mathcal{B}, \mu)$. We call T
"weak mixing" if for any $A, B \in \mathcal{B}$

$$\lim_{N\to\infty} \frac{1}{N}\sum_{n=0}^{N-1} \left| \mu(T^{-n}A \cap B) - \mu(A)\mu(B) \right| = 0$$

The weak-mixing property of $(T,\mu)$ can be represented
as follows: for all $f, g \in L^2(X, \mathcal{B}, \mu)$,

$$\lim_{N\to\infty} \frac{1}{N}\sum_{n=0}^{N-1} \left| \int_X f(T^n x)\,g(x)\,d\mu - \int_X f\,d\mu \int_X g\,d\mu \right| = 0$$

and this is equivalent to the ergodicity of $(T \times T, \mu \times \mu)$.
Moreover, $(T, \mu)$ is weak mixing if and
only if the unitary operator $U: H \to H$ defined by
$Uf(x) = f(Tx)$ has no eigenfunctions that are not
constants ($\mu$ mod 0). We say that the operator U has
continuous spectrum if there are no eigenvectors. If
H is the closure of the linear span of the
eigenvectors, then we say that the operator U has
pure point spectrum. The weak-mixing property of
$(T, \mu)$ just implies that U restricted to the orthogonal
complement of the subspace consisting of
constant functions has continuous spectrum. We
recall that if U has one as a simple eigenvalue then T
is ergodic. Additionally, if there are no other
eigenvalues, then T is weakly mixing. Hence, if T
is weak mixing, then it is necessarily ergodic. The
next property corresponds to the term "relaxation"
in physics literature which is used to describe
processes under which the system passes to a certain
stationary state independently of its original state.
We call T (strong) mixing if for any $A, B \in \mathcal{B}$

$$\lim_{n\to\infty} \mu(T^{-n}A \cap B) = \mu(A)\mu(B)$$

Then $(T, \mu)$ is (strong) mixing if and only if for any
$f, g \in L^2(X, \mathcal{B}, \mu)$,

$$\lim_{n\to\infty} \int_X (f \circ T^n)\,g\,d\mu = \int_X f\,d\mu \int_X g\,d\mu$$

and mixing is necessarily weak mixing. Moreover, for
any probability measure $\nu$ absolutely continuous with
respect to $\mu$, one can show that $\lim_{n\to\infty} \nu(T^{-n}A) = \mu(A)$
for every $A \in \mathcal{B}$. Thus, any nonequilibrium
distribution tends to an equilibrium one with time.
The mixing property has a significant meaning from
a physical point of view, as it implies decay of
correlation of observable functions; moreover, limiting
distributions of averaged observables are determined
by the decay rates of correlation functions in
many cases (e.g., hyperbolic systems). For any
$f \in L^2(X,\mathcal{B},\mu)$ we consider the scalar products
$s_n = s_n(f) = (U^n f, f)$, $n \geq 0$ and define $s_n := \overline{s_{-n}}$ for
$n < 0$. The sequence $\{s_n\}_{n\in\mathbb{Z}}$ is positive definite and
so by Bochner's theorem, we can write
$s_n(f) = \int_{S^1} \exp[2\pi i n\lambda]\,d\sigma_f(\lambda)$, where $\sigma_f$ is a finite
Borel measure on the unit circle $S^1$ and satisfies the
condition that $\sigma_f(S^1) = \|f\|^2$. Such a measure is called
a spectral measure of f. We see that T is mixing iff for
any $f \in L^2(X,\mathcal{B},\mu)$ with $\int_X f\,d\mu = 0$ the Fourier
coefficients $\{s_n\}$ of the spectral measure $\sigma_f$ tend to
zero as $|n| \to \infty$. Let $(X,\mathcal{B},\mu)$ be isomorphic to
$([0, 1], \mathcal{B}_0, \lambda)$, where $\mathcal{B}_0$ is the Borel $\sigma$-algebra on
[0, 1] and $\lambda$ is the normalized Lebesgue measure of
[0, 1]. Then we call a measure-preserving transformation
T on $(X,\mathcal{B},\mu)$ an exact endomorphism if
$\bigcap_{n\geq 0} T^{-n}\mathcal{B} = \{X, \emptyset\}$ ($\mu$ mod 0). We can verify that an
exact endomorphism is (strong) mixing (Rohlin 1964).
Moreover, $\mu$ is exact if for any positive-measure set
$A \in \mathcal{B}$ with $T^n A \in \mathcal{B}$ ($\forall n \geq 0$), $\lim_{n\to\infty} \mu(T^n A) = 1$
holds. Let T be a nonsingular transformation on
$(X, \mathcal{B},\nu)$, that is, $\nu \circ T^{-1} \sim \nu$. Then we can define the
transfer (Perron-Frobenius) operator $\mathcal{L}_\nu: L^1(X,\nu) \to L^1(X,\nu)$
by $\mathcal{L}_\nu f := d\big((f\nu) \circ T^{-1}\big)/d\nu$, which satisfies

$$\int_X (\mathcal{L}_\nu f)\,g\,d\nu = \int_X f\,(g \circ T)\,d\nu \quad (\forall g \in L^\infty(X,\nu))$$
AX dX 


We say that a nonsingular measure $\nu$ is exact if
$A \in \bigcap_{n\geq 0} T^{-n}\mathcal{B}$ implies $\nu(A)\nu(A^c) = 0$. By Lin's
theorem (Lin 1971) the exactness of $\nu$ can be
described as follows: for all $f \in L^1(X,\nu)$ with $\int_X f\,d\nu = 0$,
$\lim_{n\to\infty} \|\mathcal{L}_\nu^n f\|_1 = 0$. Let $\mu = h\nu$ be an exact
T-invariant probability measure equivalent to $\nu$.
Then the upper bounds of mixing rates of the exact
measure $\mu = h\nu$ are determined by the speed of $L^1$-convergence
of the iterated transfer operators $\{\mathcal{L}_\nu^n\}$.
This is because $\mathcal{L}_\nu h = h$ and for every $f \in L^1(X,\nu)$
with $\int_X f\,d\nu = 1$, $\lim_{n\to\infty} \|\mathcal{L}_\nu^n f - h\|_1 = 0$. Hence, the
property $\mathcal{L}_\mu f = h^{-1}\mathcal{L}_\nu(hf)$ allows one to see that for
every $f, g \in L^\infty(X, \mu)$ the correlation function

$$C_{f,g}(n) := \left| \int_X (f \circ T^n)\,g\,d\mu - \int_X f\,d\mu \int_X g\,d\mu \right|$$

is bounded from above by

$$\|f\|_\infty \left\| \mathcal{L}_\mu^n g - \int_X g\,d\mu \right\|_{1,\mu} = \|f\|_\infty \left\| \mathcal{L}_\nu^n(gh) - P(gh) \right\|_{1,\nu}$$

where $P: L^1(X,\nu) \to L^1(X,\nu)$ is a linear operator
defined by $Pf := h \int_X f\,d\nu$. The operator P is the one-dimensional
projection operator associated to the
eigenvalue 1 (which is maximal in many cases) of $\mathcal{L}_\nu$,
satisfying $P^2 = P$ and $P\mathcal{L}_\nu = \mathcal{L}_\nu P = P$. Moreover, since
$\mathcal{L}_\nu^n - P = (\mathcal{L}_\nu - P)^n$, the exponential decay of mixing
rates follows from the spectral gap of $\mathcal{L}_\nu$, that is, 1 is
the simple isolated maximal eigenvalue of $\mathcal{L}_\nu$.
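Exponential decay of correlations can be seen concretely in a standard example. The sketch below assumes (as an illustration, not from the text) the doubling map $Tx = 2x \pmod 1$ on [0, 1] with Lebesgue measure and the mean-zero observable $f(x) = x - 1/2$. For this choice a Fourier series computation gives $\int_0^1 f(T^n x) f(x)\,dx = 2^{-n}/12$, which the code reproduces by midpoint quadrature. The name `correlation` is hypothetical.

```python
def correlation(n, grid=4096):
    """Midpoint-rule value of C(n) = integral of f(T^n x) f(x) dx on [0, 1].

    Here T x = 2x mod 1 is the doubling map and f(x) = x - 1/2. The grid
    size is a power of two, so the midpoints are dyadic rationals and
    (x * 2**n) % 1.0 is computed without rounding error.
    """
    total = 0.0
    for i in range(grid):
        x = (i + 0.5) / grid
        tx = (x * 2 ** n) % 1.0  # T^n x for the doubling map
        total += (tx - 0.5) * (x - 0.5)
    return total / grid


if __name__ == "__main__":
    for n in range(7):
        print(n, correlation(n), 2.0 ** (-n) / 12.0)
```

The correlations halve at every step, in line with a spectral gap for the transfer operator acting on sufficiently regular observables in this example.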


Entropy and Reversibility 


We recall one of the fundamental problems of
ergodic theory, namely deciding when two automorphisms
$T_1, T_2$ of probability spaces $(X_1, \mathcal{B}_1, \mu_1)$
and $(X_2, \mathcal{B}_2, \mu_2)$ are equivalent. The approach developed
for this problem involved the study of spectral
properties of the associated isometric operators
$U_i: L^2(X_i,\mu_i) \to L^2(X_i, \mu_i)$ ($i = 1, 2$) and is based on
the concept of the entropy of an automorphism T,
introduced by Kolmogorov (1958). The entropy is a
non-negative number, which is the same for equivalent
automorphisms. For example, the entropy of the
Bernoulli shift $\sigma: \prod_{i\in\mathbb{Z}}\{1,2,\ldots,d\} \to \prod_{i\in\mathbb{Z}}\{1,2,\ldots,d\}$
with probability vector $(p_1,p_2,\ldots,p_d)$ is equal to
$-\sum_{i=1}^{d} p_i \log p_i$. A remarkable theorem of Ornstein
(1970) states that Bernoulli shifts with the same
entropy are equivalent. On the other hand, Shannon
(1948) introduced a notion of entropy in his work on
information theory, which is essentially the same as
Kolmogorov's. Let $T: X \to X$ be a measure-preserving
transformation on a probability space $(X, \mathcal{B}, \mu)$. We
define the entropy of a measurable partition $\alpha$ of X by
$H_\mu(\alpha) = -\sum_{A\in\alpha} \mu(A) \log \mu(A)$ and define the entropy
of T with respect to $\alpha$ by

$$h_\mu(T,\alpha) := \lim_{n\to\infty} \frac{1}{n} H_\mu\Big(\bigvee_{i=0}^{n-1} T^{-i}\alpha\Big)$$

Then the (measure-theoretic) entropy of T is defined by

$$h_\mu(T) := \sup_{\alpha:\,H_\mu(\alpha)<\infty} h_\mu(T,\alpha)$$


The next Abramov theorem gives an important
method of practical computation: let $\{\alpha_n\}_{n\geq 1}$ be an
increasing sequence of partitions with $H_\mu(\alpha_n) < \infty$
($\forall n \geq 1$) and such that $\bigvee_{n\geq 1} \alpha_n$ generates the
$\sigma$-algebra $\mathcal{B}$. Then $h_\mu(T) = \lim_{n\to\infty} h_\mu(T,\alpha_n)$. We
say that a partition $\alpha$ is a generator for a
noninvertible measure-preserving transformation T on
a probability space $(X, \mathcal{B}, \mu)$ if $\bigvee_{i=0}^{\infty} T^{-i}\alpha$ generates $\mathcal{B}$.
If T is invertible then a partition $\alpha$ is called a generator
if $\bigvee_{i=-\infty}^{\infty} T^{i}\alpha$ generates $\mathcal{B}$. In the case when $\alpha$ is a
generator with $H_\mu(\alpha) < \infty$, by the Kolmogorov-Sinai
theorem we have $h_\mu(T) = h_\mu(T, \alpha)$. Let $\alpha_n(x)$ denote
the element of $\bigvee_{i=0}^{n-1} T^{-i}\alpha$ containing $x \in X$. By
the Shannon-McMillan-Breiman theorem, if T is a
measure-preserving transformation of the probability
space $(X, \mathcal{B}, \mu)$ and $\alpha$ is a partition of X with $H_\mu(\alpha) < \infty$,
then $-(1/n)\log \mu(\alpha_n(x))$ converges $\mu$-a.e. and in
$L^1(X, \mu)$ as $n \to \infty$. If T is ergodic, then the limit
coincides with $h_\mu(T, \alpha)$. Now we can apply these
results to piecewise expanding transitive (countable)
Markov transformations T of $X \subset \mathbb{R}^d$. More specifically,
let $\nu$ be the normalized Lebesgue measure of X. It
is well known that under certain conditions there
exists a unique ergodic invariant probability measure
$\mu$ equivalent to $\nu$. Then we can establish Rohlin's
entropy formula (Rohlin 1964):

$$h_\mu(T) = \int_X \log|\det DT|\,d\mu$$

under the assumptions that $H_\mu(\alpha) < \infty$ and
$\log|\det DT| \in L^1(X,\nu)$. In particular, if $\alpha$ is a finite
partition and $\phi = -\log|\det DT|$ is piecewise Hölder
continuous, then the entropy formula just implies
that $\mu$ is an equilibrium state for the potential $\phi$ in
the following sense:

$$h_\mu(T) + \int_X \phi\,d\mu = \sup\Big\{ h_m(T) + \int_X \phi\,dm \;\Big|\; m \text{ is a } T\text{-invariant Borel probability measure on } X \Big\},$$

where the right-hand side is called the pressure for $\phi$
(Walters 1981).
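The entropy formula for Bernoulli shifts quoted above is easy to evaluate. The following sketch computes $-\sum_i p_i \log p_i$ and compares two probability vectors that have the same entropy, so that by Ornstein's theorem the corresponding shifts are equivalent. The particular vectors are an assumed illustration (they are the classical pair usually attributed to Meshalkin), and the function name `bernoulli_entropy` is hypothetical.

```python
import math


def bernoulli_entropy(p):
    """Entropy -sum p_i log p_i of the Bernoulli shift with probability vector p."""
    if any(q < 0.0 for q in p) or abs(sum(p) - 1.0) > 1e-12:
        raise ValueError("p must be a probability vector")
    # Terms with p_i = 0 contribute nothing (0 log 0 is taken to be 0).
    return -sum(q * math.log(q) for q in p if q > 0.0)


if __name__ == "__main__":
    uniform4 = [0.25, 0.25, 0.25, 0.25]
    skewed5 = [0.5, 0.125, 0.125, 0.125, 0.125]
    # Both values equal log 4, although the alphabets have different sizes.
    print(bernoulli_entropy(uniform4), bernoulli_entropy(skewed5), math.log(4.0))
```

By contrast, the shifts with vectors (1/2, 1/2) and (1/3, 1/3, 1/3) have entropies log 2 and log 3 and are therefore not equivalent.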

We now turn our attention to results which relate
entropy to Lyapunov exponents in the context of smooth
invertible systems. Let T be a diffeomorphism of a
compact manifold M. We say that $x \in M$ is a regular
point of T if there exist numbers $\lambda_1(x) > \lambda_2(x) > \cdots > \lambda_d(x)$
and a decomposition $T_xM = E_1(x) \oplus E_2(x) \oplus \cdots \oplus E_d(x)$
such that

$$\lim_{n\to\pm\infty} \frac{1}{n} \log \|DT^n(x)u\| = \lambda_j(x)$$

for every $0 \neq u \in E_j(x)$ and every $1 \leq j \leq d$. Let $\Lambda$ be
the set of regular points of T. Then we define a function

$$\chi(x) := \sum_{\lambda_j(x)>0} \lambda_j(x)\,\dim E_j(x)$$

In the case when all Lyapunov exponents at x are
negative, we put $\chi(x) = 0$. Then for every T-invariant
Borel probability measure $\mu$ on $(M, \mathcal{B})$, it holds that
$h_\mu(T) \leq \int_M \chi\,d\mu$ (Ruelle 1978). Moreover, the equality
holds whenever T is $C^1$-Hölder and $\mu$ is absolutely
continuous with respect to the Lebesgue measure of M
(Pesin 1977). Let T be a transitive $C^1$-Hölder Anosov
diffeomorphism. Let $E^s, E^u$ denote the stable and unstable
fiber bundles of T. Suppose that $\mu_+$ is the unique
T-invariant probability measure which satisfies

$$\int_M f\,d\mu_+ = \lim_{n\to\infty} \frac{1}{n}\sum_{k=0}^{n-1} f(T^k x)$$

for every continuous function $f: M \to \mathbb{R}$ and almost
every $x \in M$ with respect to the Lebesgue
measure. The probability measure is the so-called
Sinai-Ruelle-Bowen (SRB) measure. Then we have

$$h_{\mu_+}(T) = \int_M \log\big|\det DT(x)|_{E^u_x}\big|\,d\mu_+(x)$$

On the other hand, we have

$$h_{\mu_+}(T) = \int_M \log\big|\det DT^{-1}(x)|_{E^s_x}\big|\,d\mu_+(x) + \int_M \log|\det DT(x)|\,d\mu_+(x)$$

We also define the anti-SRB measure $\mu_-$ by replacing T by
$T^{-1}$. Then the SRB measure $\mu_+$ is absolutely continuous
with respect to the Lebesgue measure of M iff $\mu_+$
coincides with the anti-SRB measure $\mu_-$ (Bowen
1975). Hence, the SRB measure is absolutely continuous
iff $\int_M \log|\det DT(x)|\,d\mu_+(x) = 0$. This property is
sometimes explained as "zero entropy production"
and also as "reversibility" in the context of nonequilibrium
statistical mechanics (Ruelle 1997).
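A standard example in which these statements can be checked by hand is a hyperbolic toral automorphism. The sketch below assumes (as an illustration, not from the text) the map of the 2-torus induced by the matrix with rows (2, 1) and (1, 1). This map is Anosov and preserves Lebesgue measure, its derivative has determinant 1 (so the entropy production integral vanishes), and its single positive Lyapunov exponent is $\log\big((3+\sqrt{5})/2\big)$, which by the entropy formula above is also its entropy with respect to Lebesgue measure. The names `CAT` and `top_lyapunov_exponent` are hypothetical.

```python
import math

# Assumed example: matrix of a hyperbolic automorphism of the 2-torus.
CAT = ((2.0, 1.0), (1.0, 1.0))


def top_lyapunov_exponent(matrix, n_steps=2000, u=(1.0, 0.0)):
    """Estimate lim (1/n) log |A^n u| for a constant 2x2 derivative matrix A.

    The vector is renormalized after every step to avoid overflow, and the
    logarithms of the successive growth factors are accumulated.
    """
    (a, b), (c, d) = matrix
    ux, uy = u
    log_growth = 0.0
    for _ in range(n_steps):
        ux, uy = a * ux + b * uy, c * ux + d * uy
        norm = math.hypot(ux, uy)
        log_growth += math.log(norm)
        ux, uy = ux / norm, uy / norm
    return log_growth / n_steps


if __name__ == "__main__":
    (a, b), (c, d) = CAT
    print("log |det DT|     :", math.log(abs(a * d - b * c)))
    print("Lyapunov exponent:", top_lyapunov_exponent(CAT))
    print("log((3+sqrt5)/2) :", math.log((3.0 + math.sqrt(5.0)) / 2.0))
```

Because the derivative is constant, the exponent does not depend on the base point, and the vanishing of $\log|\det DT|$ is consistent with the SRB measure being Lebesgue measure in this example.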


See also: Chaos and Attractors; Determinantal Random 
Fields; Dissipative Dynamical Systems of Infinite 
Dimension; Dynamical Systems and Thermodynamics; 
Finitely Correlated States; Fourier Law; Fractal 
Dimensions in Dynamics; Homeomorphisms and 
Diffeomorphisms of the Circle; Hyperbolic Billiards; 
Hyperbolic dynamical Systems; Intermittency in 


256 Euclidean Field Theory 


Turbulence; Large Deviations in Equilibrium Statistical

Introduction 


In this article, we consider Euclidean field theory as
a formulation of quantum field theory which lives in
some Euclidean space, and is expressed in probabilistic
terms. Methods arising from Euclidean field
theory have been introduced in a very successful
way in the study of concrete models of constructive
quantum field theory.

Euclidean field theory was initiated by Schwinger
(1958) and Nakano (1959), who proposed to study
the vacuum expectation values of field products
analytically continued into the Euclidean region
(Schwinger functions), where the first three (spatial)
coordinates of a world point are real and the last one
(time) is purely imaginary (Schwinger points). The
possibility of introducing Schwinger functions, and
their invariance under the Euclidean group are immediate
consequences of the now classic formulation of
quantum field theory in terms of vacuum expectation
values given by Wightman (Streater and Wightman
1964). The convenience of dealing with the Euclidean
group, with its positive-definite scalar product, instead
of the Lorentz group, is evident, and has been exploited
by several authors, in different contexts.

The next step was made by Symanzik (1966), who
realized that Schwinger functions for boson fields
have a remarkable positivity property, which makes it
possible to introduce Euclidean fields in their own right.
Symanzik also pointed out an analogy between
Euclidean field theory and classical statistical
mechanics, at least for some interactions (Symanzik
1969).

This analogy was successfully extended, with a 
different interpretation, to all boson interactions by 
Guerra et al. (1975), with the purpose of using 
rigorous results of modern statistical mechanics for 


the study of constructive quantum field theory, 
within the program advocated by Wightman (1967), 
and further pursued by Glimm and Jaffe (see Glimm 
and Jaffe (1981) for an overall presentation). 

The most dramatic advance of Euclidean theory 
was due to Nelson (1973a, b). He was able to isolate 
a crucial property of Euclidean fields (the Markov 
property) and gave a set of conditions for these 
fields, which allow us to derive all properties of 
relativistic quantum fields satisfying Wightman 
axioms. The Nelson theory is very deep and rich in 
new ideas. Even after so many years since the basic 
papers were published, we lack a complete under- 
standing of the radical departure from the conven- 
tional theory afforded by Nelson’s ideas, especially 
about their possible further developments. 

By using the Nelson scheme, in particular a very 
peculiar symmetry property, it was very easy to prove 
(Guerra 1972) the convergence of the ground-state 
energy density, and the van Hove phenomenon in the 
infinite-volume limit for two-dimensional boson 
theories. A subsequent analysis (Guerra et al. 1972) 
gave other properties of the infinite-volume limit of 
the theory, and allowed a remarkable simplification 
in the proof of a very important regularity property 
for fields, previously established by Glimm and Jaffe. 

Since then, all work on constructive quantum field 
theory has exploited in different ways ideas coming 
from Euclidean field theory. Moreover, a very 
important reconstruction theorem has been estab- 
lished by Osterwalder and Schrader (1973), allowing 
a reconstruction of relativistic quantum fields from 
the Euclidean Schwinger functions, and avoiding the 
previously mentioned Nelson reconstruction theorem, 
which is technically more difficult to handle. 

This article is intended to be an introduction to the 
general structure of Euclidean quantum field theory, 
and to some of the applications to constructive 
quantum field theory. Our purpose is to show that, 
50 years after its introduction, the Euclidean theory is 
still interesting, both from the point of view of 
technical applications and physical interpretation. 

The article is organized as follows. In the next 
section, by considering simple systems made of a 
single spinless relativistic particle, we introduce the 
relevant structures in both Euclidean and Minkowski 
worlds. In particular, a kind of (pre)Markov property 
is introduced already at the one-particle level. 

Next we present a description of the procedure of 
second quantization on the one-particle structure. 
The free Markov field is introduced, and its crucial 
Markov property explained. Following Nelson, we 
use probabilistic concepts and methods, whose 
relevance for constructive quantum field theory 
became immediately more and more apparent. The 
very structure of classical statistical mechanics for 
Euclidean fields is firmly based on these probabil- 
istic methods. This is followed by an introduction of 
interaction, and we show the connection between 
the Markov theory and the Hamiltonian theory, for 
two-dimensional space-cutoff interacting scalar 
fields. In particular, we present the Feynman-Kac- 
Nelson formula that gives an explicit expression of 
the semigroup generated by the space-cutoff 
Hamiltonian in Fock space. We also deal with some 
applications to constructive quantum field theory. 
This is followed by a short discussion about the 
physical interpretation of the theory. In particular, 
we discuss the Osterwalder-Schrader reconstruction 
theorem on Euclidean Schwinger functions, and the 
Nelson reconstruction theorem on Euclidean fields. 
For the sake of completeness, we sketch the main 
ideas of a proposal, advanced in Guerra and 
Ruggiero (1973), according to which the Euclidean 
field theory can be interpreted as a stochastic field 
theory in the physical Minkowski spacetime. 

Our treatment will be as simple as possible, by relying 
on the basic structural properties, and by describing 
methods of presumably very long lasting power. The 
emphasis given to probabilistic methods, and to the 
statistical mechanics analogy, is a result of the historical 
development. Our opinion is that not all possibilities 
of Euclidean field theory have been fully exploited 
yet, both from technical and physical points of view. 


One-Particle Systems 


A system made of only one relativistic scalar 
particle, of mass $m > 0$, has a quantum state space 
represented by the positive-frequency solutions of 
the Klein-Gordon equation. In momentum space, 
with points $p_\mu$, $\mu = 0, 1, 2, 3$, let us introduce the 
upper mass hyperboloid, characterized by the constraints 
$p^2 \equiv p_0^2 - \sum_{i=1}^{3} p_i^2 = m^2$, $p_0 \ge m$, and the 
relativistic invariant measure on it, formally given 
by $d\mu(p) = \theta(p_0)\,\delta(p^2 - m^2)\,dp$, where $\theta$ is the step 
function, $\theta(x) = 1$ if $x \ge 0$ and $\theta(x) = 0$ otherwise, 
and $dp$ is the four-dimensional Lebesgue measure. 
The Hilbert space of quantum states $F$ is given by 
the square-integrable functions on the mass hyperboloid 
equipped with the invariant measure $d\mu(p)$. 
Since in some reference frame the mass hyperboloid 
is uniquely characterized by the space values of the 
momentum $\mathbf{p}$, with the energy given by 
$p_0 = \omega(\mathbf{p}) = \sqrt{\mathbf{p}^2 + m^2}$, the Hilbert space $F$ of the states 
is, in fact, made of those complex-valued tempered 
distributions $f$ in the configuration space $\mathbb{R}^3$ whose 
Fourier transforms, $\tilde f(\mathbf{p})$, are square-integrable functions 
in momentum space with respect to the image 
of the relativistic invariant measure, $d\mathbf{p}/2\omega(\mathbf{p})$, where 
$d\mathbf{p}$ is the Lebesgue measure in momentum space. 
The scalar product on $F$ is defined by 

$$\langle f, g \rangle_F = (2\pi)^3 \int \tilde f^*(\mathbf{p})\,\tilde g(\mathbf{p})\,\frac{d\mathbf{p}}{2\omega(\mathbf{p})},$$

where we have normalized the Fourier transform in 
such a way that 

$$f(\mathbf{x}) = \int \exp(i\mathbf{p}\cdot\mathbf{x})\,\tilde f(\mathbf{p})\,d\mathbf{p},$$
$$\tilde f(\mathbf{p}) = (2\pi)^{-3} \int \exp(-i\mathbf{p}\cdot\mathbf{x})\,f(\mathbf{x})\,d\mathbf{x},$$
$$\int \exp(i\mathbf{p}\cdot\mathbf{x})\,d\mathbf{p} = (2\pi)^3\,\delta(\mathbf{x}).$$


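The scalar product on $F$ can also be expressed in the 
form 

$$\langle f, g \rangle_F = \iint f^*(\mathbf{x}')\,W(\mathbf{x}' - \mathbf{x})\,g(\mathbf{x})\,d\mathbf{x}'\,d\mathbf{x},$$

where we have introduced the two-point Wightman 
function at fixed time, defined by 

$$W(\mathbf{x}' - \mathbf{x}) = (2\pi)^{-3} \int \exp\bigl(i\mathbf{p}\cdot(\mathbf{x}' - \mathbf{x})\bigr)\,\frac{d\mathbf{p}}{2\omega(\mathbf{p})}.$$

A unitary irreducible representation of the Poincaré 
group can be defined on $F$ in the obvious way. In 
particular, the generators of space translations are 
given by multiplication by the components of $\mathbf{p}$ in 
momentum space, and the generator of time translations 
(the energy of the particle) is given by $\omega(\mathbf{p})$. 

For the scalar product of time-evolved wave 
functions, we can write 

$$\langle \exp(-it'\omega)f, \exp(-it\omega)g \rangle_F = \iint f^*(\mathbf{x}')\,W(t - t', \mathbf{x}' - \mathbf{x})\,g(\mathbf{x})\,d\mathbf{x}'\,d\mathbf{x},$$

where we have introduced the two-point Wightman 
function, defined by 

$$W(t - t', \mathbf{x}' - \mathbf{x}) = (2\pi)^{-3} \int \exp\bigl(-i(t - t')\omega(\mathbf{p})\bigr)\,\exp\bigl(i\mathbf{p}\cdot(\mathbf{x}' - \mathbf{x})\bigr)\,\frac{d\mathbf{p}}{2\omega(\mathbf{p})}.$$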

To the physical single-particle system living in 
Minkowski spacetime, we associate a kind of 
mathematical image, living in Euclidean space, 
from which all properties of the physical system 
can be easily derived. We start from the two-point 
Schwinger function 

$$S(x) = \frac{1}{(2\pi)^4} \int \frac{\exp(ip\cdot x)}{p^2 + m^2}\,dp,$$

which is the analytic continuation of the previously 
given two-point Wightman function into the Schwinger 
points. Here $x, p \in \mathbb{R}^4$, and $p\cdot x = \sum_{i=1}^{4} x_i p_i$. Here $dp$ 
and $dx$ are the Lebesgue measures in the $\mathbb{R}^4$ momentum 
and configuration spaces, respectively. The function 
$S(x)$ is positive and analytic for $x \ne 0$, decreases as 
$\exp(-m\|x\|)$ as $x \to \infty$, and satisfies the equation 

$$(-\Delta + m^2)\,S(x) = \delta(x),$$


where $\Delta = \sum_{i=1}^{4} \partial^2/\partial x_i^2$ is the Laplacian in four 
dimensions. 

The mathematical image we are looking for is 
described by the Hilbert space $N$ of those tempered 
distributions in four-dimensional configuration space 
$\mathbb{R}^4$ whose Fourier transforms are square integrable 
with respect to the measure $dp/(p^2 + m^2)$. The scalar 
product on $N$ is defined by 

$$\langle f, g \rangle_N = (2\pi)^4 \int \tilde f^*(p)\,\tilde g(p)\,\frac{dp}{p^2 + m^2}.$$

Four-dimensional Fourier transforms are normalized 
as follows: 

$$f(x) = \int \exp(ip\cdot x)\,\tilde f(p)\,dp,$$
$$\tilde f(p) = (2\pi)^{-4} \int \exp(-ip\cdot x)\,f(x)\,dx,$$
$$\int \exp(ip\cdot x)\,dp = (2\pi)^4\,\delta(x).$$

We also write 

$$\langle f, g \rangle_N = \iint f^*(x)\,S(x - y)\,g(y)\,dx\,dy = \langle f, (-\Delta + m^2)^{-1} g \rangle,$$

where $\langle\,,\rangle$ is the ordinary Lebesgue product defined 
on Fourier transforms and, in momentum space, 
$(-\Delta + m^2)^{-1}$ amounts to a multiplication by 
$(p^2 + m^2)^{-1}$. The Schwinger function $S(x - y)$ is 
formally the kernel of the operator $(-\Delta + m^2)^{-1}$. 
The Hilbert space $N$ is the carrier space of a 
unitary (nonirreducible) representation of the four-dimensional 
Euclidean group $E(4)$. In fact, let $(a, R)$ 
be an element of $E(4)$, 

$$(a, R): \mathbb{R}^4 \to \mathbb{R}^4, \qquad x \mapsto Rx + a,$$

where $a \in \mathbb{R}^4$, and $R$ is an orthogonal matrix, 
$RR^T = R^T R = 1_4$. Then the transformation $u(a, R)$ 
defined by 

$$u(a, R): N \to N, \qquad f(x) \mapsto (u(a, R)f)(x) = f(R^{-1}(x - a))$$

provides the representation. In particular, we consider 
the reflection $r_0$ with respect to the hyperplane 
$x_4 = 0$, and the translations $u(t)$ in the $x_4$-direction. 
Then we have $r_0 u(t) r_0 = u(-t)$, and analogously for 
other hyperplanes. 

Now we introduce a local structure on $N$ by 
considering, for any closed region $A$ of $\mathbb{R}^4$, the 
subspace $N_A$ of $N$ made by distributions in $N$ with 
support on $A$. We call $e_A$ the orthogonal projection 
on $N_A$. It is obvious that if $A \subset B$ then $N_A \subset N_B$ and 
$e_A e_B = e_B e_A = e_A$. A kind of (pre)Markov property 
for one-particle systems is introduced as follows. 
Consider a closed three-dimensional piecewise 
smooth manifold $\sigma$, which divides $\mathbb{R}^4$ in two closed 
regions $A$ and $B$, having $\sigma$ in common. Therefore, 
$\sigma \subset A$, $\sigma \subset B$, $A \cap B = \sigma$, $A \cup B = \mathbb{R}^4$. Let $N_A, N_B, N_\sigma$ 
and $e_A, e_B, e_\sigma$ be the associated subspaces and 
projections, respectively. Then $N_\sigma \subset N_A$, $N_\sigma \subset N_B$, 
and $e_\sigma e_A = e_A e_\sigma = e_\sigma$, $e_\sigma e_B = e_B e_\sigma = e_\sigma$. It is very 
simple to prove the following: 


Theorem 1 Let $e_A, e_B, e_\sigma$ be defined as above; then 
$e_A e_B = e_B e_A = e_\sigma$. 


Clearly, it is enough to show that for any $f \in N$ 
we have $e_A e_B f \in N_\sigma$. In that case, $e_\sigma e_A e_B f = e_A e_B f$, 
from which the theorem easily follows. Since $e_A e_B f$ 
has support on $A$, we must show that for any $C_0^\infty$ 
function $g$ with support in the interior $A^\circ$ of $A$, we have 
$\langle g, e_A e_B f \rangle = 0$. Then $e_A e_B f$ has support on $\sigma$, and 
the proof is complete. Now we have 

$$\langle g, e_A e_B f \rangle = \langle (-\Delta + m^2) g, e_A e_B f \rangle_N = \langle e_A (-\Delta + m^2) g, e_B f \rangle_N = \langle (-\Delta + m^2) g, e_B f \rangle_N = \langle g, e_B f \rangle = 0,$$

where we have used the definition of $\langle\,,\rangle_N$ in terms of 
$\langle\,,\rangle$, the fact that $e_A(-\Delta + m^2)g = (-\Delta + m^2)g$, since 
$(-\Delta + m^2)g$ has support on $A^\circ$, and the fact that $e_B f$ 
has support on $B$. This ends the proof of the 
(pre)Markov property for one-particle systems. 

A very important role in the theory is played by 
subspaces of $N$ associated to hyperplanes in $\mathbb{R}^4$. To 
fix ideas, consider the hyperplane $x_4 = 0$ and the 
associated subspace $N_0$. A tempered distribution in 
$N$ with support on $x_4 = 0$ has necessarily the form 
$(f \otimes \delta_0)(x) = f(\mathbf{x})\,\delta(x_4)$ with $f \in F$. By using the 
basic magic formula, for $x \ge 0$ and $M > 0$, 

$$\int_{-\infty}^{+\infty} \frac{\exp(ipx)}{p^2 + M^2}\,dp = \frac{\pi}{M}\exp(-Mx),$$

it is immediate to verify that $\|f \otimes \delta_0\|_N = \|f\|_F$. 
Therefore, we have an isomorphic and isometric 
identification of the two Hilbert spaces $F$ and $N_0$. 
Obviously, similar considerations hold for any 
hyperplane. In particular, we consider the 
hyperplanes $x_4 = t$ and the associated subspaces $N_t$. 
Let us introduce injection operators $j_t$ defined by 

$$j_t: F \to N, \qquad f \mapsto f \otimes \delta_t,$$

where $f$ is a generic element of $F$, with values $f(\mathbf{x})$, 
and $(f \otimes \delta_t)(x) = f(\mathbf{x})\,\delta(x_4 - t)$. It is immediate to 
verify the following properties for $j_t$ and its adjoint 
$j_t^*$: the range of $j_t$ is $N_t$; moreover, $j_t$ is an isometry, 
so that $j_t^* j_t = 1_F$, $j_t j_t^* = e_t$, where $1_F$ is the identity on 
$F$, and $e_t$ is the projection on $N_t$. Moreover, $e_t j_t = j_t$ 
and $j_t^* = j_t^* e_t$. 
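
As a purely numerical illustration (not part of the original argument), the basic formula behind the isometry, $\int_{-\infty}^{+\infty} \exp(ipx)/(p^2 + M^2)\,dp = (\pi/M)\exp(-Mx)$ for $x \ge 0$, $M > 0$, can be checked with a simple quadrature. The function names, the momentum cutoff, and the step count below are illustrative assumptions; the neglected tail limits the accuracy to roughly the tolerance used in the checks.

```python
import math

def magic_integral(x, M, cutoff=400.0, steps=400_000):
    """Trapezoid approximation of the integral of exp(i p x)/(p^2 + M^2) over the real line.

    The imaginary part vanishes by symmetry, so only twice the integral of
    cos(p x)/(p^2 + M^2) over [0, cutoff] is computed; the tail beyond
    `cutoff` is neglected (an assumption of this sketch).
    """
    h = cutoff / steps
    # End-point contributions of the trapezoid rule.
    total = 0.5 * (1.0 / M**2 + math.cos(cutoff * x) / (cutoff**2 + M**2))
    for k in range(1, steps):
        p = k * h
        total += math.cos(p * x) / (p * p + M * M)
    return 2.0 * h * total

def closed_form(x, M):
    """Right-hand side (pi / M) * exp(-M x), valid for x >= 0 and M > 0."""
    return math.pi / M * math.exp(-M * x)

if __name__ == "__main__":
    for x, M in [(1.0, 1.5), (0.5, 2.0)]:
        print(x, M, magic_integral(x, M), closed_form(x, M))
```

With $M = \omega(\mathbf{p})$ and $x = t$, the same formula produces the factor $\exp(-t\omega)/2\omega$ that links the scalar product on $N$ to the one on $F$.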

If we introduce translations $u(t)$ along the 
$x_4$-direction and the reflection $r_0$ with respect to 
$x_4 = 0$, then we also have the covariance property 
$u(t) j_s = j_{t+s}$, and the reflexivity property $r_0 j_0 = j_0$, 
$j_0^* r_0 = j_0^*$. The reflexivity property is very important. 
It tells us that $r_0$ leaves $N_0$ pointwise invariant, and 
it is an immediate consequence of the fact that 
$\delta(x_4) = \delta(-x_4)$. 

Therefore, if we start from $N$ we can obtain $F$, by 
taking the projection $j_\pi^*$ with respect to some 
hyperplane $\pi$, in particular $x_4 = 0$. It is also obvious 
that we can induce on $F$ a representation of $E(3)$ by 
taking those elements of $E(4)$ that leave $\pi$ invariant. 

Let us now see how we can define the Hamiltonian 
on F starting from the properties of N. Since we are 
considering the simple case of the one-particle system, 
we could just perform the following construction 
explicitly by hand, through a simple application of 
the basic magic formula given earlier. But we prefer 
to follow a route that emphasizes Markov property 
and can be immediately generalized to more 
complicated cases. 

Let us introduce the operator $p(t)$ on $F$ defined by 
the dilation $p(t) = j_0^* j_t = j_0^* u(t) j_0$, $t \ge 0$. Then we 
prove the following: 


Theorem 2 The operator $p(t)$ is bounded and self-adjoint. 
The family $\{p(t)\}$, for $t \ge 0$, is a norm-continuous 
semigroup. 


Proof Boundedness and continuity are obvious. 
Self-adjointness is a consequence of reflexivity. In fact, 

$$p^*(t) = j_0^* u(-t) j_0 = j_0^* r_0 u(t) r_0 j_0 = j_0^* u(t) j_0 = p(t).$$

The semigroup property is a consequence of the 
Markov property. In fact, let us introduce 
$N_+, N_0, N_-$ as subspaces of $N$ made by distributions 
with support in the regions $x_4 \ge 0$, $x_4 = 0$, $x_4 \le 0$, 
respectively, and call $e_+, e_0, e_-$ the respective projections. 
By the Markov property, we have $e_0 = e_- e_+$. Now 
write, for $s, t \ge 0$, 

$$p(t)p(s) = j_0^* u(t) j_0 j_0^* u(s) j_0 = j_0^* u(t) e_0 u(s) j_0.$$

If $e_0$ could be cancelled, then the semigroup 
property would follow from the group property of 
the translations, $u(t)u(s) = u(t+s)$ (a miracle of the 
dilations!). For this, consider the matrix element 

$$\langle f, p(t)p(s)g \rangle_F = \langle u(-t) j_0 f, e_0 u(s) j_0 g \rangle_N,$$

recall $e_0 = e_- e_+$, and use $u(s) j_0 g \in N_+$ and 
$u(-t) j_0 f \in N_-$. 


Let us call $h$ the generator of $p(t)$, so that 
$p(t) = \exp(-th)$, for $t \ge 0$. By definition, $h$ is the 
Hamiltonian of the physical system. A simple 
explicit calculation shows that $h$ is just the energy 
$\omega$ introduced earlier. Starting from the representation 
of the Euclidean group $E(3)$ already given and 
from the Hamiltonian, we immediately get a 
representation of the full Poincaré group on F. 
Therefore, all physical properties of the one-particle 
system have been reconstructed from its Euclidean 
image on the Hilbert space N. 

As a last remark of this section, let us note that we 
can consider the real Hilbert spaces $N_r$ and $F_r$, made of 
real elements (in configuration space) in $N$ and $F$. The 
operators $u(a, R)$, $u(t)$, $r_0$, $j_\pi$, $j_\pi^*$, $e_A$ are all reality preserving, 
that is, they map real spaces into real spaces. 

This completes our discussion about the one- 
particle system. For more details we refer to Guerra 
et al. (1975) and Simon (1974). We have introduced 
the Euclidean image, discussed its main properties, 
and shown how we can derive all properties of the 
physical system from its Euclidean image. In the 
next sections, we will show how this kind of 
construction carries through the second-quantized 
case and the interacting case. 




Second Quantization and Free Fields 


We begin this section with a short review about the 
procedure of second quantization based on prob- 
abilistic methods, by following mainly Nelson 
(1973b); see also Guerra et al. (1975) and Simon 
(1974). Probabilistic methods are particularly useful 
in the framework of the Euclidean theory. 

Let $H$ be a real Hilbert space with symmetric 
scalar product $\langle\,,\rangle$. Let $\phi(u)$ be the elements of a 
family of centered Gaussian random variables 
indexed by $u \in H$, uniquely defined by the expectation 
values $E(\phi(u)) = 0$, $E(\phi(u)\phi(v)) = \langle u, v \rangle$. Since $\phi$ 
is Gaussian, we also have 

$$E(\exp(\lambda\phi(u))) = \exp\bigl(\tfrac{1}{2}\lambda^2 \langle u, u \rangle\bigr)$$

and 

$$E(\phi(u_1)\phi(u_2)\cdots\phi(u_n)) = [u_1 u_2 \cdots u_n].$$

Here $[\ldots]$ is the Hafnian of elements 
$[u_i u_j] = \langle u_i, u_j \rangle$, defined to be zero for odd $n$, and 
for even $n$ given by the recursive formula 

$$[u_1 u_2 \cdots u_n] = \sum_{i=2}^{n} [u_1 u_i]\,[u_2 \cdots u_n]',$$

where in $[\ldots]'$ the terms $u_1$ and $u_i$ are suppressed. 
Hafnians, from the Latin name of Copenhagen, the 
first seat of the theoretical group of CERN, were 
introduced in quantum field theory by Caianiello 
(1973), as a useful tool when dealing with Bose 
statistics. 
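
The recursive formula for the Hafnian translates directly into a short program. The sketch below is only an illustration: the names `hafnian` and `dot` are hypothetical, the elements of $H$ are modelled as finite tuples with the ordinary dot product as $\langle\,,\rangle$, and the empty Hafnian is set to 1 as the base case of the recursion (a convention assumed here, not stated in the text).

```python
def hafnian(vectors, inner):
    """Hafnian [u_1 ... u_n], expanding on the first element as in the recursion.

    `vectors` is a list of elements of H and `inner` is the scalar product.
    Returns 0 for odd n; the empty Hafnian is taken to be 1 (assumed convention).
    """
    n = len(vectors)
    if n == 0:
        return 1.0
    if n % 2 == 1:
        return 0.0
    first, rest = vectors[0], vectors[1:]
    total = 0.0
    for i, u in enumerate(rest):
        # Suppress u_1 and the chosen partner before recursing.
        remaining = rest[:i] + rest[i + 1:]
        total += inner(first, u) * hafnian(remaining, inner)
    return total

def dot(u, v):
    """Ordinary dot product, standing in for the scalar product on H."""
    return sum(a * b for a, b in zip(u, v))

if __name__ == "__main__":
    u, w = (1.0, 0.0), (1.0, 1.0)
    print(hafnian([u] * 4, dot))        # fourth moment of a unit Gaussian
    print(hafnian([u, u, w, w], dot))   # <u,u><w,w> + 2 <u,w>^2
```

For $n$ copies of the same unit vector the result is $(n-1)!!$, the familiar even moment of a standard Gaussian variable.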

Let $(Q, \Sigma, \mu)$ be the underlying probability space 
where the $\phi$ are defined as random variables. Here $Q$ is 
a compact space, $\Sigma$ a $\sigma$-algebra of subsets of $Q$, and 
$\mu$ a regular, countably additive probability measure 
on $\Sigma$, normalized to $\mu(Q) = \int_Q d\mu = 1$. 

The fields $\phi(u)$ are represented by measurable 
functions on $Q$. The probability space is uniquely 
defined, up to trivial isomorphisms, if we assume 
that $\Sigma$ is the smallest $\sigma$-algebra with respect to 
which all fields $\phi(u)$, with $u \in H$, are measurable. 
Since the $\phi(u)$ are Gaussian, they are represented by 
$L^p(Q, \Sigma, \mu)$ functions, for any $p$ with $1 \le p < \infty$, 
and the expectations will be given by 

$$E(\phi(u_1)\phi(u_2)\cdots\phi(u_n)) = \int_Q \phi(u_1)\phi(u_2)\cdots\phi(u_n)\,d\mu,$$

where, by a mild abuse of notation, the $\phi(u_i)$ on the 
right-hand side denote the $Q$ space functions which 
represent the random variables $\phi(u_i)$. We call the 
complex Hilbert space $\mathcal{F} = \Gamma(H) = L^2(Q, \Sigma, \mu)$ the 
Fock space constructed on $H$, and the function $\Omega_0 \equiv 1$ 
on $Q$ the Fock vacuum. 

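In order to introduce the concept of second 
quantization of operators, we must introduce subspaces 
of $\mathcal{F}$ with a "fixed number of particles." Call 
$\mathcal{F}_{(0)} = \{\lambda\Omega_0\}$, where $\lambda$ is any complex number. 
Define $\mathcal{F}_{(\le n)}$ as the subspace of $\mathcal{F}$ generated by 
complex linear combinations of monomials of the 
type $\phi(u_1)\cdots\phi(u_j)$, with $u_i \in H$, and $j \le n$. Then 
$\mathcal{F}_{(\le n-1)}$ is a subspace of $\mathcal{F}_{(\le n)}$. We define $\mathcal{F}_{(n)}$, the 
$n$-particle subspace, as the orthogonal complement 
of $\mathcal{F}_{(\le n-1)}$ in $\mathcal{F}_{(\le n)}$, so that 

$$\mathcal{F}_{(\le n)} = \mathcal{F}_{(n)} \oplus \mathcal{F}_{(\le n-1)}.$$

By construction, the $\mathcal{F}_{(n)}$ are orthogonal, and it is 
not difficult to verify that 

$$\mathcal{F} = \bigoplus_{n=0}^{\infty} \mathcal{F}_{(n)}.$$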


Let us now introduce the Wick normal products 
by the definition 

$$:\phi(u_1)\phi(u_2)\cdots\phi(u_n): \;=\; E_{(n)}\bigl(\phi(u_1)\phi(u_2)\cdots\phi(u_n)\bigr),$$

where $E_{(n)}$ is the projection on $\mathcal{F}_{(n)}$. It is not difficult 
to prove the usual Wick theorem (see, e.g., Guerra 
et al. (1975)), and its inversion given by Caianiello 
(1973). 
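
For a single Gaussian variable $\phi$ of variance $c$ (a one-dimensional $H$), the projection defining the Wick power $:\phi^n:$ reduces to Gram-Schmidt orthogonalization of the monomials, which yields the Hermite polynomials with variance $c$. The following sketch, with hypothetical helper names, builds them through the standard three-term recurrence $W_{k+1} = \phi W_k - k c W_{k-1}$ and checks the orthogonality relation $E(:\phi^n::\phi^m:) = n!\,c^n\,\delta_{nm}$ using exact Gaussian moments. It is an illustration under these assumptions, not the general construction.

```python
def wick_power(n, c):
    """Coefficients (lowest degree first) of :phi^n: for one Gaussian of variance c."""
    prev, cur = [0], [1]  # W_{-1} = 0 (placeholder) and W_0 = 1
    for k in range(n):
        shifted = [0] + cur  # phi * W_k
        lower = [k * c * a for a in prev] + [0] * (len(shifted) - len(prev))
        prev, cur = cur, [s - l for s, l in zip(shifted, lower)]
    return cur

def gaussian_moment(k, c):
    """E(phi^k) for a centered Gaussian of variance c: (k-1)!! c^(k/2), zero for odd k."""
    if k % 2 == 1:
        return 0
    result = 1
    for j in range(1, k, 2):
        result *= j * c
    return result

def expect_product(a, b, c):
    """E(a(phi) b(phi)) for two polynomials given by coefficient lists."""
    return sum(x * y * gaussian_moment(i + j, c)
               for i, x in enumerate(a) for j, y in enumerate(b))

if __name__ == "__main__":
    c = 2
    print(wick_power(4, c))  # phi^4 - 6 c phi^2 + 3 c^2
    print(expect_product(wick_power(3, c), wick_power(3, c), c))  # 3! c^3
```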

It is interesting to remark that, in the framework 
of the second quantization performed with prob- 
abilistic methods, it is not necessary to introduce 
creation and destruction operators as in the usual 
treatment. However, the two procedures are com- 
pletely equivalent, as shown, for example, in Simon 
(1974). 

Given an operator $A$ from the real Hilbert space 
$H_1$ to the real Hilbert space $H_2$, we define its 
second-quantized operator $\Gamma(A)$ through the following 
definitions: 

$$\Gamma(A)\,\Omega_{01} = \Omega_{02},$$
$$\Gamma(A)\,:\phi_1(u_1)\phi_1(u_2)\cdots\phi_1(u_n): \;=\; :\phi_2(Au_1)\phi_2(Au_2)\cdots\phi_2(Au_n):,$$

where we have introduced the probability spaces $Q_1$ 
and $Q_2$, their vacua $\Omega_{01}$ and $\Omega_{02}$, and the random 
variables $\phi_1$ and $\phi_2$, associated to $H_1$ and $H_2$, 
respectively. The following remarkable theorem by 
Nelson (1973b) gives a full characterization of $\Gamma(A)$, 
very useful in the applications. 


Theorem 3 Let $A$ be a contraction from the real 
Hilbert space $H_1$ to the real Hilbert space $H_2$. Then 
$\Gamma(A)$ is an operator from $L^1_{(1)}$ to $L^1_{(2)}$ which is 
positivity preserving, $\Gamma(A)u \ge 0$ if $u \ge 0$, and such 
that $E(\Gamma(A)u) = E(u)$. Moreover, $\Gamma(A)$ is a contraction 
from $L^p_{(1)}$ to $L^p_{(2)}$ for any $p$, $1 \le p < \infty$. Finally, 
$\Gamma(A)$ is also a contraction from $L^p_{(1)}$ to $L^q_{(2)}$, with 
$q \ge p$, if $\|A\|^2 \le (p - 1)/(q - 1)$. 


We have indicated with $L^p_{(1)}$, $L^p_{(2)}$ the $L^p$ spaces 
associated to $H_1$ and $H_2$, respectively. This is the 
celebrated best hypercontractive estimate given by 
Nelson. For the proof, we refer to the original paper 
of Nelson (1973b); see also Simon (1974). 

This completes our short review on the theory of 
second quantization based on probabilistic methods. 

The usual time-zero quantum field $\tilde\phi(u)$, $u \in F_r$, in 
the Fock representation, can be obtained through 
second quantization starting from $F_r$. We call 
$(\tilde Q, \tilde\Sigma, \tilde\mu)$ the underlying probability space, and 
$\mathcal{F} = \Gamma(F_r) = L^2(\tilde Q, \tilde\Sigma, \tilde\mu)$ the Hilbert Fock space of 
the free physical particles. 

Now we introduce the free Markov field $\phi(f)$, $f \in N_r$, 
by taking $N_r$ as the starting point. We call 
$(Q, \Sigma, \mu)$ the associated probability space. We 
introduce the Hilbert space $\mathcal{N} = \Gamma(N_r) = L^2(Q, \Sigma, \mu)$, 
and the operators $U(a, R) = \Gamma(u(a, R))$, 
$R_0 = \Gamma(r_0)$, $U(t) = \Gamma(u(t))$, $E_A = \Gamma(e_A)$, and so on, for 
which the previous Nelson theorem holds (take 
$H_1 = H_2 = N_r$). 

Since in general $\Gamma(AB) = \Gamma(A)\Gamma(B)$, we have 
immediately the following expression of the Markov 
property, $E_\sigma = E_A E_B$, where the closed regions 
$A, B, \sigma$ of the Euclidean space have the same 
properties as explained earlier in the proof of the 
(pre)Markov property for one-particle systems. 

It is obvious that $E_A$ can also be understood as the 
conditional expectation with respect to the sub-$\sigma$-algebra 
$\Sigma_A$ generated by the fields $\phi(f)$ with $f \in N_r$ 
and the support of $f$ on $A$. 

The relations, previously pointed out, between the $N_t$ 
subspaces and $F$ are also valid for their real parts 
$N_{r,t}$ and $F_r$. Therefore, they carry through the 
second quantization procedure. We introduce 
$J_t = \Gamma(j_t)$ and $J_t^* = \Gamma(j_t^*)$; then the following properties 
hold. $J_t$ is an isometric injection of $L^p(\tilde Q, \tilde\Sigma, \tilde\mu)$ 
into $L^p(Q, \Sigma, \mu)$; the range of $J_t$ as an operator $L^2 \to L^2$ 
is obviously $\mathcal{N}_t = \Gamma(N_{r,t})$; moreover, $J_t J_t^* = E_t$. The free 
Hamiltonian $H_0$ is given for $t \ge 0$ by 

$$J_0^* J_t = \exp(-tH_0) = \Gamma(\exp(-t\omega)).$$

Moreover, we have the covariance property 
$U(t) J_0 = J_t$, and the reflexivity $R_0 J_0 = J_0$, $J_0^* R_0 = J_0^*$. 

These relations allow a very simple expression for 
the matrix elements of the Hamiltonian semigroup 
in terms of Markov quantities. In fact, for $u, v \in \mathcal{F}$ 
we have 

$$\langle u, \exp(-tH_0)v \rangle = \int_Q (J_t u)^*\,J_0 v\,d\mu.$$


In the next section, we will generalize this 
representation to the interacting case. 

Finally, let us derive the hypercontractive property 
of the free Hamiltonian semigroup. 

Since $\|\exp(-t\omega)\| \le \exp(-tm)$, where $m$ is the 
mass of the particle, we have immediately, by a 
simple application of the Nelson theorem, 

$$\|\exp(-tH_0)\|_{p,q} \le 1,$$

provided $q - 1 \le (p - 1)\exp(2tm)$, where $\|\cdot\|_{p,q}$ 
denotes the norm of an operator from $L^p$ to $L^q$ 
spaces. 
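
The hypercontractivity condition $q - 1 \le (p - 1)\exp(2tm)$ can be solved for the smallest time at which the free semigroup maps $L^p$ contractively into $L^q$. The helper names below are hypothetical, and the sketch assumes $1 < p \le q$ so that the logarithm is defined.

```python
import math

def min_contraction_time(p, q, m):
    """Smallest t >= 0 satisfying q - 1 <= (p - 1) * exp(2 * t * m).

    Assumes 1 < p <= q and mass m > 0 (restrictions of this sketch).
    """
    if not (1 < p <= q) or m <= 0:
        raise ValueError("need 1 < p <= q and m > 0")
    return math.log((q - 1) / (p - 1)) / (2 * m)

def is_contraction(p, q, m, t):
    """True if the sufficient condition of the hypercontractive estimate holds at time t."""
    return q - 1 <= (p - 1) * math.exp(2 * t * m)

if __name__ == "__main__":
    # From L^2 to L^4 with unit mass the threshold is ln(3)/2.
    print(min_contraction_time(2, 4, 1.0))
```

The threshold grows only logarithmically in $q$, and scales inversely with the mass $m$.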


Interacting Fields 


The discussion of the previous sections was limited 
to free fields both in Minkowski and Euclidean 
spaces. Now we must introduce interaction in order 
to get nontrivial theories. 
First, as a general motivation, we will proceed 
quite formally and then we will resort to precise 
statements. 

Let us recall that in standard quantum field theory, 
for scalar self-coupled fields, the time-ordered products 
of quantum fields in Minkowski spacetime can 
be expressed formally through the formula 

$$\frac{\langle T(\phi(x_1)\cdots\phi(x_n)\exp(i\int \mathcal{L}\,dx)) \rangle}{\langle T\exp(i\int \mathcal{L}\,dx) \rangle},$$

where $T$ denotes time ordering, $\phi$ are free fields in 
Minkowski spacetime, $\mathcal{L}$ is the interaction Lagrangian, 
and $\langle\ldots\rangle$ are vacuum averages. As is well known, this 
expression can be put, for example, at the basis of 
perturbative expansions, giving rise to terms expressed 
through Feynman graphs. The appropriately chosen 
normalization provides automatic cancelation of the 
vacuum to vacuum graphs. 

Now we can introduce a formal analytic continuation 
to the Schwinger points, as previously done for 
the one-particle system, and obtain the following 
expression for the analytic continuation of the field 
time-ordered products, now called Schwinger 
functions, 

$$S(x_1, \ldots, x_n) = \frac{\langle \phi(x_1)\cdots\phi(x_n)\exp U \rangle}{\langle \exp U \rangle}.$$

Here $x_1, \ldots, x_n$ denote points in Euclidean space, and $\phi$ 
are the Euclidean fields introduced earlier. The 
chronological time ordering disappears, because the 
fields $\phi$ are commutative, and there is no distinguished 
"time" direction in Euclidean space. Here 
the symbol $\langle\ldots\rangle$ denotes the expectation values 
represented by $\int \ldots d\mu$, as explained earlier, and $U$ 
is the Euclidean "action" of the system formally 
given by the integral on Euclidean space 

$$U = -\int P(\phi(x))\,dx,$$

if the field self-interaction is produced by the 
polynomial $P$. 

Therefore, these formal considerations suggest 
that the passage from the free Euclidean theory to 
the fully interacting one is obtained through a 
change of the free probability measure $d\mu$ to the 
interacting measure 

$$\frac{\exp U\,d\mu}{\int_Q \exp U\,d\mu}.$$

The analogy with classical statistical mechanics is 
evident. The expression $\exp U$ acts as Boltzmannfaktor, 
and $Z = \int_Q \exp U\,d\mu$ is the partition function. 

Our task will be to make these statements precise 
from a mathematical point of view. We will be 


obliged to introduce cutoffs, and then be involved in 
their careful removal. 

For the sake of convenience, we make the 
substantial simplification of considering only two- 
dimensional theories (one space, one time dimension 
in the Minkowski region) for which the well-known 
ultraviolet problem of quantum field theory gives no 
trouble. There is no difficulty in translating the 
contents of the previous sections to the two- 
dimensional case. 

Let $P$ be a real polynomial, bounded below and 
normalized to $P(0) = 0$. We introduce approximations 
$h$ to the Dirac $\delta$ function at the origin of the 
two-dimensional Euclidean space $\mathbb{R}^2$, with $h \in N_r$. 
Let $h_x$ be the translate of $h$ by $x$, with $x \in \mathbb{R}^2$. The 
introduction of $h$, equivalent to some ultraviolet 
cutoff, is necessary, because local fields, of the 
formal type $\phi(x)$, have no rigorous meaning, and 
some smearing is necessary. 

For some compact region $\Lambda$ in $\mathbb{R}^2$, acting as space 
cutoff (infrared cutoff), introduce the $Q$ space 
function 

$$U_\Lambda^{(h)} = -\int_\Lambda :P(\phi(h_x)):\,dx,$$

where $dx$ is the Lebesgue measure in $\mathbb{R}^2$. It is 
immediate to verify that $U_\Lambda^{(h)}$ is well defined, 
bounded above and belongs to $L^p(Q, \Sigma, \mu)$, for any 
$p$, $1 \le p < \infty$. This is the infrared and ultraviolet 
cutoff action. Notice the presence of the Wick 
normal products in its definition. They provide a 
kind of automatic introduction of counterterms, in 
the framework of renormalization theory. 

The following theorem allows us to remove the 
ultraviolet cutoff. 


Theorem 4 Let $h \to \delta$, in the sense that the Fourier 
transforms $\tilde h$ are uniformly bounded and converge 
pointwise in momentum space to the Fourier transform 
of the $\delta$-function, given by $(2\pi)^{-2}$. Then $U_\Lambda^{(h)}$ is 
$L^p$-convergent for any $p$, $1 \le p < \infty$, as $h \to \delta$. Call 
$U_\Lambda$ the $L^p$-limit; then $U_\Lambda, \exp U_\Lambda \in L^p(Q, \Sigma_\Lambda, \mu)$, for 
$1 \le p < \infty$. 


The proof uses standard methods of probability 
theory, and originates from the pioneering work of 
Nelson (1966). It can be found, for example, in 
Guerra et al. (1975), and Simon (1974). 

Since $U_\Lambda$ is defined with normal products, and 
the interaction polynomial $P$ is normalized to 
$P(0) = 0$, an elementary application of the Jensen 
inequality gives 

$$\int_Q \exp U_\Lambda\,d\mu \ge \exp \int_Q U_\Lambda\,d\mu = 1.$$

Therefore, we can rigorously define the new space 
cutoff measure in $Q$ space: 

$$d\mu_\Lambda = \frac{\exp U_\Lambda\,d\mu}{\int_Q \exp U_\Lambda\,d\mu}.$$

The space-cutoff interacting Euclidean theory is 
defined by the same fields on $Q$ space, but with a 
change in the measure and, therefore, in the 
expectation values. The correlations for the interacting 
fields $\phi$ are the cutoff Schwinger functions 

$$S_\Lambda(x_1, \ldots, x_n) = \langle \phi(x_1)\cdots\phi(x_n) \rangle_\Lambda = Z_\Lambda^{-1} \langle \phi(x_1)\cdots\phi(x_n)\exp U_\Lambda \rangle,$$

where the partition function is 

$$Z_\Lambda = \langle \exp U_\Lambda \rangle.$$


We see that the analogy with statistical mechanics 
is complete here. Of course, the introduction of the 
space cutoff A destroys translation invariance. The 
full Euclidean covariant theory must be recovered by 
taking the infinite-volume limit $\Lambda \to \mathbb{R}^2$ on field 
correlations. For the removal of the space cutoff, all 
methods of statistical mechanics are available. In 
particular, correlation inequalities of ferromagnetic 
type can be easily exploited, as shown, for example, 
in Guerra et al. (1975) and Simon (1974). 
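
The change of measure and the Jensen bound on the partition function can be illustrated on a toy model. The sketch below replaces $Q$ by a finite set of points with given probabilities and the action by a list of real values with zero mean under the free measure (mimicking the vanishing expectation of the Wick-ordered action). All names and numbers are illustrative assumptions and have nothing to do with an actual field theory computation.

```python
import math

def interacting_measure(weights, action):
    """Return (Z, new_weights) for d(mu_int) = exp(U) d(mu) / Z on a finite space."""
    boltzmann = [w * math.exp(u) for w, u in zip(weights, action)]
    Z = sum(boltzmann)  # toy partition function
    return Z, [b / Z for b in boltzmann]

def expectation(weights, values):
    """Expectation of `values` under the probability `weights`."""
    return sum(w * v for w, v in zip(weights, values))

if __name__ == "__main__":
    free = [0.25, 0.5, 0.25]   # free probability measure
    U = [-1.0, 0.5, 0.0]       # toy action with zero free expectation
    Z, interacting = interacting_measure(free, U)
    print("E_free(U) =", expectation(free, U))
    print("Z =", Z, ">= exp(E_free(U)) =", math.exp(expectation(free, U)))
    print("interacting weights:", interacting)
```

Points with larger action gain weight under the interacting measure, exactly as configurations of lower energy do under a Boltzmann factor.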

We would like to conclude this section by giving 
the connection between the space-cutoff Euclidean 
theory and the space-cutoff Hamiltonian theory in 
the physical Fock space. 

For $\ell \ge 0$, $t \ge 0$, consider the rectangle in $\mathbb{R}^2$, 

$$\Lambda(\ell, t) = \{(x_1, x_2) : -\ell/2 \le x_1 \le \ell/2,\; 0 \le x_2 \le t\},$$

and define the operator in the physical Fock space 

$$P_\ell(t) = J_0^* \exp U_{\Lambda(\ell, t)}\,J_t,$$

where $J_0$ and $J_t$ are injections relative to the lines 
$x_2 = 0$ and $x_2 = t$, respectively. Then the following 
theorem, largely due to Nelson, holds. 


Theorem 5 The operator $P_\ell(t)$ is bounded and self-adjoint. 
The family $\{P_\ell(t)\}$, for $\ell$ fixed and $t \ge 0$, 
is a strongly continuous semigroup. Let $H_\ell$ be 
its lower bounded self-adjoint generator, so that 
$P_\ell(t) = \exp(-tH_\ell)$. On the physical Fock space, there 
is a core $D$ for $H_\ell$ such that on $D$ the equality 
$H_\ell = H_0 + V_\ell$ holds, where $H_0$ is the free Hamiltonian 
introduced earlier and $V_\ell$ is the volume-cutoff 
interaction given by 

$$V_\ell = \lim_{h \to \delta} \int_{-\ell/2}^{\ell/2} :P(\tilde\phi(h_{x_1})):\,dx_1,$$

where $h_{x_1}$ are the translates of approximations to 
the $\delta$-function at the origin on the $x_1$-space, and the 
limit is taken in $L^2$, in analogy to what has been 
explained for the two-dimensional case in the 
definition of $U_\Lambda$. 


While we refer to Guerra et al. (1975) and Simon 
(1974) for a full proof, we mention here that 
boundedness is related to hypercontractivity of the 
free Hamiltonian, self-adjointness is a consequence 
of reflexivity, and the semigroup property follows 
from Markov property. This theorem is remark- 
able, because it expresses the cutoff interacting 
Hamiltonian semigroup in an explicit form in the 
Euclidean theory through probabilistic expectations. 
In fact, we have 


$$\langle u, \exp(-tH_\ell)v \rangle = \int_Q (J_t u)^*\,J_0 v\,\exp U_{\Lambda(\ell, t)}\,d\mu.$$


We call this expression the Feynman-Kac-Nelson 
formula; in fact, it is nothing but a path 
integral expressed in stochastic terms, and adapted to 
the Hamiltonian semigroup. 

By comparison with the analogous formula given 
for the free Hamiltonian semigroup, we see that 
the introduction of the interaction inserts the 
Boltzmannfaktor under the integral. 

As an immediate consequence of the Feynman- 
Kac-Nelson formula, together with Euclidean cov- 
ariance, we have the following astonishing Nelson 
symmetry: 


$$\langle \Omega_0, \exp(-tH_\ell)\Omega_0 \rangle = \langle \Omega_0, \exp(-\ell H_t)\Omega_0 \rangle,$$


which was at the basis of Guerra (1972) and Guerra 
et al. (1972), and played some role in showing the 
effectiveness of Euclidean methods in constructive 
quantum field theory. 

It is easy to establish, through simple probabilistic 
reasoning, that $H_\ell$ has a unique ground state $\Omega_\ell$ of 
lowest energy $E_\ell$. For a convenient choice of 
normalization and phase factor, one has $\|\Omega_\ell\|_2 = 1$, 
and $\Omega_\ell > 0$ almost everywhere on $Q$ space (for 
bosonic systems, ground states have no nodes in 
configuration space!). Moreover, $\Omega_\ell \in L^p$, for any 
$1 \le p < \infty$. If $\ell > 0$ and the interaction is not 
trivial, then $\Omega_\ell \ne \Omega_0$, $E_\ell < 0$, and $\|\Omega_\ell\|_1 < 1$. 
Obviously, $\|\exp(-tH_\ell)\|_{2,2} = \exp(-tE_\ell)$. 

The general structure of Euclidean field theory, as 
explained in this section, has been at the basis of all 
applications in constructive quantum field theory. 
These applications include the proof of the existence 
of the infinite-volume limit, with the establishment of 
all Wightman axioms, for two- and three-dimensional 
theories. Moreover, the existence of phase transitions 
and symmetry breaking has been firmly established. 
Extensions have also been given to theories involving 
Fermions, and to gauge field theory. Due to the scope 
of this review, limited to a description of the general 
structure of Euclidean field theory, we cannot give 
a detailed treatment of these applications. Therefore, 
we refer to recent general reviews on constructive 
quantum field theory for a complete description of 
all results (see, e.g., Jaffe (2000)). For recent applica- 
tions of Euclidean field theory to quantum fields on 
curved spacetime manifolds we refer, for example, to 
Schlingemann (1999). 


The Physical Interpretation of Euclidean 
Field Theory 


Euclidean field theory has been considered by most 
researchers as a very useful tool for the study of 
quantum field theory. In particular, it is quite easy, 
for example, to obtain the fully interacting Schwin- 
ger functions in the infinite-volume limit in two- 
dimensional spacetime. At this point, there arises 
the problem of connecting these Schwinger func- 
tions with observable physical quantities in Min- 
kowski spacetime. A very deep result of 
Osterwalder and Schrader (1973) gives a very 
natural interpretation of the resulting limiting 
theory. In fact, the Euclidean theory, as has been 
shown earlier, arises from an analytic continuation 
from the physical Minkowski spacetime to the 
Schwinger points, through a kind of analytic 
continuation in time (also called Wick rotation, 
because Wick exploited this trick in the study of 
the Bethe-Salpeter equation). Therefore, having 
obtained the Schwinger functions for the full 
covariant theory, after all cutoff removal, it is 
very natural to try to reproduce the inverse analytic 
continuation in order to recover the Wightman 
functions in Minkowski spacetime. Therefore, 
Osterwalder and Schrader have been able to 
identify a set of conditions, quite easy to verify, 
which allow us to recover Wightman functions from 
Schwinger functions. A key role in this reconstruc- 
tion theorem is played by the so-called reflection 
positivity for Schwinger functions, a property quite 
easy to verify. In this way, a fully satisfactory 
solution for the physical interpretation of Euclidean 
field theory is achieved. 

From a historical point of view, an alternate route 
is possible. In fact, at the beginning of the exploita- 
tion of Euclidean methods in constructive quantum 
field theory, Nelson was able to isolate a set of 
axioms for the Euclidean fields (Nelson 1973a), 
allowing the reconstruction of the physical theory. 
Of course, Nelson's axioms are more difficult to 


verify, since they also involve properties of the 
Euclidean fields and not only of the Schwinger 
functions. However, it is still very interesting to 
investigate whether the Euclidean fields play only an 
auxiliary role in the construction of the physical 
content of relativistic theories, or if they have a 
more fundamental meaning. 

From a physical point of view, the following 
considerations could also lead to further developments 
along this line. By its very structure, the Euclidean 
theory contains the fixed-time quantum correlations in 
the vacuum. In elementary quantum mechanics, it is 
possible to derive all physical content of the theory 
from the simple knowledge of the ground state wave 
function, including scattering data. Therefore, at least 
in principle, it should be possible to derive all physical 
content of the theory directly from the Euclidean 
theory, without any analytic continuation. 

We conclude this short section on the physical 
interpretation of the Euclidean theory with a mention 
of a quite surprising result (Guerra and Ruggiero 
1973) obtained by submitting classical field theory 
to the procedure of stochastic quantization in the sense 
of Nelson (1985). The procedure of stochastic 
quantization associates a stochastic process to each 
quantum state. In this case, in a fixed reference frame, 
the procedure of stochastic quantization, applied to 
interacting fields, produces, for the ground state, a 
process in the physical spacetime that has the same 
correlations as Euclidean field theory. This opens the 
way to a possible interpretation of Euclidean field 
theory directly in Minkowski spacetime. However, a 
consistent development along this line requires a new 
formulation of representations of the Poincaré group 
in the form of measure-preserving transformations in 
the probability space where the Euclidean fields are 
defined. This difficult task has not been accomplished 
as yet. 
Introduction 


In this article we present the semigroup approach 
to linear and nonlinear evolution equations in 
general Banach spaces. In the first part we 
introduce the general frame and we explain the 
cornerstones of the widely developed theory of 
linear evolution equations. Besides the classical 
approach to linear evolution equations based on 
$C_0$-semigroups, we also give a brief introduction to 
the more recent theory of maximal regularity. The 
entire linear theory is not only important in its 
own right (as we illustrate by discussing applications to 
the heat equation, Schrödinger equation, wave 
equation, and Maxwell equations) but is also 
the indispensable basis for the theory of nonlinear 
evolution equations, which we present in the second 
part. 


Linear Evolution Equations 


Let $E_0$ be a Banach space, $T > 0$, and assume that 
$\mathcal{A} := \{A(t);\ t \in [0,T]\}$ is a family of closed linear 
operators in $E_0$. By this we mean that, given $t \in [0,T]$, 
there is a linear subspace $D(A(t))$ of $E_0$ and a 
linear mapping $A(t): D(A(t)) \subset E_0 \to E_0$ such that the 
graph $\{(x, A(t)x);\ x \in D(A(t))\}$ of $A(t)$ is a closed 
subspace of $E_0 \times E_0$. Given a mapping $f: [0,T] \to E_0$ 
and a vector $u_0 \in E_0$, we study the following initial-value 
problem for $(\mathcal{A}, f, u_0)$: find a function 
$u \in C^1((0,T], E_0)$ such that $u(t) \in D(A(t))$ for 
$t \in (0,T]$ and 

$$u'(t) = A(t)u(t) + f(t), \quad t \in (0,T], \qquad u(0) = u_0 \qquad [1]$$

Sometimes we also call [1] the Cauchy problem of 
the linear evolution equation $u'(t) = A(t)u(t) + f(t)$. 
In the following, we will specify different conditions 
on $(\mathcal{A}, f, u_0)$ which guarantee the well-posedness of 
[1], and we shall discuss several examples of 
equations of type [1] which are relevant in 
mathematical physics. 


Autonomous Homogeneous Equations 


As in the case of ordinary differential equations in 
finite-dimensional spaces, it is convenient to consider 
first the autonomous version of [1], that is, we 
assume that $\mathcal{A}$ is trivial in the sense that $T = \infty$ and 
that $A(0) = A(t)$ for all $t \ge 0$. In order to simplify 
our notation, we set $A := A(0)$. We consider first the 
homogeneous problem 

$$u'(t) = Au(t), \quad t \in (0,\infty), \qquad u(0) = u_0 \qquad [2]$$

where $u_0 \in E_0$ is given. The question of the 
well-posedness of [2] is closely tied to the notion of a 
$C_0$-semigroup in $E_0$. Let $\mathcal{L}(E_0)$ denote the Banach 
space of all bounded linear operators on $E_0$, 
endowed with the usual operator norm. A one-parameter 
family $\mathcal{T} = \{T(t) \in \mathcal{L}(E_0);\ t \ge 0\}$ is called a 
"$C_0$-semigroup" in $\mathcal{L}(E_0)$ iff 


1. $T(0) = \mathrm{id}_{E_0}$ (normalization), 

2. $T(s+t) = T(s)T(t)$ for all $s, t \ge 0$ (semigroup 
property), and 

3. $\lim_{t \to 0} T(t)x = x$ for all $x \in E_0$ (strong continuity 
at 0). 


Given a $C_0$-semigroup $\mathcal{T}$, we define its (infinitesimal) 
generator $B$ by setting 

$$\mathrm{dom}(B) := \Big\{ x \in E_0;\ \lim_{t \to 0} \frac{T(t)x - x}{t} \text{ exists in } E_0 \Big\}$$

and by defining 

$$Bx := \lim_{t \to 0} \frac{T(t)x - x}{t} \quad \text{for } x \in \mathrm{dom}(B)$$

This clearly defines a linear operator in $E_0$ and it is 
well known that $B$ is closed and densely defined. 
Moreover, we have 


Theorem 1 Assume that $A: D(A) \subset E_0 \to E_0$ is the 
generator of a $C_0$-semigroup $\{T(t);\ t \ge 0\}$. Then, given 
$u_0 \in D(A)$, problem [2] possesses a unique solution $u$ 
in $C^1([0,\infty), E_0)$, which is given by $u(t) = T(t)u_0$. 
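
In finite dimensions every matrix generates a semigroup, so the definitions above can be checked by hand. The following is a minimal sketch under that assumption: the matrix `A`, the series-based `semigroup` and the helper names are illustrative choices, not part of the theory itself. It computes $T(t) = \exp(tA)$ on $E_0 = \mathbb{R}^2$ and exposes the difference quotient that defines the generator.

```python
import math

A = [[-1.0, 2.0], [0.0, -3.0]]  # illustrative bounded generator on E_0 = R^2


def mat_mul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_vec(a, x):
    """Apply a 2x2 matrix to a vector of length 2."""
    return [a[i][0] * x[0] + a[i][1] * x[1] for i in range(2)]


def semigroup(a, t, terms=60):
    """T(t) = exp(tA), summed from the power series sum_k (tA)^k / k!."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        scaled = [[t * a[i][j] / k for j in range(2)] for i in range(2)]
        term = mat_mul(term, scaled)  # term now equals (tA)^k / k!
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result


def difference_quotient(a, x, h=1e-6):
    """(T(h)x - x) / h, which approximates the generator applied to x."""
    th_x = mat_vec(semigroup(a, h), x)
    return [(th_x[i] - x[i]) / h for i in range(2)]


def max_entry_gap(p, q):
    """Largest absolute entrywise difference between two 2x2 matrices."""
    return max(abs(p[i][j] - q[i][j]) for i in range(2) for j in range(2))
```

Comparing `semigroup(A, 1.0)` with `mat_mul(semigroup(A, 0.3), semigroup(A, 0.7))` tests the semigroup property, and `difference_quotient(A, x)` should be close to `mat_vec(A, x)`. For unbounded generators the power series is not available, which is why the abstract theory is needed.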


Under suitable additional assumptions it can be 
shown that the converse of Theorem 1 also holds 
true. We shall not go into these details, however, 
and prefer to present the following characterization 
of generators of $C_0$-semigroups: 


Theorem 2 (Hille-Yosida). The operator $A: D(A) 
\subset E_0 \to E_0$ generates a $C_0$-semigroup iff it is closed, 
densely defined, and there exist $\omega, M \in \mathbb{R}$ such that 
the resolvent set $\rho(A)$ of $A$ contains the ray $(\omega, \infty)$ and 
such that $\|(\lambda - \omega)^n (\lambda - A)^{-n}\| \le M$ for all $\lambda > \omega$ and 
all $n \in \mathbb{N}$. 


In applications, it is in general rather difficult to 
derive a uniform estimate of powers of the resolvent 
of an unbounded operator. Luckily, generators of 
$C_0$-semigroups of contractions (i.e., $\|T(t)\|_{\mathcal{L}(E_0)} \le 1$ 
for all $t \ge 0$) can be characterized in a rather useful 
way. To formulate this result we call an operator 
$B: D(B) \subset E_0 \to E_0$ "dissipative" iff for any $x \in 
D(B)$ there is an $x' \in E_0'$ with $\langle x', x\rangle = \|x\|_{E_0}^2 = \|x'\|_{E_0'}^2$ 
such that $\mathrm{Re}\langle x', Bx\rangle \le 0$. Here $\langle\cdot,\cdot\rangle$ denotes the 
duality pairing between $E_0'$ and $E_0$. The operator $B$ 
is called "m-dissipative" if it is dissipative and 
$\mathrm{im}(\lambda_0 - B) = E_0$ for some $\lambda_0 > 0$. 


Theorem 3 (Lumer-Phillips). Let $A: D(A) \subset E_0 \to 
E_0$ be a closed and densely defined operator. Then 
$A$ generates a $C_0$-semigroup of contractions in $\mathcal{L}(E_0)$ iff 
$A$ is m-dissipative. 
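
To see what m-dissipativity buys, consider the Hilbert space $\mathbb{R}^2$, where dissipativity reads $(x|Ax) \le 0$. The sketch below assumes a hypothetical $2 \times 2$ matrix `A` and checks three consequences on random samples: the sign of the pairing, the resolvent bound $\|(\lambda - A)^{-1}\| \le 1/\lambda$, and the contraction property of the backward Euler products $(I - hA)^{-n}$, which approximate $T(t)$.

```python
import math
import random

A = [[-1.0, 2.0], [-2.0, -0.5]]  # illustrative matrix with (x|Ax) = -x1^2 - 0.5*x2^2


def inner(x, y):
    return x[0] * y[0] + x[1] * y[1]


def norm(x):
    return math.sqrt(inner(x, x))


def apply(a, x):
    return [a[i][0] * x[0] + a[i][1] * x[1] for i in range(2)]


def resolvent(a, lam, x):
    """Solve (lam - a) y = x by Cramer's rule; solvability for lam > 0 is the range condition."""
    m00, m01 = lam - a[0][0], -a[0][1]
    m10, m11 = -a[1][0], lam - a[1][1]
    det = m00 * m11 - m01 * m10
    return [(m11 * x[0] - m01 * x[1]) / det, (m00 * x[1] - m10 * x[0]) / det]


def backward_euler(a, t, x, steps=200):
    """Approximate T(t)x by (I - (t/steps) a)^(-steps) x, a product of contractions."""
    h = t / steps
    y = list(x)
    for _ in range(steps):
        # (I - h a)^(-1) = (1/h) * (1/h - a)^(-1)
        y = [c / h for c in resolvent(a, 1.0 / h, y)]
    return y


def worst_case(samples=200, seed=0):
    """Largest (x|Ax), lam*|R(lam)x|/|x| and |T(1)x|/|x| seen over random samples."""
    rng = random.Random(seed)
    worst_pairing, worst_resolvent, worst_growth = -math.inf, 0.0, 0.0
    for _ in range(samples):
        x = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
        lam = rng.uniform(0.1, 10.0)
        worst_pairing = max(worst_pairing, inner(x, apply(A, x)))
        worst_resolvent = max(worst_resolvent, lam * norm(resolvent(A, lam, x)) / norm(x))
        worst_growth = max(worst_growth, norm(backward_euler(A, 1.0, x)) / norm(x))
    return worst_pairing, worst_resolvent, worst_growth
```

All three values returned by `worst_case()` should stay below $0$, $1$ and $1$ respectively. The skew-symmetric part of `A` drops out of the pairing, which is why only the diagonal entries decide dissipativity here.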


Before we discuss examples of $C_0$-semigroups 
and their infinitesimal generators, let us introduce the 
following definition: given $\alpha \in (0, \pi]$, let 
$\Sigma_\alpha := \{z \in \mathbb{C};\ |\arg(z)| < \alpha\}$ denote the sector in $\mathbb{C}$ 
of angle $2\alpha$. A family of operators 
$\mathcal{T} = \{T(z) \in \mathcal{L}(E_0);\ z \in \Sigma_\alpha\}$ is 
called a "holomorphic $C_0$-semigroup" in $\mathcal{L}(E_0)$ iff 


1. $[z \mapsto T(z)]: \Sigma_\alpha \to \mathcal{L}(E_0)$ is holomorphic, 
2. $T(0) = \mathrm{id}_{E_0}$, and $\lim_{z \to 0} T(z)x = x$ for all $x \in E_0$, and 
3. $T(w+z) = T(w)T(z)$ for all $w, z \in \Sigma_\alpha$. 


Generators of holomorphic $C_0$-semigroups can be 
characterized in the following way: 


Theorem 4 A densely defined closed linear operator 
$A: D(A) \subset E_0 \to E_0$ generates a holomorphic 
$C_0$-semigroup iff there exist $M > 0$ and $\omega_0 > 0$ such 
that $\lambda \in \rho(A)$ and $\|\lambda(\lambda - A)^{-1}\| \le M$ for all $\lambda \in \mathbb{C}$ 
with $\mathrm{Re}\,\lambda > \omega_0$. 


Examples 5 


(i) Self-adjoint generators. Let $E_0$ be a Hilbert 
space and assume that $A$ is self-adjoint and that 
there exists an $\alpha_0 \in \mathbb{R}$ such that $A \le \alpha_0$. Then 
$A$ generates a holomorphic $C_0$-semigroup $\{T(t);\ t \ge 0\}$. 
If $\{E_A(\lambda);\ \lambda \in \mathbb{R}\}$ denotes the spectral resolution 
of $A$, then $T(t) = \int_{-\infty}^{\alpha_0} \exp(t\lambda)\, dE_A(\lambda)$ for $t \ge 0$. 

(ii) Dissipative operators in Hilbert spaces. 
Assume again that $E_0$ is a Hilbert space. Then, 
by Riesz' representation theorem, an operator $A$ is 
dissipative iff $\mathrm{Re}(u|Au) \le 0$ for all $u \in D(A)$. 

(iii) The heat semigroup. Let $M$ be either a 
smooth compact closed Riemannian manifold or 
$\mathbb{R}^m$ with the Euclidean metric and write $\Delta$ for the 
Laplace-Beltrami operator on $M$. Then it is known 
that $\Delta \in \mathcal{L}(\mathcal{D}'(M))$, where $\mathcal{D}'(M)$ is the space of all 
distributions on $M$. Given $1 \le p < \infty$, let 

$$D(\Delta_p) := \{u \in L_p(M);\ \Delta u \in L_p(M)\}$$

and set $\Delta_p u := \Delta u$ for $u \in D(\Delta_p)$. Then $\Delta_p$ generates a 
holomorphic $C_0$-semigroup on $L_p(M)$, the so-called 
"diffusion" or "heat semigroup" on $M$. If $1 < p < \infty$, 
then it can be shown that $D(\Delta_p) = W_p^2(M)$, where 
$W_p^k(M)$ denotes the Sobolev space of order $k \in \mathbb{N}$, built 
over $L_p(M)$. 

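If $M = \mathbb{R}^m$ then the operators $T(t)$ of the semigroup 
generated by $\Delta_{\mathbb{R}^m}$ are given by 

$$T(t)u(x) = \frac{1}{(4\pi t)^{m/2}} \int_{\mathbb{R}^m} \exp\Big(-\frac{|x-y|^2}{4t}\Big)\, u(y)\, dy$$

for all $t > 0$ and almost all $x \in \mathbb{R}^m$. 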


Observe that the case $L_\infty(M)$ is excluded here. In 
fact, it is known that if a linear operator $A$ generates 
a $C_0$-semigroup on $L_\infty(M)$, then $A$ must be 
bounded. However, it can be shown that suitable 
realizations of the Laplace-Beltrami operator on 
spaces of continuous and Hölder continuous functions 
generate holomorphic semigroups. For more 
details on that topic the reader is referred to the 
"Further reading" section. 

(iv) Stone's theorem and the Schrödinger equation. 
Let $E_0$ be a Hilbert space and assume that $A$ 
is self-adjoint. Then Theorem 3 and Example 5(ii) 
imply that $iA$ generates a $C_0$-group $\{U(t);\ t \in \mathbb{R}\}$ of 
unitary operators. In fact, Stone's theorem ensures 
that every generator of a $C_0$-group of unitary 
operators is of the form $iA$ with a self-adjoint 
operator $A$. As an example of particular interest, 
let us consider the Schrödinger equation 

$$\frac{1}{i}\frac{\partial u}{\partial t} = \Delta u - Vu \qquad [3]$$

with a bounded potential $V: \mathbb{R}^m \to \mathbb{R}$. Letting 
$D(A) := H^2(\mathbb{R}^m)$ and $Au := \Delta u - Vu$, it follows 
that $A$ is self-adjoint in $L_2(\mathbb{R}^m)$. Hence, the evolution 
of [3] is governed by the group of unitary operators 
generated by $iA$. Of course, the assumption that $V$ be 
bounded is rather restrictive. In fact, there are 
numerous contributions which show that this assumption 
can be weakened considerably. Again the reader is 
referred to the "Further reading" section for more 
details in this direction. 
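(v) The wave equation. Let us consider the 
following initial-value problem 

$$\Box u(t,x) = 0, \quad x \in \mathbb{R}^m,\ t \ge 0; \qquad u(0,x) = \varphi_1(x), \quad \partial u/\partial t\,(0,x) = \varphi_2(x), \quad x \in \mathbb{R}^m \qquad [4]$$

for the d'Alembert operator $\Box := \partial^2/\partial t^2 - \Delta_{\mathbb{R}^m}$ in 
$m+1$ dimensions. In order to associate with [4] a 
semigroup, let us formally re-express [4] as the 
following first-order system: 

$$\frac{dU}{dt} = \mathbb{A}U, \quad t > 0, \qquad U(0) = \Phi$$

where 

$$U := \begin{pmatrix} u \\ \partial u/\partial t \end{pmatrix}, \qquad \mathbb{A} := \begin{pmatrix} 0 & \mathrm{id} \\ \Delta_{\mathbb{R}^m} & 0 \end{pmatrix}, \qquad \Phi := \begin{pmatrix} \varphi_1 \\ \varphi_2 \end{pmatrix}$$

Letting now $E_0 := H^1(\mathbb{R}^m) \times L_2(\mathbb{R}^m)$ and 
$D(\mathbb{A}) := H^2(\mathbb{R}^m) \times H^1(\mathbb{R}^m)$, it can be shown that $\mathbb{A}$ 
generates a $C_0$-group of linear operators in $\mathcal{L}(E_0)$. 
Hence, given any initial datum $(\varphi_1, \varphi_2) \in H^2(\mathbb{R}^m) \times 
H^1(\mathbb{R}^m)$, there exists a unique solution $u \in C^1([0,\infty), L_2(\mathbb{R}^m))$ 
to the initial-value problem [4]. It 
can be shown that this solution possesses the 
following additional regularity: 

$$u \in C^2([0,\infty), L_2(\mathbb{R}^m)) \cap C([0,\infty), H^2(\mathbb{R}^m))$$

Hence, eqns [4] are satisfied for all $t \in [0,\infty)$ 
and for almost all $x \in \mathbb{R}^m$. 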

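(vi) Maxwell equations. Let $E$ and $H$ denote the 
electric and magnetic field vector, respectively, $\varepsilon$ and $\mu$ 
the electrical permittivity and magnetic permeability, 
respectively, and consider the initial-value problem for 
Maxwell equations in vacuum and without charges 
and currents: given sufficiently smooth vector fields 
$(E_0, H_0)$ find a pair $(E, H)$ such that 

$$\begin{aligned} \varepsilon\,\partial_t E - \mathrm{rot}\,H &= 0 && \text{in } (0,\infty) \times \mathbb{R}^3 \\ \mu\,\partial_t H + \mathrm{rot}\,E &= 0 && \text{in } (0,\infty) \times \mathbb{R}^3 \\ E(0,\cdot) = E_0, \quad H(0,\cdot) &= H_0 && \text{in } \mathbb{R}^3 \end{aligned} \qquad [5]$$

We assume that $\varepsilon$ and $\mu$ belong to $L_\infty(\mathbb{R}^3, \mathcal{L}_{\mathrm{sym}}(\mathbb{R}^3))$ 
and are uniformly positive definite, that is, we 
assume that there are $\varepsilon_0 > 0$ and $\mu_0 > 0$ such that 

$$(\varepsilon(x)y|y) \ge \varepsilon_0|y|^2, \qquad (\mu(x)y|y) \ge \mu_0|y|^2$$

for all $x, y \in \mathbb{R}^3$. Based on these assumptions we 
endow the space $L_2(\mathbb{R}^3) \times L_2(\mathbb{R}^3)$ with the inner 
product 

$$((u_1,u_2)|(v_1,v_2)) := (\varepsilon u_1|v_1)_{L_2} + (\mu u_2|v_2)_{L_2}$$

for $(u_1,u_2), (v_1,v_2) \in L_2(\mathbb{R}^3) \times L_2(\mathbb{R}^3)$, and call this 
Hilbert space $E_0$. We further set 

$$E_1 := \{(u_1,u_2) \in E_0;\ (\mathrm{rot}\,u_1, \mathrm{rot}\,u_2) \in E_0\}$$

Finally, given $u = (u_1,u_2) \in E_1$, let 

$$Au := (\varepsilon^{-1}\,\mathrm{rot}\,u_2,\ -\mu^{-1}\,\mathrm{rot}\,u_1)$$

It can be shown that $iA$ is self-adjoint in $E_0$. 
Hence, Stone's theorem ensures that $A$ generates a 
$C_0$-group of unitary operators in $\mathcal{L}(E_0)$. Therefore, 
given $(E_0, H_0) \in E_1$, there exists a unique solution 
$(E(\cdot), H(\cdot))$ of [5]. For this solution, the energy functional 

$$\mathcal{E}(t) = \frac{1}{2} \int_{\mathbb{R}^3} \big[(\varepsilon E(t)|E(t)) + (\mu H(t)|H(t))\big]\, dx$$

is constant on $[0,\infty)$. 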
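
Two of the examples above lend themselves to a quick numerical check. The sketch below is only an illustration under simplifying assumptions: it takes $m = 1$ and the initial datum $\exp(-y^2)$ for the heat kernel formula of Example (iii), and it replaces the unbounded self-adjoint operator of Example (iv) by the hypothetical $2 \times 2$ matrix $A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, for which $\exp(itA)$ is known in closed form. All function names are chosen for this sketch.

```python
import math


def heat_semigroup(u, t, x, half_width=12.0, n=4000):
    """(T(t)u)(x) on R (m = 1): midpoint rule for the Gaussian-kernel integral."""
    h = 2.0 * half_width / n
    total = 0.0
    for k in range(n):
        y = x - half_width + (k + 0.5) * h
        total += math.exp(-(x - y) ** 2 / (4.0 * t)) * u(y)
    return total * h / math.sqrt(4.0 * math.pi * t)


def gaussian(y):
    """Initial datum u(y) = exp(-y^2)."""
    return math.exp(-y * y)


def heated_gaussian(t, x):
    """Closed form of T(t) applied to exp(-y^2), obtained by completing the square."""
    return math.exp(-x * x / (1.0 + 4.0 * t)) / math.sqrt(1.0 + 4.0 * t)


def unitary_group(t):
    """U(t) = exp(itA) for A = [[0, 1], [1, 0]]; A*A = I gives cos(t) I + i sin(t) A."""
    c, s = math.cos(t), 1j * math.sin(t)
    return [[c, s], [s, c]]


def apply(m, psi):
    """Apply a 2x2 complex matrix to a vector of length 2."""
    return [m[i][0] * psi[0] + m[i][1] * psi[1] for i in range(2)]


def norm(psi):
    """Euclidean norm of a complex vector."""
    return math.sqrt(sum(abs(c) ** 2 for c in psi))
```

Applying `heat_semigroup` to `heated_gaussian(0.2, .)` for time `0.3` should reproduce `heated_gaussian(0.5, .)`, which is the semigroup property. For the unitary group, `norm(apply(unitary_group(t), psi))` should equal `norm(psi)` for every `t`, the finite-dimensional counterpart of the conserved energy in Example (vi).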


Autonomous Inhomogeneous Equations 


Next, we study problem [1] in the case $A(t) = A$ for 
all $t \in [0,T]$. Throughout this section we assume 
that the following minimal hypotheses 


1. $A$ generates a $C_0$-semigroup in $\mathcal{L}(E_0)$, 
2. $f \in L_1((0,T), E_0)$, and 
3. $u_0 \in E_0$ 

are satisfied. Later on we shall discuss several more 
restrictive assumptions on $(A, f, u_0)$. A function 
$u: [0,T] \to E_0$ is called a "(classical) solution" of 

$$u'(t) = Au(t) + f(t), \quad t \in (0,T], \qquad u(0) = u_0 \qquad [6]$$

iff $u \in C([0,T], E_0) \cap C^1((0,T], E_0)$, $u(t) \in D(A)$ for 
all $t \in (0,T]$, and $u$ satisfies [6] pointwise on $[0,T]$. 
It can be shown that [6] has at most one solution. If 
it has a solution, this solution is represented by the 
following variation-of-constants formula: 

$$u(t) = T(t)u_0 + \int_0^t T(t-s)f(s)\, ds, \quad t \in [0,T] \qquad [7]$$

where $\{T(t);\ t \ge 0\}$ denotes the semigroup generated 
by $A$. Observe that the function $u: [0,T] \to E_0$, 
defined by [7], is continuous, but in general not 
differentiable on $(0,T]$. For this reason one calls [7] 
the "mild solution" of [6]. 
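
Formula [7] can be evaluated directly once the semigroup is known. The following sketch assumes the simplest possible setting, $E_0 = \mathbb{R}$ with the scalar generator $A = -2$, forcing $f(s) = \cos s$ and $u_0 = 1$ (all illustrative choices), and compares the quadrature of [7] with the closed-form solution of the ordinary differential equation.

```python
import math

A = -2.0   # scalar generator, so T(t) = exp(A t) on E_0 = R
U0 = 1.0   # initial value u_0


def forcing(s):
    """Inhomogeneity f(s) = cos(s)."""
    return math.cos(s)


def mild_solution(t, n=400):
    """Formula [7]: T(t)u_0 + int_0^t T(t - s) f(s) ds, with Simpson's rule (n even)."""
    if t == 0.0:
        return U0
    h = t / n
    # endpoint values of the integrand s -> T(t - s) f(s)
    total = math.exp(A * t) * forcing(0.0) + forcing(t)
    for k in range(1, n):
        s = k * h
        total += (4.0 if k % 2 else 2.0) * math.exp(A * (t - s)) * forcing(s)
    return math.exp(A * t) * U0 + total * h / 3.0


def classical_solution(t):
    """Closed-form solution of u' = -2u + cos(t), u(0) = 1."""
    return (U0 - 0.4) * math.exp(A * t) + (2.0 * math.cos(t) + math.sin(t)) / 5.0
```

Here `f` is smooth and `u_0` lies in the domain of `A`, so the mild solution is also the classical one and the two functions agree up to quadrature error.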

It is not difficult to see that if $u_0 \in D(A)$ and 
$f \in C^1([0,T], E_0)$, then the mild solution is a classical 
solution, that is, [6] is uniquely solvable in the 
classical sense. In applications to nonlinear problems, 
the assumption $f \in C^1([0,T], E_0)$ is often too 
restrictive. Fortunately, in the case of generators of 
holomorphic semigroups, this assumption on $f$ can 
be weakened in two different directions. Let 
$\|x\|_A := \|x\|_{E_0} + \|Ax\|_{E_0}$ denote the graph norm on 
$D(A)$. Then the closedness of $A$ implies that 
$(D(A), \|\cdot\|_A)$ is a Banach space. In the following, 
we call this Banach space $E_1$. Moreover, given 
$\alpha \in (0,1)$, we write $E_\alpha := (E_0, E_1)_\alpha$ for the complex 
interpolation space between $E_0$ and $E_1$. Then we 
have the following result. 


Theorem 6 Let $A$ generate a holomorphic 
$C_0$-semigroup in $\mathcal{L}(E_0)$ and assume that there is a 
constant $\alpha \in (0,1)$ such that 

$$f \in C^\alpha([0,T], E_0) + C([0,T], E_\alpha)$$

Then, given $u_0 \in E_0$, the Cauchy problem [6] 
possesses a unique classical solution. It is given by 

$$u(t) = T(t)u_0 + \int_0^t T(t-s)f(s)\, ds, \quad t \in [0,T]$$

where $\{T(t);\ t \ge 0\}$ stands for the semigroup 
generated by $A$. 


In the following, we discuss an alternative 
approach to the Cauchy problem [6], which is 
based on the so-called theory of maximal regularity. 
There are several different types of results on 
maximal regularity, which we cannot discuss in full 
detail here. We decided to give a brief introduction 
to the theory of the so-called "maximal $L_p$-regularity." 
For further results on maximal regularity, we 
again draw the reader's attention to the "Further 
reading" section. 

The Banach space $E_0$ is called a UMD space 
(unconditionality of martingale differences) if the 
Hilbert transform is bounded on $L_q(\mathbb{R}, E_0)$ for 
some $q \in (1,\infty)$. It is known that Hilbert spaces, 
the Lebesgue spaces $L_p(X, d\mu)$ with $1 < p < \infty$ and 
with a $\sigma$-finite measure space $(X, \mu)$, and closed 
subspaces of UMD spaces are UMD spaces. 
Furthermore, UMD spaces are without exception 
reflexive. Thus, the spaces $L_1(X, d\mu)$, $L_\infty(X, d\mu)$, and 
spaces of continuous or Hölder continuous functions 
are not UMD spaces. 

Next, assume that $-A$ generates a holomorphic 
$C_0$-semigroup in $\mathcal{L}(E_0)$ and that $[0,\infty) \subset \rho(-A)$. 
Then, it is known that, given $z \in \mathbb{C}$, the fractional 
power $A^z$ of $A$ is a densely defined closed operator 
in $E_0$. We say that $A$ has bounded imaginary powers 
(BIP) of angle $\theta \ge 0$ if there exist positive constants 
$M$ and $\varepsilon$ such that 

$$A^{it} \in \mathcal{L}(E_0) \quad \text{and} \quad \|A^{it}\|_{\mathcal{L}(E_0)} \le M \exp(\theta|t|), \qquad t \in (-\varepsilon, \varepsilon) \qquad [8]$$

In order to have a neat notation, we write $A \in 
\mathrm{BIP}(\theta)$ if [8] holds true. 


Remarks 7 In the following, we assume that $-A$ 
generates a holomorphic $C_0$-semigroup in $\mathcal{L}(E_0)$ and 
that $[0,\infty) \subset \rho(-A)$. 


(i) If $\mathrm{Re}\,z < 0$, then $A^z$ is bounded on $E_0$. 

(ii) There are several representation formulas for 
the fractional powers of $A$. Among them we 
picked the following: if $\mathrm{Re}\,z \in (-1,1)$ and 
$x \in D(A)$, then 

$$A^z x = \frac{\sin \pi z}{\pi z} \int_0^\infty s^z (s + A)^{-2} A x\, ds$$

(iii) Assume that $E_0$ is a Hilbert space, that $A$ is 
self-adjoint, and that there is a positive constant $\alpha$ 
such that $A \ge \alpha$. Further, let $\{E_A(\lambda);\ \lambda \in \mathbb{R}\}$ be the 
spectral resolution of $A$; then 

$$A^z = \int_0^\infty \lambda^z\, dE_A(\lambda), \qquad z \in \mathbb{C}$$

Moreover, $A \in \mathrm{BIP}(0)$. 

(iv) Let again $E_0$ be a Hilbert space and assume that 
$-A$ is m-dissipative and satisfies $0 \in \rho(A)$. Then 
$A \in \mathrm{BIP}(\pi/2)$. 
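
In the scalar case $A = a > 0$ the representation formula of Remark (ii) can be tested against the elementary power $a^z$, and Remark (iii) reduces to $|a^{it}| = 1$. The sketch below assumes this scalar setting; the substitution $s = a e^x$ and the trapezoidal rule are choices made here for numerical convenience, and `fractional_power` is a name introduced for the illustration.

```python
import cmath
import math


def fractional_power(a, z, cutoff=40.0, step=0.01):
    """Evaluate sin(pi z)/(pi z) * int_0^inf s^z (s + a)^(-2) a ds for a scalar a > 0.

    The substitution s = a*exp(x) turns the integral into
    a^z * int_R exp((z + 1) x) / (1 + exp(x))^2 dx, which is summed by the
    trapezoidal rule on [-cutoff, cutoff]. Requires z != 0 and |Re z| < 1.
    """
    n = int(round(2 * cutoff / step))
    total = 0j
    for k in range(n + 1):
        x = -cutoff + k * step
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * cmath.exp((z + 1) * x) / (1.0 + math.exp(x)) ** 2
    integral = cmath.exp(z * math.log(a)) * total * step
    return cmath.sin(math.pi * z) / (math.pi * z) * integral
```

For a purely imaginary exponent such as `z = 0.7j` the result has modulus one, the scalar shadow of $A \in \mathrm{BIP}(0)$ for positive self-adjoint operators. For real exponents close to $\pm 1$ the integrand decays slowly and a larger `cutoff` would be needed.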


Given $p \in (1,\infty)$, Sobolev's embedding theorem 
ensures that $W_p^1((0,T), E_0)$ is continuously injected 
into $C([0,T], E_0)$. Consequently, given any function 
$u \in W_p^1((0,T), E_0)$ and $t \in [0,T]$, the pointwise 
evaluation $u(t)$ is well defined. In particular, the 
trace at 0 with respect to time 

$$\mathrm{tr}: W_p^1((0,T), E_0) \to E_0, \qquad u \mapsto u(0)$$

is a well-defined and bounded linear operator. In order 
to formulate the next result, let $E_{s,p} := (E_0, E_1)_{s,p}$, with 
$p \in (1,\infty)$ and $s \in (0,1)$, denote the real interpolation 
space between the basic space $E_0$ and $E_1$, the domain 
$D(A)$ of $A$, endowed with the graph norm. Furthermore, 
we set 

$$\mathbb{E}_0 := L_p((0,T), E_0), \qquad \mathbb{E}_1 := L_p((0,T), E_1) \cap W_p^1((0,T), E_0)$$

and we write $\mathrm{Isom}(E, F)$ for the set of all topological 
isomorphisms mapping the Banach space $E$ onto the 
Banach space $F$. 


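Theorem 8 (Dore and Venni). Suppose that $E_0$ is a 
UMD space and that $A \in \mathrm{BIP}(\theta)$ for some 
$\theta \in [0, \pi/2)$. Then, given $p \in (1,\infty)$, we have 

$$(\partial_t + A, \mathrm{tr}) \in \mathrm{Isom}(\mathbb{E}_1, \mathbb{E}_0 \times E_{1-1/p,p})$$

This means that, given $(f, u_0) \in L_p((0,T), E_0) \times 
E_{1-1/p,p}$, there exists a unique solution $u \in 
L_p((0,T), E_1) \cap W_p^1((0,T), E_0)$ of the Cauchy problem 
[6] with $A$ replaced by $-A$, that is, of $u' + Au = f$, 
$u(0) = u_0$. Moreover, $u$ depends continuously on $(f, u_0)$ 
and fulfills the following a priori estimate: 

$$\|u\|_{\mathbb{E}_1} \le C\big(\|f\|_{\mathbb{E}_0} + \|u_0\|_{E_{1-1/p,p}}\big)$$

where $C := \|(\partial_t + A, \mathrm{tr})^{-1}\|_{\mathcal{L}(\mathbb{E}_0 \times E_{1-1/p,p},\, \mathbb{E}_1)}$. 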


Nonautonomous Equations of Hyperbolic Type 


According to Theorem 1 and the corresponding 
remark, it is reasonable to impose in the study of the 
Cauchy problem [1] the minimal hypothesis that, 
given $s \in [0,T]$, each individual operator $A(s)$ be the 
generator of a $C_0$-semigroup $\{T_s(t);\ t \ge 0\}$ in $\mathcal{L}(E_0)$. 
If this semigroup is holomorphic, we call [1] of 
"parabolic type." Otherwise the evolution equation [1] 
is said to be of "hyperbolic type." 

A family $\{A(t);\ t \in [0,T]\}$ of generators of 
$C_0$-semigroups in $\mathcal{L}(E_0)$ is called "stable" iff there 
exist positive constants $M$ and $\omega$ such that 
$(\omega, \infty) \subset \rho(A(t))$ for all $t \in [0,T]$ and such that 

$$\Big\| \prod_{j=1}^{k} (\lambda - A(t_j))^{-1} \Big\|_{\mathcal{L}(E_0)} \le M (\lambda - \omega)^{-k} \quad \text{for } \lambda > \omega$$

and every finite sequence $0 \le t_1 \le t_2 \le \cdots \le t_k \le T$ 
with $k \in \mathbb{N}$. Observe that the resolvent operators 
$(\lambda - A(t_j))^{-1}$ do not commute in general. Therefore, 
the order of the terms on the left-hand side of the above 
estimate has to be obeyed. Assume that $\mathcal{A} = \{A(t);\ t \in 
[0,T]\}$ is a family of m-dissipative operators. Then, $\mathcal{A}$ 
is stable, since any m-dissipative operator $B$ satisfies 
the estimate $\|(\lambda - B)^{-1}\| \le 1/\lambda$ for all $\lambda > 0$. 

It turns out that the stability of a family of 
generators is not sufficient to construct a solution of 
[1] even in the case $f = 0$. We also need a certain 
time regularity of the mapping $t \mapsto A(t)$. For this we 
say that the family $\{A(t);\ t \in [0,T]\}$ has a common 
domain $D$ iff $D$ is a dense subspace of $E_0$ such that 
$D(A(t)) = D$ for all $t \in [0,T]$. The family $\{A(t);\ t \in 
[0,T]\}$ is called "strongly differentiable" iff it has a 
common domain $D$ and, given $v \in D$, the function 
$t \mapsto A(t)v$ belongs to $C^1([0,T], E_0)$. 

We are now prepared to formulate the following 
result. 


Theorem 9 (Kato). Let $\{A(t);\ t \in [0,T]\}$ be a stable 
and strongly differentiable family of generators of 
$C_0$-semigroups with common domain $D$. If $f \in 
C^1([0,T], E_0)$ and $u_0 \in D$ then [1] possesses a 
unique classical solution. 


The above result is based on the construction of 
an evolution operator $U(t,s)$, which can be 
considered as the generalization of the notion of a 
$C_0$-semigroup for autonomous equations to the case 
of evolution equations of the form 

$$u'(t) = A(t)u(t), \quad t \in (s,T], \qquad u(s) = v$$

for fixed $s \in [0,T)$. Once an evolution operator is 
available, the solution of [1] is given by 

$$u(t) = U(t,0)u_0 + \int_0^t U(t,s)f(s)\, ds, \quad t \in [0,T]$$

Of course, this generalizes [7] and if $A(t)$ is 
independent of $t$, then $U(t,s) = T(t-s)$, where 
$\{T(t);\ t \ge 0\}$ is the semigroup generated by $A(0)$. 
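
For a scalar, time-dependent coefficient the evolution operator is explicit, $U(t,s) = \exp\big(\int_s^t a(r)\,dr\big)$, which makes the formula above easy to test. The sketch below assumes the illustrative choices $a(t) = -1 + \cos t$ and $f \equiv 1$; neither is taken from the text, and the Runge-Kutta integrator serves only as an independent check.

```python
import math


def a(t):
    """Time-dependent scalar generator A(t) (an illustrative choice)."""
    return -1.0 + math.cos(t)


def forcing(t):
    """Inhomogeneity f(t) = 1."""
    return 1.0


def evolution(t, s):
    """U(t, s) = exp(int_s^t a(r) dr), here with the integral in closed form."""
    return math.exp(-(t - s) + math.sin(t) - math.sin(s))


def duhamel(t, u0, n=400):
    """U(t, 0)u_0 + int_0^t U(t, s) f(s) ds by Simpson's rule (n even)."""
    h = t / n
    total = evolution(t, 0.0) * forcing(0.0) + evolution(t, t) * forcing(t)
    for k in range(1, n):
        s = k * h
        total += (4.0 if k % 2 else 2.0) * evolution(t, s) * forcing(s)
    return evolution(t, 0.0) * u0 + total * h / 3.0


def runge_kutta(t, u0, n=2000):
    """Independent check: integrate u' = a(t)u + f(t) with the classical RK4 scheme."""
    h = t / n
    u, r = u0, 0.0
    rhs = lambda time, value: a(time) * value + forcing(time)
    for _ in range(n):
        k1 = rhs(r, u)
        k2 = rhs(r + h / 2, u + h * k1 / 2)
        k3 = rhs(r + h / 2, u + h * k2 / 2)
        k4 = rhs(r + h, u + h * k3)
        u += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        r += h
    return u
```

The identity `evolution(t, s) * evolution(s, r) == evolution(t, r)` (up to rounding) replaces the semigroup property, and `duhamel` should agree with `runge_kutta`. In infinite dimensions the operators $A(t)$ need not commute, so no such exponential formula is available and $U(t,s)$ has to be constructed.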

Furthermore, there are several extensions of 
Kato's result. The most interesting among them 
weaken the time regularity of $f$ or the assumption 
that $\{A(t);\ t \in [0,T]\}$ be strongly differentiable. 
In particular, it is possible to study [1] for families 
without a common domain. 

For the construction of evolution operators as 
well as generalizations of Theorem 9, the reader is 
again referred to the “Further reading” section. 


Nonautonomous Equations of Parabolic Type 


Throughout this section we assume that $E_0$ and $E_1$ 
are Banach spaces such that $E_1$ is dense and 
continuously injected in $E_0$. In the study of parabolic 
evolution equations, the class of all operators in 
$\mathcal{L}(E_1, E_0)$, considered as unbounded operators in $E_0$ 
with common domain $E_1$, which generate holomorphic 
$C_0$-semigroups in $\mathcal{L}(E_0)$ has turned out to 
be very useful. In the following, we call this class 
$\mathcal{H}(E_1, E_0)$. It is known that $A \in \mathcal{H}(E_1, E_0)$ iff there 
exist constants $\omega > 0$ and $\kappa \ge 1$ such that $\omega - A \in 
\mathrm{Isom}(E_1, E_0)$ and such that 

$$\kappa^{-1} \le \frac{\|(\lambda - A)x\|_0}{|\lambda|\,\|x\|_0 + \|x\|_1} \le \kappa, \qquad x \in E_1 \setminus \{0\}, \quad \mathrm{Re}\,\lambda \ge \omega$$

where $\|\cdot\|_j$ denotes the norm of $E_j$. Using the above 
characterization, it can be shown that $\mathcal{H}(E_1, E_0)$ is 
an open subset of $\mathcal{L}(E_1, E_0)$. In the following, we 
always endow $\mathcal{H}(E_1, E_0)$ with the topology induced 
by the norm of $\mathcal{L}(E_1, E_0)$. As a consequence of this 
convention it is meaningful to consider, for example, 
continuous mappings from $[0,T]$ into $\mathcal{H}(E_1, E_0)$. 
Observe that if $A \in C([0,T], \mathcal{H}(E_1, E_0))$, then 
$\mathcal{A} = \{A(t);\ t \in [0,T]\}$ is a family of generators of 
holomorphic semigroups with the common domain 
$E_1$. Then we have the following result. 


Theorem 10 (Sobolevskii, Tanabe). Assume that 
there is a $\rho \in (0,1)$ such that 

$$(A, f) \in C^\rho([0,T], \mathcal{H}(E_1, E_0) \times E_0)$$

Then, given $u_0 \in E_0$, the Cauchy problem [1] 
possesses a unique classical solution $u$. This solution 
has the additional regularity 

$$u \in C^\rho((0,T], E_1) \cap C^{1+\rho}((0,T], E_0)$$

Finally, if $u_0 \in E_1$, then $u \in C^1([0,T], E_0)$. 


As in the hyperbolic case, the proof of Theorem 10 
is based on the evolution operator U(t,s) for the 
homogeneous problem, although the constructions of 
the corresponding evolution operators are completely 
different. 

In addition, there are several extensions and 
generalizations of Theorem 10. In particular, the 
assumption that the family {A(t);t € [0,T]} pos- 
sesses a common domain can be weakened con- 
siderably. Furthermore, it is possible to look at 
parabolic evolution equations in the so-called inter- 
polation and extrapolation scales. This offers a great 
flexibility in the study of nonlinear problems. 
Further details in this direction can be found in the 
"Further reading" section. 


Nonlinear Evolution Equations 


Let $E_0, E_1$ be Banach spaces such that $E_1$ is 
dense and continuously embedded in $E_0$. Assume 
further that $u_0 \in E_1$ and that we are given a 
nonlinear operator $F \in C([0,T] \times V, E_0)$, where $V$ 
is an open neighborhood of $u_0$ in $E_1$. In this 
section, we will discuss the well-posedness of the 
Cauchy problem for the following nonlinear 
evolution equation 

$$u'(t) = F(t, u(t)), \quad t \in (0,T], \qquad u(0) = u_0 \qquad [9]$$

in the Banach space $E_0$. We will always assume 
that the nonlinear operator $F$ either carries a 
quasilinear structure or is of fully nonlinear parabolic type. 
By a "quasilinear structure," we mean that there is a 
mapping $A \in C([0,T] \times V, \mathcal{L}(E_1, E_0))$ and a suitable 
"lower-order term" $f \in C([0,T] \times V, E_0)$ such that 

$$F(t,v) = A(t,v)v + f(t,v) \quad \text{for all } (t,v) \in [0,T] \times V$$

Problem [9] is of fully nonlinear parabolic type if 
$F \in C^1([0,T] \times V, E_0)$ and if the Fréchet derivative 
$D_2F(0, u_0)$ of $F$ with respect to $v$ at $(0, u_0)$ belongs to 
the class $\mathcal{H}(E_1, E_0)$. 


Quasilinear Evolution Equations of Hyperbolic Type 


Assume that $E_0$ is a reflexive Banach space and let 
$u_0 \in V \subset E_1$ be chosen as above. We consider the 
following abstract quasilinear evolution equation of 
hyperbolic type: 

$$u'(t) = A(t, u(t))u(t) + f(t, u(t)), \quad t \in (0,T], \qquad u(0) = u_0 \qquad [10]$$

and assume that the following hypotheses are 
satisfied: 


(H1) $A \in C([0,T] \times V, \mathcal{L}(E_1, E_0))$ is bounded on 
bounded subsets of $V$ and, given $(t,v) \in 
[0,T] \times V$, the operator $A(t,v)$ is m-dissipative 
and there is a constant $\mu_A$ such that 

$$\|A(t,v) - A(t,w)\|_{\mathcal{L}(E_1, E_0)} \le \mu_A \|v - w\|_{E_0}$$

for all $t \in [0,T]$ and all $v, w \in V$. 

(H2) There is a $Q \in \mathrm{Isom}(E_1, E_0)$ such that 
$QA(t,v)Q^{-1} = A(t,v) + B(t,v)$, where $B(t,v) \in 
\mathcal{L}(E_0)$ is bounded, uniformly on bounded 
subsets of $V$. Moreover, 

$$\|B(t,v) - B(t,w)\|_{\mathcal{L}(E_0)} \le \mu_B \|v - w\|_{E_1}$$

for all $t \in [0,T]$ and all $v, w \in V$. 

(H3) $f \in C([0,T] \times V, E_1)$ is bounded on bounded 
subsets of $V$ and there are $\mu_0$ and $\mu_1$ such that 

$$\|f(t,v) - f(t,w)\|_{E_j} \le \mu_j \|v - w\|_{E_j} \quad \text{for all } v, w \in V,\ j \in \{0,1\}$$

Then we have the following result. 


Theorem 11 (Kato). Assume that (H1), (H2), 
and (H3) are satisfied. Then there is a maximal 
$t^+ \in (0,T]$, depending only on $\|u_0\|_{E_1}$, and a unique 
solution $u$ to [10] such that 

$$u = u(\cdot, u_0) \in C([0,t^+), V) \cap C^1([0,t^+), E_0)$$

Moreover, the mapping $u_0 \mapsto u(\cdot, u_0)$ is continuous 
from $V$ to $C([0,t^+), V) \cap C^1([0,t^+), E_0)$. 


There are many applications of Theorem 11 to 
different concrete partial differential equations 
(PDEs), including symmetric hyperbolic first- 
order systems, the Korteweg-de Vries equation, 
nonlinear elastodynamics, quasilinear wave equa- 
tions, Navier-Stokes and Euler equations, and 
coupled Maxwell-Dirac equations. We decided 
to explain in some detail an application to the 
so-called periodic Camassa-Holm equation: 


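$$u_t - u_{xxt} + 3uu_x = 2u_xu_{xx} + uu_{xxx}, \qquad t > 0,\ x \in S^1 \qquad [11]$$

where $S^1$ stands for the unit circle. In the above 
model, the function $u$ is the height of a unidirectional 
water wave over a flat bottom. 
Set $X := L_2(S^1)$, $V := H^1(S^1)$, and $Q := (I - \partial_x^2)^{1/2}$. 
With $y := u - u_{xx}$, so that $u = Q^{-2}y$, eqn [11] can be 
re-expressed as 

$$y_t + (Q^{-2}y)\,y_x = -2y\,(Q^{-2}y)_x \quad \text{in } L_2(S^1)$$

which is of type [10] with 

$$A(y) = -(Q^{-2}y)\,\partial_x, \qquad f(y) = -2y\,(Q^{-2}y)_x, \qquad y \in V$$

where $\mathrm{dom}(A(y)) := \{v \in L_2(S^1);\ (Q^{-2}y)v \in H^1(S^1)\}$. 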
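
The passage from [11] to the transport form rests on a pointwise algebraic identity between derivatives of $u$, which holds for any smooth function and not only for solutions. The sketch below checks that identity for one hypothetical test function, $u(t,x) = e^{-t}\sin x + t\cos 2x$, with all derivatives written out by hand; the function names are chosen for this illustration.

```python
import math


def derivatives(t, x):
    """u = exp(-t) sin(x) + t cos(2x) and the partial derivatives entering [11]."""
    a, da = math.exp(-t), -math.exp(-t)   # coefficient of sin(x) and its t-derivative
    b, db = t, 1.0                        # coefficient of cos(2x) and its t-derivative
    return {
        "u": a * math.sin(x) + b * math.cos(2 * x),
        "u_x": a * math.cos(x) - 2 * b * math.sin(2 * x),
        "u_xx": -a * math.sin(x) - 4 * b * math.cos(2 * x),
        "u_xxx": -a * math.cos(x) + 8 * b * math.sin(2 * x),
        "u_t": da * math.sin(x) + db * math.cos(2 * x),
        "u_xxt": -da * math.sin(x) - 4 * db * math.cos(2 * x),
    }


def camassa_holm_defect(t, x):
    """u_t - u_xxt + 3 u u_x - 2 u_x u_xx - u u_xxx, i.e. [11] moved to one side."""
    d = derivatives(t, x)
    return (d["u_t"] - d["u_xxt"] + 3 * d["u"] * d["u_x"]
            - 2 * d["u_x"] * d["u_xx"] - d["u"] * d["u_xxx"])


def transport_defect(t, x):
    """y_t + u y_x + 2 y u_x with y = u - u_xx, i.e. the re-expressed equation."""
    d = derivatives(t, x)
    y = d["u"] - d["u_xx"]
    y_x = d["u_x"] - d["u_xxx"]
    y_t = d["u_t"] - d["u_xxt"]
    return y_t + d["u"] * y_x + 2 * y * d["u_x"]
```

The two defects coincide at every point, so $u$ solves [11] exactly when $y = u - u_{xx}$ solves the transport form. The test function itself is not a solution; both defects are nonzero but equal.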


Quasilinear Evolution Equations of Parabolic Type 


Assume that $E_0$ and $E_1$ are Banach spaces such that 
$E_1$ is dense and continuously injected in $E_0$. Moreover, 
let $(\cdot,\cdot)_\theta$ for each $\theta \in (0,1)$ be an admissible 
interpolation functor (e.g., the real or complex 
interpolation functor) and set $E_\theta := (E_0, E_1)_\theta$ for $\theta \in 
(0,1)$. Given a subset $X \subset E_\theta$ for some $\theta \in (0,1)$, we 
set $X_\eta := X \cap E_\eta$ for $\eta \in [0,1]$, equipped with the 
topology induced by $E_\eta$. Finally, we write $C^{1-}(M, N)$ 
for the class of all locally Lipschitz continuous 
functions mapping the metric space $M$ into the 
metric space $N$. 


Theorem 12 (Amann). Suppose that $0 < \gamma \le \beta < 
\alpha < 1$, that $X_\beta$ is open in $E_\beta$, and that 

$$(A, f) \in C^{1-}([0,T] \times X_\beta, \mathcal{H}(E_1, E_0) \times E_\gamma)$$

Then, given $u_0 \in X_\alpha$, there exists a unique maximal 
$t^+ \in (0,T]$, such that the quasilinear parabolic 
Cauchy problem 

$$u'(t) = A(t, u(t))u(t) + f(t, u(t)), \quad t \in (0,T], \qquad u(0) = u_0$$

possesses a unique classical solution 

$$u := u(\cdot, u_0) \in C([0,t^+), X_\alpha) \cap C^1((0,t^+), E_0)$$

Assume $A$ and $f$ are independent of $t$ and let $u(\cdot, u_0)$ be 
the solution to the corresponding autonomous problem 

$$u'(t) = A(u(t))u(t) + f(u(t)), \quad t \in (0,\infty), \qquad u(0) = u_0$$

Then the mapping $(t, u_0) \mapsto u(t, u_0)$ is a semiflow 
on $X_\alpha$. 
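
The semiflow property says that solving for time $s$ and then for time $t$ is the same as solving for time $s + t$. A minimal sketch in the degenerate scalar case $E_0 = E_1 = \mathbb{R}$ follows, assuming the illustrative quasilinear equation $u' = (1 - u)u$, that is, $A(u) = 1 - u$ and $f = 0$; the closed-form `flow` and the Runge-Kutta check are choices made for this example.

```python
import math


def flow(t, u0):
    """Exact solution of u' = (1 - u)u, u(0) = u0 (here A(u) = 1 - u and f = 0)."""
    growth = math.exp(t)
    return u0 * growth / (1.0 + u0 * (growth - 1.0))


def runge_kutta(t, u0, n=2000):
    """Independent RK4 integration of the same autonomous equation."""
    h = t / n
    u = u0
    rhs = lambda value: (1.0 - value) * value
    for _ in range(n):
        k1 = rhs(u)
        k2 = rhs(u + h * k1 / 2)
        k3 = rhs(u + h * k2 / 2)
        k4 = rhs(u + h * k3)
        u += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return u
```

With these definitions `flow(0.7, flow(0.5, u0))` and `flow(1.2, u0)` agree up to rounding, and `runge_kutta` confirms that `flow` solves the equation. For initial values in $(0,1)$ the solution exists for all $t \ge 0$, so the maximal existence time plays no role in this toy case.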


Due to its clarity and flexibility, Theorem 12 has 
found a plethora of applications, which we cannot 
discuss in detail here. Let us at least mention the 
following: reaction-diffusion systems, population 
dynamics, phase transition models, flows through 
porous media, Stefan problems, and nonlinear and 
dynamic boundary conditions in boundary-value 
problems. In addition, many geometric evolution 
equations fall into the scope of Theorem 12. Consider, 
for example, the volume-preserving gradient flow of 
the area functional of a compact hypersurface M in 
$\mathbb{R}^{m+1}$ with respect to $L_2(M)$ and $W_2^{-1}(M)$, respectively. 
These flows are known as the averaged mean curvature 
flow and the surface diffusion flow, respectively, and 
have been investigated on the basis of Theorem 12. 


Fully Nonlinear Evolution Equations 
of Parabolic Type 


Based on the theory of maximal regularity for linear 
evolution equations, it is possible to investigate 
abstract fully nonlinear parabolic problems of type 
[9]. As there are different techniques of maximal 
regularity, there are also different approaches to [9]. 
We present here a result which uses maximal 
regularity properties in singular Hölder spaces $C_\beta^\beta$. 
Let $E_0$ and $E_1$ be Banach spaces such that $E_1$ is 
continuously embedded into $E_0$ (density of $E_1$ in $E_0$ 
is not needed here). As before, $V$ is an open subset of 
$E_1$ and $D_2F$ stands for the Fréchet derivative of 
$F(t,v)$ with respect to the second variable. 


Theorem 13 (Lunardi). Assume that $F \in C^1([0,T] \times V, E_0)$ 
is such that $D_2F \in C^1([0,T] \times 
V, \mathcal{H}(E_1, E_0))$. Then, given $u_0 \in V$, there is a maximal 
$t^+ \in (0,T]$ such that problem [9] has a solution 
$u \in C([0,t^+), E_1) \cap C^1([0,t^+), E_0)$. This solution is 
unique in the class 

$$\bigcup_{0 < \beta < 1} C_\beta^\beta((0, t^+ - \varepsilon], E_1) \cap C([0, t^+ - \varepsilon], E_1)$$

for each $\varepsilon \in (0, t^+)$. 


Theorem 13 has important applications to problems 
for which the hypotheses of Theorem 12 (in particular 
the assumption on the quasilinear structure) are not 
satisfied. We mention here fully nonlinear second-order 
boundary-value problems, Hele-Shaw models, models 
from combustion theory, and Bellman equations. 


See also: Boltzmann Equation (Classical and Quantum); 
Breaking Water Waves; Dissipative Dynamical Systems 
of Infinite Dimension; Elliptic Differential Equations: 
Linear Theory; Ginzburg-Landau Equation; Image 
Processing: Mathematics; Incompressible Euler 
Equations: Mathematical Theory; Nonlinear Schrödinger 
Equations; Partial Differential Equations: Some 
Examples; Quantum Dynamical Semigroups; Relativistic 
Wave Equations Including Higher Spin Fields; Semilinear 
Wave Equations; Separation of Variables for Differential 
Equations; Singularities of the Ricci Flow; Symmetric 
Hyperbolic Systems and Shock Waves; Wave Equations 
and Diffraction. 
Introduction 


The renormalization group (RG) in its modern form 
was invented by K G Wilson in the context of 
statistical mechanics and Euclidean quantum field 
theory (EQFT). It offers the deepest understanding of 
renormalization in quantum field theory (QFT) by 
connecting EQFT with the theory of second-order 
phase transitions and associated critical phenomena. 
Thermodynamic functions of many statistical mechan- 
ical models (the prototype being the Ising model in two 
or more dimensions) exhibit power-like singularities as 
the temperature approaches a critical value. One of the 
major triumphs of the Wilson RG was the prediction 
of the exponents (known as critical exponents) 
associated to these singularities. Wilson's fundamental 
contribution was to realize that many length scales 
begin to cooperate as one approaches criticality and 
that one should disentangle them and treat them one at 
a time. This leads to an iterative procedure known 
as the “renormalization group.” Singularities and 
critical exponents then arise from a limiting process. 
Ultraviolet singularities of field theory can also 
be understood in the same way. Wilson reviews this 
(Wilson and Kogut 1974) and gives the historical 
genesis of his ideas (Wilson 1983). 

The early work in the subject was heuristic, in the 
sense that clever but uncontrolled approximations 
were made to the exact equations, often with much 
success. Subsequently, authors with a mathematical bent 
began to use the underlying ideas to prove theorems. 
Benfatto, Cassandro, Gallavotti, Nicolò, Olivieri et al. 
pioneered the rigorous use of Wilson's renormalization 
group in the construction of super-renormalizable 
QFTs (see Benfatto and Gallavotti (1995) and 
references therein). The subject saw further mathema- 
tical development in the work of Gawedzki and 
Kupiainen (1984, 1986) and that of Balaban (1982), 
and references therein. Balaban in a series of papers 
ending in Balaban (1989) proved a basic result on the 
continuum limit of Wilson's lattice gauge theory. 
Brydges and Yau (1990) simplified the mathematical 
treatment of the renormalization group for a class of 
models and this has led to further systematization and 
simplification in the work of Brydges et al. (1998, 
2003). Another method which has been intensely 
developed during the same historical period is based on 
phase cell expansions: Feldman, Magnen, Rivasseau, 
and Sénéor developed the early phase cell ideas of 
Glimm and Jaffe and were able to prove independently 
many of the results cited earlier (see Rivasseau (1991) 
and references therein). Although these methods share 
many features of the Wilson RG, they are different in 
methodology and thus remain outside of the purview 
of the present exposition. 

A somewhat different line of development has been 
the use of the RG to give simple proofs of perturbative 
renormalizability of various QFTs: Gallavotti 
and Nicolò, via iterative methods (see Benfatto and 
Gallavotti (1995) and references therein), and 
Polchinski (1984), who exploited a continuous version 
of the RG for which Wilson (1974) had derived a 
nonlinear differential equation. These early works 
were devoted to the standard $(\phi^4)_4$ scalar field theory, 
but subsequently Polchinski's work has been extended 
to a large class of models, including four-dimensional 
nonabelian gauge theories (see Kopper and Müller 
(2000) and references therein). 

Finally, it should be mentioned that apart from 
QFT and statistical mechanics, the RG method has 
proved fruitful in other domains. An example is the 
study of interacting fermion systems in condensed 
matter physics (see Fermionic Systems and Renor- 
malization: Statistical Mechanics and Condensed 
Matter). In the rest of this article, our focus will be 
on EQFT and statistical mechanics. 


The RG as a Discrete Semigroup 


We will first define a discrete version of the RG and 
consider its continuous version later. As we will see, 
the RG is really a semigroup, so calling it a group is 
a misnomer. 

Let $\phi$ be a Gaussian random field (see, e.g., Gelfand 
and Vilenkin (1964) for a discussion of random fields) 
in $\mathbb{R}^d$. Associated to it there is a positive-definite 
function which is identified as its covariance. In QFT 
one is interested in the covariance 

$$E(\phi(x)\phi(y)) = \text{const.}\,|x-y|^{-2[\phi]} = \int dp\; e^{ip\cdot(x-y)}\,\frac{1}{|p|^{d-2[\phi]}} \qquad [1]$$

Here $[\phi] > 0$ is the (canonical) dimension of the 
field, which for the standard massless free field is 
$[\phi] = (d-2)/2$. The latter is positive for $d > 2$. 
Other choices are possible, but in EQFT 
they are restricted by Osterwalder-Schrader 
positivity. This is assured if $[\phi] = (d-\alpha)/2$, with 
$0 < \alpha \le 2$. If $\alpha < 2$, we get a generalized free field. 
Observe that the covariance is singular for $x = y$ and 
this singularity is responsible for the ultraviolet 
divergences of QFT. This singularity has to be initially 
cut off and there are many ways to do this. A simple 
way is as follows. Let $u(x)$ be a smooth, rotationally 
invariant, positive-definite function of fast decrease. 
Examples of such functions are legion. Observe that 


$$|x-y|^{-2[\phi]} = \text{const.}\int_0^\infty \frac{dl}{l}\; l^{-2[\phi]}\, u\!\left(\frac{x-y}{l}\right) \qquad [2]$$


as can be seen by scaling in $l$. We define the unit 
ultraviolet cutoff covariance $C$ by cutting off the 
lower end point of the $l$ integration (responsible for 
the singularity at $x = y$) at $l = 1$, 

$$C(x-y) = \int_1^\infty \frac{dl}{l}\; l^{-2[\phi]}\, u\!\left(\frac{x-y}{l}\right) \qquad [3]$$


$C(x-y)$ is positive-definite and everywhere smooth. 
Being positive-definite, it qualifies as the covariance 
of a Gaussian probability measure denoted $\mu_C$ on a 
function space $\Omega$ (which it is not necessary to specify 
any further). The covariance $C$ being smooth implies 
that the sample fields of the measure are $\mu_C$ almost 
everywhere sufficiently differentiable. 


Remark Note that, more generally, we could have 
cut off the lower end point singularity in [1] at any 
$\epsilon > 0$. The $\epsilon$-cutoff covariance is related to the unit 
cutoff covariance by a scale transformation (defined 
below) and we will exploit this relation later. 


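Let $L > 1$ be any real number. We define a scale 
transformation $S_L$ on fields $\phi$ by 

$$S_L\phi(x) = L^{-[\phi]}\,\phi\!\left(\frac{x}{L}\right) \qquad [4]$$

on covariances by 

$$S_L C(x-y) = L^{-2[\phi]}\, C\!\left(\frac{x-y}{L}\right) \qquad [5]$$

and on functions of fields $F(\phi)$ by 

$$S_L F(\phi) = F(S_L\phi) \qquad [6]$$

The scale transformations form a multiplicative 
group: $S_L^n = S_{L^n}$. 
Now define a fluctuation covariance $\Gamma_L$: 

$$\Gamma_L(x-y) = \int_1^L \frac{dl}{l}\; l^{-2[\phi]}\, u\!\left(\frac{x-y}{l}\right) \qquad [7]$$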


$\Gamma_L(x-y)$ is smooth, positive-definite, and of fast 
decrease on scale $L$. It generates a key scaling 
decomposition 

$$C(x-y) = \Gamma_L(x-y) + S_L C(x-y) \qquad [8]$$

Iterating this, we get 

$$C(x-y) = \sum_{n=0}^{\infty} \Gamma_n(x-y) \qquad [9]$$
where 

$$\Gamma_n(x-y) = S_{L^n}\Gamma_L(x-y) = L^{-2n[\phi]}\,\Gamma_L\!\left(\frac{x-y}{L^n}\right) \qquad [10]$$

The functions $\Gamma_n(x-y)$ are of fast decrease on scale 
$L^{n+1}$. 
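
As an illustration of [8] to [10], the decomposition can be checked in closed form for one particular choice that is not taken from the text: $d = 4$, $[\phi] = 1$ and the Gaussian $u(x) = e^{-x^2}$ (smooth, rotationally invariant and positive-definite). The function names below are hypothetical, and this is only a sketch under those assumptions.

```python
import math

PHI_DIM = 1.0  # assumed example: [phi] = (d - 2)/2 with d = 4


def band(r, lo, hi):
    """Integral of (dl/l) l^(-2) exp(-(r/l)^2) over lo <= l <= hi, r > 0.

    Closed form obtained with the substitution t = (r/l)^2; valid only for
    the assumed choice u(x) = exp(-x^2) and [phi] = 1.
    """
    upper = 1.0 if math.isinf(hi) else math.exp(-(r / hi) ** 2)
    return (upper - math.exp(-(r / lo) ** 2)) / (2.0 * r * r)


def C(r):
    """Unit-cutoff covariance [3]."""
    return band(r, 1.0, math.inf)


def gamma_L(r, L):
    """Fluctuation covariance [7]."""
    return band(r, 1.0, L)


def scale(cov, L, r):
    """Scale transformation [5] applied to a covariance function."""
    return L ** (-2.0 * PHI_DIM) * cov(r / L)


def gamma_n(r, L, n):
    """Scale-n fluctuation covariance [10]."""
    return L ** (-2.0 * n * PHI_DIM) * gamma_L(r / L ** n, L)


if __name__ == "__main__":
    r, L = 1.5, 2.0
    print("C(r)                 =", C(r))
    print("Gamma_L(r) + S_L C(r) =", gamma_L(r, L) + scale(C, L, r))
```

In this example the partial sums of [9] telescope: the first $N$ terms add up to $C - S_{L^N}C$, and the remainder $S_{L^N}C$ tends to zero pointwise as $N$ grows.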

Thus, [9] achieves the decomposition into a sum 
over increasing length scales as desired. Being 
positive-definite, the $\Gamma_n$ qualify as covariances of 
Gaussian probability measures, and therefore 
$\mu_C = \bigotimes_{n=0}^{\infty}\mu_{\Gamma_n}$. Correspondingly, introduce a family 
of independent Gaussian random fields $\zeta_n$, called 
fluctuation fields, distributed according to $\mu_{\Gamma_n}$. Then 

$$\phi = \sum_{n=0}^{\infty}\zeta_n \qquad [11]$$

Note that the fluctuation fields $\zeta_n$ are slowly varying 
over length scales $L^n$. In fact, an easy estimate using 
a Tchebycheff inequality shows that, for any $\gamma > 0$, 

$$|x-y| \le L^n \;\Rightarrow\; \mu_C\bigl(|\zeta_n(x)-\zeta_n(y)| \ge \gamma\bigr) \le \text{const.}\,\gamma^{-2} \qquad [12]$$

which reveals the slowly varying nature of $\zeta_n$ on 
scale $L^n$. Equation [11] is an example of a multiscale 
decomposition of a Gaussian random field. 
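
One way to arrive at the estimate [12] is sketched below. The constant $c$ is introduced here for illustration: it is any constant with $\Gamma_L(0)-\Gamma_L(z)\le c|z|^2$ for all $z$, which exists because $\Gamma_L$ is smooth, bounded and rotationally invariant (so its gradient vanishes at the origin).

```latex
\begin{align*}
E\,|\zeta_n(x)-\zeta_n(y)|^2
  &= 2\bigl(\Gamma_n(0)-\Gamma_n(x-y)\bigr)
   = 2L^{-2n[\phi]}\Bigl(\Gamma_L(0)-\Gamma_L\bigl(\tfrac{x-y}{L^n}\bigr)\Bigr) \\
  &\le 2c\,L^{-2n[\phi]}\,\frac{|x-y|^2}{L^{2n}} \;\le\; 2c
   \qquad \text{for } |x-y|\le L^n , \\
\mu_C\bigl(|\zeta_n(x)-\zeta_n(y)|\ge\gamma\bigr)
  &\le \gamma^{-2}\,E\,|\zeta_n(x)-\zeta_n(y)|^2 \;\le\; 2c\,\gamma^{-2}.
\end{align*}
```

The last step uses $L > 1$ and $[\phi] > 0$, so that $L^{-2n[\phi]} \le 1$.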

The above implies that the $\mu_C$ integral of a function 
can be written as a multiple integral over the fields $\zeta_n$. 
We calculate it by integrating out the fluctuation fields 
$\zeta_n$ step by step, going from shorter to longer length 
scales. This can be accomplished by the iteration of a 
single transformation $T_L$, a renormalization group trans- 
formation, as follows. Let $F(\phi)$ be a function of fields. 
Then we define an RG transformation $F \to T_L F$ by 

$$(T_L F)(\phi) = S_L\,\mu_{\Gamma_L} * F(\phi) = \int d\mu_{\Gamma_L}(\zeta)\, F(\zeta + S_L\phi) \qquad [13]$$


Thus the renormalization group transformation 
consists of a convolution with the fluctuation 
measure followed by a rescaling. 


Semigroup Property 

The discrete RG transformations form a semigroup: 

$$T_L T_{L^n} = T_{L^{n+1}} \quad \text{for all } n \ge 0 \qquad [14]$$

To prove this, we must first see how scaling 
commutes with convolution with a measure. We 
have the property 

$$\mu_{\Gamma_L} * S_L F = S_L\,\mu_{S_L\Gamma_L} * F \qquad [15]$$

To see this, observe first that if $\zeta$ is a Gaussian 
random field distributed with covariance $\Gamma_L$, then the 
Gaussian field $S_L\zeta$ is distributed according to $S_L\Gamma_L$. 
This can be checked by computing the covariance of 
$S_L\zeta$. Now the left-hand side of [15] is just the 
integral of $F(S_L\zeta + S_L\phi)$ with respect to $d\mu_{\Gamma_L}(\zeta)$. 
By the previous observation, this is the integral of 
$F(\zeta + S_L\phi)$ with respect to $d\mu_{S_L\Gamma_L}(\zeta)$, and the latter is 
the right-hand side of [15]. Now we can check the 
semigroup property trivially: 

$$\begin{aligned}
T_L T_{L^n} F &= S_L\,\mu_{\Gamma_L} * S_{L^n}\,\mu_{\Gamma_{L^n}} * F \\
 &= S_L S_{L^n}\,\mu_{S_{L^n}\Gamma_L} * \mu_{\Gamma_{L^n}} * F \\
 &= S_{L^{n+1}}\,\mu_{\Gamma_{L^n} + S_{L^n}\Gamma_L} * F \\
 &= S_{L^{n+1}}\,\mu_{\Gamma_{L^{n+1}}} * F \\
 &= T_{L^{n+1}} F
\end{aligned} \qquad [16]$$

We have used the fact that $\Gamma_{L^n} + S_{L^n}\Gamma_L = \Gamma_{L^{n+1}}$. This 
is because $S_{L^n}\Gamma_L$ has the representation [7] with 
integration interval changed to $[L^n, L^{n+1}]$. 
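
The covariance identity used in the last step of [16] can also be checked numerically. The sketch below assumes the example $u(x) = e^{-x^2}$ and $[\phi] = 1/2$ (the $d = 3$ free-field value); neither choice comes from the text, and the helper names are hypothetical. The integral [7] is evaluated by Simpson's rule in the variable $s = \log l$.

```python
import math


def gamma(r, M, phi_dim, steps=4000):
    """Simpson approximation of Gamma_M(r): the integral [7] with upper limit M.

    Assumes u(x) = exp(-x^2). In the variable s = log(l) the integrand is
    exp(-2*phi_dim*s - r^2 * exp(-2 s)). `steps` must be even.
    """
    a, b = 0.0, math.log(M)
    h = (b - a) / steps

    def f(s):
        return math.exp(-2.0 * phi_dim * s - r * r * math.exp(-2.0 * s))

    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


def scaled_gamma(r, L, n, phi_dim):
    """S_{L^n} Gamma_L evaluated at r, using the covariance scaling rule [5]."""
    return L ** (-2.0 * n * phi_dim) * gamma(r / L ** n, L, phi_dim)


if __name__ == "__main__":
    r, L, n, phi_dim = 1.5, 2.0, 3, 0.5
    lhs = gamma(r, L ** n, phi_dim) + scaled_gamma(r, L, n, phi_dim)
    rhs = gamma(r, L ** (n + 1), phi_dim)
    print("Gamma_{L^n} + S_{L^n} Gamma_L =", lhs)
    print("Gamma_{L^(n+1)}               =", rhs)
```

The two printed numbers should agree up to quadrature error, reflecting that rescaling by $L^n$ shifts the integration interval of [7] from $[1, L]$ to $[L^n, L^{n+1}]$.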

We note some properties of $T_L$. $T_L$ has a unique 
invariant measure, namely $\mu_C$: for any bounded 
function $F$, 

$$\int d\mu_C\; T_L F = \int d\mu_C\; F \qquad [17]$$

To understand [17], recall the earlier observation 
that if $\phi$ is distributed according to the covariance $C$, 
then $S_L\phi$ is distributed according to $S_L C$. By [8], 
$\Gamma_L + S_L C = C$. Therefore, 

$$\int d\mu_C\; T_L F = \int d\mu_{S_L C}\;\mu_{\Gamma_L} * F = \int d\mu_{S_L C + \Gamma_L}\; F = \int d\mu_C\; F \qquad [18]$$

The uniqueness of the invariant measure follows 
from the fact that the semigroup $T_L$ is realized by a 
convolution with a probability measure and, there- 
fore, is positivity improving: 

$$F \ge 0,\ \mu_C\ \text{a.e.} \;\Rightarrow\; T_L F > 0\ \ \mu_C\ \text{a.e.}$$

Finally, note that $T_L$ is a contraction semigroup 
on $L^p(d\mu_C)$ for $1 \le p < \infty$. To see this, note that 
since $T_L$ is a convolution with a probability measure, 
$T_L F = \mu_{S_{L^{-1}}\Gamma_L} * S_L F$, we have, via Hölder's inequal- 
ity, $|T_L F|^p \le T_L|F|^p$. Then use the fact that $\mu_C$ is an 
invariant measure. 
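
Written out, the contraction argument combines the pointwise inequality for an average against a probability measure with the invariance [17]. The following is a sketch of those two steps, using only the definition [13]:

```latex
\begin{align*}
|T_L F(\phi)|^p
  &= \Bigl|\int d\mu_{\Gamma_L}(\zeta)\,F(\zeta+S_L\phi)\Bigr|^p
   \le \int d\mu_{\Gamma_L}(\zeta)\,|F(\zeta+S_L\phi)|^p
   = \bigl(T_L|F|^p\bigr)(\phi), \\
\int d\mu_C\,|T_L F|^p
  &\le \int d\mu_C\;T_L|F|^p
   = \int d\mu_C\,|F|^p .
\end{align*}
```

Taking $p$th roots gives $\|T_L F\|_{L^p(d\mu_C)} \le \|F\|_{L^p(d\mu_C)}$.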


Eigenfunctions 


Let $:p_{n,m}:(\phi(x))$ be a $C$ Wick-ordered local mono- 
mial of $m$ fields with $n$ derivatives. Define 

$$P_{n,m}(X) = \int_X dx\; :p_{n,m}:_C(\phi(x))$$

The functions $P_{n,m}(X)$ play the role of eigenfunc- 
tions of the RG transformation $T_L$ up to a scaling of 
volume: 

$$T_L P_{n,m}(X) = L^{\,d - m[\phi] - n}\, P_{n,m}(L^{-1}X) \qquad [19]$$

Because of the scaling in volume, the $P_{n,m}(X)$ are not 
true eigenfunctions. Nevertheless, they are very 
useful because they play an important role in the 
analysis of the evolution of the dynamical system 
which we will later associate with $T_L$. They are 
classified as expanding (relevant), contracting 
(irrelevant) or central (marginal), depending on 
whether the exponent of $L$ on the right-hand side 
of [19] is positive, negative, or zero, respectively. 
This depends, of course, on the space dimension 
$d$ and the field dimension $[\phi]$. 
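
The classification can be made concrete with a few lines of code. The sketch below simply evaluates the exponent $d - m[\phi] - n$ of [19] with exact rational arithmetic; the function names are hypothetical and chosen for illustration.

```python
from fractions import Fraction


def scaling_exponent(d, phi_dim, m, n):
    """Exponent of L in [19] for a monomial with m fields and n derivatives."""
    return Fraction(d) - m * Fraction(phi_dim) - n


def classify(d, phi_dim, m, n):
    """Label the monomial as relevant, marginal or irrelevant."""
    e = scaling_exponent(d, phi_dim, m, n)
    if e > 0:
        return "relevant"
    if e < 0:
        return "irrelevant"
    return "marginal"


if __name__ == "__main__":
    # Standard massless free field in d = 4: [phi] = (d - 2)/2 = 1.
    for name, m, n in [("phi^2", 2, 0), ("|grad phi|^2", 2, 2),
                       ("phi^4", 4, 0), ("phi^6", 6, 0)]:
        print(name, classify(4, 1, m, n))
```

For example, with $d = 5$ and $[\phi] = 3/2$ the quartic monomial has exponent $5 - 6 = -1$ and is irrelevant, while with $d = 3$ and $[\phi] = 1/2$ it has exponent $1$ and is relevant.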

Gaussian measures are of limited interest. But we 
can create new measures by perturbing the Gaus- 
sian measure $\mu_C$ with local interactions. We cannot 
study directly the situation where the interactions 
are in infinite volume. Instead, we put them in a 
very large volume which will eventually go to 
infinity. We have a ratio of two length scales, one 
from the size of the diameter of the volume and the 
other from the ultraviolet cutoff in $\mu_C$, and this 
ratio is enormous. The RG is useful whenever there 
are two length scales whose ratio is very large. It 
permits us to do a scale-by-scale analysis and at 
each step the volume is reduced at the cost of 
changing the interactions. The largeness of the ratio 
is reflected in the large number of steps to be 
accomplished, this number tending eventually to 
infinity. This large number of steps has to be 
controlled mathematically. 


Perturbation of the Gaussian Measure 


Let $\Lambda_N = [-L^N/2, L^N/2]^d \subset \mathbb{R}^d$ be a large cube in 
$\mathbb{R}^d$. For any $X \subset \Lambda_N$, let $V_0(X,\phi)$ be a local 
semibounded function where the fields are 
restricted to the set $X$. Here “local” means that if 
$X, Y$ are sets with disjoint interiors then $V_0(X \cup 
Y,\phi) = V_0(X,\phi) + V_0(Y,\phi)$. Consider the integral 
(known as the partition function in QFT and 
statistical mechanics) 

$$Z(\Lambda_N) = \int d\mu_C(\phi)\; z_0(\Lambda_N,\phi) \qquad [20]$$

where 

$$z_0(X,\phi) = e^{-V_0(X,\phi)} \qquad [21]$$

and 

$$d\mu^{(0)}(\Lambda_N,\phi) = \frac{1}{Z(\Lambda_N)}\, d\mu_C(\phi)\, e^{-V_0(\Lambda_N,\phi)} \qquad [22]$$

is the corresponding probability measure. $V_0$ is 
typically not quadratic in the fields and therefore 
leads to a non-Gaussian perturbation. For example, 

$$V_0(X,\phi) = \int_X dx\,\bigl(\xi\,|\nabla\phi(x)|^2 + g_0\,\phi^4(x) + \mu_0\,\phi^2(x)\bigr) \qquad [23]$$

where we take $g_0 > 0$. The integral [20] is well 
defined because the sample fields are smooth. 

We now proceed to the scale-by-scale analysis 
mentioned earlier. Because $\mu_C$ is an invariant 
measure of $T_L$, we have the partition function 
$Z(\Lambda_N)$ in the volume $\Lambda_N$ as 

$$Z(\Lambda_N) = \int d\mu_C(\phi)\; z_0(\Lambda_N,\phi) = \int d\mu_C(\phi)\; T_L z_0(\Lambda_N,\phi) \qquad [24]$$

The integrand on the right-hand side is a new 
function of fields which, because of the final scaling, 
live in the smaller volume $\Lambda_{N-1}$. This leads to the 
following definition: 

$$z_1(\Lambda_{N-1},\phi) = T_L z_0(\Lambda_N,\phi) \qquad [25]$$

Because $V_0$ is local, $z_0$ has a factorization property for 
unions of sets with disjoint interiors. This is no longer 
the case for $z_1$. Wilson noted that, nevertheless, the 
integral is well approximated by an integrand which 
does, but the approximator has new coupling con- 
stants. The phrase “well approximated” is what all the 
rigorous work is about and this was not evident in the 
early Wilson era. The idea is to extract out a local part 
and also consider the remainder. The local part leads 
to a flow of coupling constants and the (unexponen- 
tiated) remainder is an irrelevant term. This operation 
and its mathematical control is an essential feature of 
RG analysis. 

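Iterating the above transformation, we get, for all 
$0 \le n < N$, 

$$z_{n+1}(\Lambda_{N-n-1},\phi) = T_L z_n(\Lambda_{N-n},\phi) \qquad [26]$$

After $N$ iterations, we get 

$$Z(\Lambda_N) = \int d\mu_C(\phi)\; z_N(\Lambda_0,\phi) \qquad [27]$$

where $\Lambda_0$ is the unit cube. To take the limit as 
$N \to \infty$, we have to control the infinite sequence 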
of iterations. We cannot hope to control the 
infinite sequence at the level of the entire partition 
function. Instead, one chooses representative coor- 
dinates for which the infinite sequence has a 
chance of having a meaning. The coordinates are 
provided by the coupling constants of the 
extracted local part and the irrelevant terms (an 
approximate calculation of the flow of coupling 
constants is given in the next section). The 
existence of a global trajectory for such coordi- 
nates helps us to control the limit for moments of 
the probability measure (correlation functions). 
The question of coordinates and the representation 
of the irrelevant terms will be taken up in the 
section “Rigorous RG analysis.” 


Ultraviolet Cutoff Removal 


The next issue is ultraviolet cutoff removal in field 
theory. This problem can be put into the earlier 
framework as follows. Let $\epsilon_N$ be a sequence of 
positive numbers which tend to 0 as $N \to \infty$. 
Following the remark after [3], we replace the unit 
cutoff covariance $C$ by the covariance $C_{\epsilon_N}$ defined 
by taking $\epsilon_N$ (instead of 1) as the lower end point in 
the integral [3]. Thus, $\epsilon_N$ acts as a short-distance or 
ultraviolet cutoff. It is easy to see that 

$$C_{\epsilon_N}(x-y) = S_{\epsilon_N} C(x-y) \qquad [28]$$

Consider the partition function $Z_{\epsilon_N}(\Lambda)$ in a cube 
$\Lambda = [-R/2, R/2]^d$: 

$$Z_{\epsilon_N}(\Lambda) = \int d\mu_{C_{\epsilon_N}}(\phi)\; e^{-V_0(\Lambda,\phi,\tilde\xi_N,\tilde g_N,\tilde\mu_N)} \qquad [29]$$

where $V_0$ is given by [23] with $\xi, g_0, \mu_0$ replaced by 
$\tilde\xi_N, \tilde g_N, \tilde\mu_N$, respectively. By dimensional analysis we can 
write 

$$\tilde\xi_N = \epsilon_N^{\,2[\phi]-d+2}\,\xi, \qquad \tilde g_N = \epsilon_N^{\,4[\phi]-d}\,g, \qquad \tilde\mu_N = \epsilon_N^{\,2[\phi]-d}\,\mu \qquad [30]$$

where $g, \xi, \mu$ are dimensionless parameters. Now $\phi$ 
distributed according to $C_{\epsilon_N}$ equals in distribution 
$S_{\epsilon_N}\phi$ distributed according to $C$. Therefore, choosing 
$\epsilon_N = L^{-N}$, we get 

$$Z_{\epsilon_N}(\Lambda) = \int d\mu_C(\phi)\; e^{-V_0(\Lambda,\,S_{\epsilon_N}\phi,\,\tilde\xi_N,\tilde g_N,\tilde\mu_N)} = \int d\mu_C(\phi)\; e^{-V_0(\Lambda_N,\phi,\xi,g,\mu)} \qquad [31]$$

where $\Lambda_N = [-L^N R/2, L^N R/2]^d$. Thus, the field 
theory problem of removing the ultraviolet cutoff, 
that is, taking the limit $\epsilon_N \to 0$, has been reduced to 
the study of a statistical mechanical model in a very 
large volume. The latter has to be analyzed via RG 
iterations as before. 
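
To see how the powers of the cutoff in the dimensional analysis arise, it may help to follow one term through the change of variables. The sketch below treats the quartic term, taking the bare coupling to be $\tilde g_N = \epsilon_N^{4[\phi]-d} g$ and using the scale transformation [4] with $\epsilon_N$ in place of $L$:

```latex
\begin{align*}
\tilde g_N \int_{\Lambda} dx\,\bigl(S_{\epsilon_N}\phi\bigr)^4(x)
  &= \tilde g_N\,\epsilon_N^{-4[\phi]} \int_{\Lambda} dx\;\phi^4\!\left(\frac{x}{\epsilon_N}\right)
   = \tilde g_N\,\epsilon_N^{\,d-4[\phi]} \int_{\epsilon_N^{-1}\Lambda} dy\;\phi^4(y) \\
  &= g \int_{\Lambda_N} dy\;\phi^4(y),
  \qquad \epsilon_N = L^{-N},\quad \Lambda_N = \epsilon_N^{-1}\Lambda .
\end{align*}
```

The quadratic and gradient terms work the same way, with $4[\phi]$ replaced by $2[\phi]$ and $2[\phi]+2$, respectively.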


Critical Field Theories 


As mentioned earlier, we have to study the flow of 
local interactions as well as that of irrelevant terms. 
Together they constitute the RG trajectory and we 
have to prove that it exists globally. In general, the 
trajectory will tend to explode after a large number 
of iterations due to growing relevant terms (char- 
acterized in terms of the expanding Wick monomials 
mentioned earlier). Wilson pointed out that the 
saving factor is to exploit fixed points and their 
invariant manifolds by tuning the initial interaction 
so that the RG has a global trajectory. This leads to 
the notion of a critical manifold which can be 
defined as follows. A fixed point will have contract- 
ing and/or marginal attractive directions besides the 
expanding ones. In the language of dynamical 
systems, the critical manifold is the stable or center 
stable manifold of the fixed point in question. This 
is determined by a detailed study of the discrete 
flow. In the examples above, it amounts to fixing 
the initial “mass” parameter $\mu_0 = \mu_c(g_0)$ with a 
suitable function $\mu_c$ such that the flow remains 
bounded in an invariant set. The critical manifold is 
then the graph of a function from the space of 
contracting and marginal variables to the space of 
$\mu$'s which remains invariant under the flow. 
Restricted to it the flow will now converge to a 
fixed point. All references to initial coupling 
constants have disappeared. The result is known as 
a critical theory. 

Critical theories have been rigorously constructed 
in a number of cases. Take the standard $\phi^4$ theory in $d$ 
dimensions. Then $[\phi] = (d-2)/2$. For $d \ge 5$ the $\phi^4$ 
interaction is irrelevant and the Gaussian fixed point 
is attractive with one unstable direction (corre- 
sponding to $\mu$). In this case one can prove that the 
interactions converge exponentially fast to the 
Gaussian fixed point on the critical manifold. For 
$d = 4$ the interaction is marginal and the Gaussian 
fixed point attractive for $g > 0$. The critical theory 
has been constructed by Gawedzki and Kupiainen 
(1984) starting with a sufficiently small coupling 
constant. The fixed point is Gaussian (interactions 
vanish in the limit) and the convergence rate is 
logarithmic. This is thus a mean-field theory with 
logarithmic corrections, as expected on heuristic 
grounds. The mathematical construction of the 
critical theory in $d = 3$ is an open problem. (It is 
expected to exist with a non-Gaussian fixed point, 
and this is indicated by the perturbative $\epsilon$ expansion 
of Wilson and Fisher in $4 - \epsilon$ dimensions.) However, 
the critical theory for $d = 3$ with $[\phi] = (3-\epsilon)/4$ 
for $\epsilon > 0$ held very small has been rigorously 
constructed by Brydges et al. (2003). This theory 
has a nontrivial hyperbolic fixed point of $O(\epsilon)$. The 
stable manifold is constructed in a small neighbor- 
hood of the fixed point. Note that the covariance 
without cutoff is Osterwalder-Schrader positive and 
thus this is a candidate for a nontrivial EQFT. For 
$\epsilon = 1$ we have the standard situation in $d = 3$, and 
this remains open, as mentioned earlier. A very 
simplified picture of the above is furnished by the 
perturbative computation in the next section. 


Unstable Fixed Points 


We may attempt to construct field theories around 
unstable fixed points. In this case the initial 
parameters have to be adjusted as functions of the 
cutoff in such a way as to stabilize the flow in the 
neighborhood of the fixed point. This may be called 
a genuine renormalization. A famous example of 
this is pure Yang-Mills theory in $d = 4$, where the 
Gaussian fixed point has only marginal unstable 
directions. Balaban in a series of papers ending in 
Balaban (1989) considered Wilson’s lattice cutoff 
version of Yang-Mills theory in $d = 4$ with initial 
coupling fixed by the two-loop asymptotic freedom 
formula. He proved, by lattice RG iterations, that in 
the weak-coupling regime the free energy per unit 
volume is bounded above and below by constants 
independent of the lattice spacing. Instability of the 
flow is expected to lead to mass generation for 
observables but this is a famous open problem. 
Another example is the standard nonlinear sigma 
model for d=2. Here too the flow is unstable 
around the Gaussian fixed point and we can set the 
initial coupling constant by the two-loop asymptotic 
freedom formula. Although much is known via 
approximation methods (as well as by methods 
based on integrable systems) this theory remains to 
be rigorously constructed as an EQFT. 

Let us now consider a relatively simpler 
example, that of constructing a massive super- 
renormalizable scalar field theory. This has been 
studied in $d = 3$, with $[\phi] = (d-2)/2 = 1/2$. We 
get $\xi = \tilde\xi$, $g = L^{-N}\tilde g$, $\mu = L^{-2N}\tilde\mu$, and $\tilde g$ is taken to be 
small. $\xi$ is marginal, whereas $g, \mu$ are relevant 
parameters and grow with the iterations. After $N$ 
iterations, they are brought up to $\tilde g, \tilde\mu$ together 
with remainders. This realizes the so-called 
massive continuum $\phi^4$ theory in $d = 3$, and this 
has been mathematically controlled in the exact 
RG framework. This was proved by Brydges, 
Dimock, and Hurd and earlier by Benfatto, 
Cassandro, Gallavotti, and others (see the refer- 
ences in Brydges et al. (1998) and Benfatto and 
Gallavotti (1995)). 


The Exact RG as a Continuous Semigroup 


The discrete semigroup defined in [13] of the previous 
section has a natural continuous counterpart. Just take 
$L$ to be a continuous parameter, $L = e^t$, $t \ge 0$, and 
write by abuse of notation $T_t, S_t, \Gamma_t$ instead of $T_{e^t}$, etc. 
The continuous transformations $T_t$, 

$$T_t F = S_t\,\mu_{\Gamma_t} * F \qquad [32]$$

give a semigroup 

$$T_t T_s = T_{t+s} \qquad [33]$$

of contractions on $L^2(d\mu_C)$ with $\mu_C$ as invariant 
measure. One can show that $T_t$ is strongly contin- 
uous and, therefore, has a generator which we will 
call $\mathcal{L}$. This is defined by 

$$\mathcal{L}F = \lim_{t\to 0^+}\frac{T_t - 1}{t}\,F \qquad [34]$$

whenever this limit exists. This restricts $F$ to a 
suitable subspace $D(\mathcal{L}) \subset L^2(d\mu_C)$. $D(\mathcal{L})$ contains, 
for example, polynomials in fields as well as twice- 
differentiable bounded cylindrical functions. The 
generator $\mathcal{L}$ can be easily computed. To state it, we 
need some definitions. Define $(D^nF)(\phi; f_1,\dots,f_n)$ as 
the $n$th tangent map at $\phi$ along directions $f_1,\dots,f_n$. 
The functional Laplacian $\Delta_{\dot\Gamma}$ is defined by 

$$\Delta_{\dot\Gamma}F(\phi) = \int d\mu_{\dot\Gamma}(\zeta)\,(D^2F)(\phi;\zeta,\zeta) \qquad [35]$$

where $\dot\Gamma = u$. Define an infinitesimal dilatation 
operator 

$$D\phi(x) = x\cdot\nabla\phi(x) \qquad [36]$$

and a vector field $\mathcal{X}$, 

$$\mathcal{X}F = -[\phi]\,(DF)(\phi;\phi) - (DF)(\phi;D\phi) \qquad [37]$$

Then, an easy computation gives 

$$\mathcal{L} = \tfrac{1}{2}\Delta_{\dot\Gamma} + \mathcal{X} \qquad [38]$$

$T_t$ is a semigroup with $\mathcal{L}$ as generator. Therefore, 
$T_t = e^{t\mathcal{L}}$. Let $F_t(\phi) = T_tF(\phi)$. Then $F_t$ satisfies the 
linear PDE 

$$\frac{\partial F_t}{\partial t} = \mathcal{L}F_t \qquad [39]$$

with the initial condition $F_0 = F$. This evolution 
equation assumes a more familiar form if we write 
$F_t = e^{-V_t}$, $V_t$ being known as the effective potential. 
We get 

$$\frac{\partial V_t}{\partial t} = \mathcal{L}V_t - \frac{1}{2}\,(V_t)_\phi\cdot(V_t)_\phi \qquad [40]$$

where 

$$(V_t)_\phi\cdot(V_t)_\phi = \int d\mu_{\dot\Gamma}(\zeta)\,\bigl((DV_t)(\phi;\zeta)\bigr)^2 \qquad [41]$$

and $V_0 = V$. This infinite-dimensional nonlinear 
PDE is a version of Wilson’s flow equation. 
Note that the linear semigroup $T_t$ acting on 
functions induces a semigroup $R_t$ acting non- 
linearly on effective potentials giving a trajectory 
$V_t = R_t V_0$. 
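
The passage from the linear equation for $F_t$ to the nonlinear equation for $V_t$ is a short chain-rule computation. A sketch, using only the definitions of the functional Laplacian [35], the vector field [37] and the generator [38]:

```latex
\begin{align*}
(D^2 e^{-V})(\phi;\zeta,\zeta)
  &= e^{-V}\Bigl[\bigl((DV)(\phi;\zeta)\bigr)^2 - (D^2V)(\phi;\zeta,\zeta)\Bigr]
  \;\Longrightarrow\;
  \Delta_{\dot\Gamma}\,e^{-V} = e^{-V}\Bigl[(V)_\phi\cdot(V)_\phi - \Delta_{\dot\Gamma}V\Bigr], \\
\mathcal{X}\,e^{-V} &= -\,e^{-V}\,\mathcal{X}V, \\
-\,e^{-V_t}\,\frac{\partial V_t}{\partial t}
  &= \mathcal{L}\,e^{-V_t}
   = e^{-V_t}\Bigl[-\tfrac12\Delta_{\dot\Gamma}V_t - \mathcal{X}V_t
      + \tfrac12\,(V_t)_\phi\cdot(V_t)_\phi\Bigr], \\
\frac{\partial V_t}{\partial t}
  &= \mathcal{L}V_t - \tfrac12\,(V_t)_\phi\cdot(V_t)_\phi .
\end{align*}
```

The quadratic term is the only source of nonlinearity; it comes entirely from the second-derivative (Laplacian) part of the generator.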

Equations like the above are notoriously difficult 
to control rigorously, especially for large times. 
However, they may be solved in formal perturbation 
theory when the initial $V_0$ is small via the presence 
of small parameters. In particular, they give rise 
easily to perturbative flow equations for coupling 
constants. They can be obtained to any order but 
then there is the remainder. It is hard to control the 
remainder from the flow equation for effective 
potentials in bosonic field theories. They require 
other methods based on the discrete RG. Never- 
theless, these approximate perturbative flows are 
very useful for getting a preliminary view of the 
flow. Moreover, their discrete versions figure as an 
input in further nonperturbative analysis. 


Perturbative Flow 


It is instructive to see this in second-order perturba- 
tion theory. We will simplify by working in infinite 
volume (no infrared divergences can arise because 
$\dot\Gamma(x-y)$ is of fast decrease). Now suppose that we 
are in standard $\phi^4$ theory with $[\phi] = (d-2)/2$ and 
$d > 2$. We want to show that 

$$V_t = \int dx\,\bigl(\xi_t\,{:}|\nabla\phi(x)|^2{:} + g_t\,{:}\phi(x)^4{:} + \mu_t\,{:}\phi(x)^2{:}\bigr) \qquad [42]$$

satisfies the flow equation in second order modulo 
irrelevant terms provided the parameters flow 
correctly. We will ignore field-independent terms. 
The Wick ordering is with respect to the covariance 
$C$ of the invariant measure. The reader will notice 
that we have ignored a $\phi^6$ term which is actually 
relevant in $d = 3$ for the above choice of $[\phi]$. This is 
because we will only discuss the $d = 3$ case for the 
model discussed at the end of this section and for 
this case the $\phi^6$ term is irrelevant. We will assume 
that $\xi_t, \mu_t$ are of order $O(g^2)$. Plug in the above in 
the flow equation. The quantity $\int dx\,{:}P_{n,m}{:}$ repre- 
sents one of the terms above with $m$ fields and $n$ 
derivatives. Because $\mathcal{L}$ is the generator of the 
semigroup $T_t$, we have 

$$\Bigl(\frac{\partial}{\partial t} - \mathcal{L}\Bigr)\, g_t^{n,m}\int dx\,{:}P_{n,m}{:} = \Bigl(\frac{dg_t^{n,m}}{dt} - (d - m[\phi] - n)\,g_t^{n,m}\Bigr)\int dx\,{:}P_{n,m}{:} \qquad [43]$$


Next turn to the nonlinear term in the flow equation 
and insert the $\phi^4$ term (the others are already of 
order $O(g^2)$). This produces a double integral of 
$\dot\Gamma(x-y)\,{:}\phi(x)^3{:}\,{:}\phi(y)^3{:}$, which after complete Wick 
ordering, gives 

$$-\frac{g_t^2}{2}\times 16\int dx\,dy\;\dot\Gamma(x-y)\Bigl({:}\phi(x)^3\phi(y)^3{:} + 9\,C(x-y)\,{:}\phi(x)^2\phi(y)^2{:} + 36\,C(x-y)^2\,{:}\phi(x)\phi(y){:} + 6\,C(x-y)^3\Bigr) \qquad [44]$$


Consider the nonlocal $\phi^4$ term. We can localize it by 
writing 

$${:}\phi(x)^2\phi(y)^2{:} = \frac{1}{2}\,{:}\Bigl(\phi(x)^4 + \phi(y)^4 - \bigl(\phi(x)^2 - \phi(y)^2\bigr)^2\Bigr){:} \qquad [45]$$

The local part gives a $\phi^4$ contribution and the last 
term above gives rise to an irrelevant contribution 
because it produces additional derivatives. The 
coefficients are well defined because $C, \dot\Gamma$ are smooth 
and $\dot\Gamma(x-y)$ is of fast decrease. Now the nonlocal $\phi^2$ 
term is similarly localized. It gives a relevant local $\phi^2$ 
contribution as well as a marginal $|\nabla\phi|^2$ contribu- 
tion. Finally, the same principle applies to the 
nonlocal $\phi^6$ contribution and gives rise to further 
irrelevant terms. Then it is easy to see by matching 
that the flow equation is satisfied in second order up 
to irrelevant terms (these would have to be compen- 
sated by adding additional terms in $V_t$) provided 

$$\begin{aligned}
\frac{dg_t}{dt} &= (4-d)\,g_t - a\,g_t^2 + O(g_t^3) \\
\frac{d\mu_t}{dt} &= 2\mu_t - b\,g_t^2 + O(g_t^3) \\
\frac{d\xi_t}{dt} &= c\,g_t^2 + O(g_t^3)
\end{aligned} \qquad [46]$$


where $a, b, c$ are positive constants. We see from the 
above formulas that, up to second order in $g$, as 
$t \to \infty$, $g_t \to 0$ for $d \ge 4$. In fact, for $d \ge 5$ the decay 
rate is $O(e^{-t})$ and for $d = 4$ the rate is $O(t^{-1})$. 
However, to see if $V_t$ converges, we also have to 
discuss the $\mu_t, \xi_t$ flows. It is clear that in general the 
$\mu_t$ flow will diverge. This is fixed by choosing the 
initial $\mu_0$ to be the bare critical mass. This is 
obtained by integrating up to time $t$ and then 
expressing $\mu_0$ as a function of the entire $g$ trajectory 
up to time $t$. Assume that $\mu_t$ is uniformly bounded 
and take $t \to \infty$. This gives the critical mass as 

$$\mu_0 = b\int_0^\infty ds\; e^{-2s}\,g_s^2 = \mu_c(g_0) \qquad [47]$$

This integral converges for all cases discussed above. 
With this choice of $\mu_0$ we get 

$$\mu_t = b\int_0^\infty ds\; e^{-2s}\,g_{s+t}^2 \qquad [48]$$

and this exists for all $t$ and converges as $t \to \infty$. 
Now consider the perturbative $\xi$ flow. It is easy to 
see from the above that for $d \ge 4$, $\xi_t$ converges as 
$t \to \infty$. 
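
The role of the critical mass can be illustrated numerically in the marginal case $d = 4$, where the truncated coupling equation $dg/dt = -a g^2$ has the exact solution $g_t = g_0/(1 + a g_0 t)$. In the sketch below the values of $a$, $b$ and $g_0$ are arbitrary illustrative choices, the cubic corrections are dropped, and the function names are hypothetical.

```python
import math

A, B = 1.0, 1.0   # illustrative values for the positive constants a, b
G0 = 0.1          # illustrative initial coupling g_0


def simpson(f, lo, hi, steps=8000):
    """Composite Simpson rule on [lo, hi]; `steps` must be even."""
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3.0


def g(t):
    """Exact solution of dg/dt = -a g^2 (d = 4, cubic terms dropped)."""
    return G0 / (1.0 + A * G0 * t)


def mu_critical_flow(t, s_max=40.0):
    """mu_t on the critical trajectory, formula [48], truncated at s_max."""
    return B * simpson(lambda s: math.exp(-2.0 * s) * g(s + t) ** 2, 0.0, s_max)


def mu_from(mu0, t):
    """Solution of dmu/dt = 2 mu - b g^2 with mu(0) = mu0 (variation of constants)."""
    integral = simpson(lambda s: math.exp(-2.0 * s) * g(s) ** 2, 0.0, t)
    return math.exp(2.0 * t) * (mu0 - B * integral)


if __name__ == "__main__":
    mu_c = mu_critical_flow(0.0)           # bare critical mass [47]
    for t in (0.0, 2.0, 5.0, 10.0):
        print(t, mu_critical_flow(t), mu_from(1.01 * mu_c, t))
```

Starting exactly at the critical mass, the mass parameter stays bounded and decays with $g_t^2$; a one percent detuning of $\mu_0$ is amplified by the factor $e^{2t}$, which is the instability of the relevant direction.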

We have not discussed the $d = 3$ case because the 
perturbative $g$ fixed point is of order $O(1)$. But 
suppose we take, in the $d = 3$ case, $[\phi] = (3-\epsilon)/4$ 
with $\epsilon > 0$ held small as in Brydges et al. (2003). 
Then the above perturbative flow equations are 
easily modified (by taking account of [43]) and we 
get, to second order, an attractive fixed point 
$g_* = O(\epsilon)$ of the $g$ flow. The critical bare mass $\mu_0$ 
can be determined as before and the $\xi$ flow 
converges. The qualitative picture obtained above 
has a rigorous justification. 
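
The origin of the $O(\epsilon)$ fixed point can be seen directly from the linear coefficient $d - m[\phi] - n$ of [43] with $m = 4$, $n = 0$. The sketch below assumes that the second-order coefficient keeps the form $-a g^2$ with some positive constant $a$ in the modified equations:

```latex
\begin{align*}
d - 4[\phi] &= 3 - (3-\epsilon) = \epsilon
  \quad\Longrightarrow\quad
  \frac{dg_t}{dt} = \epsilon\,g_t - a\,g_t^2 + O(g_t^3), \\
g_* &= \frac{\epsilon}{a} + O(\epsilon^2), \qquad
\frac{d}{dt}\,(g_t - g_*) \approx (\epsilon - 2a\,g_*)\,(g_t - g_*) = -\,\epsilon\,(g_t - g_*) .
\end{align*}
```

The negative linearized rate $-\epsilon$ is what makes the fixed point attractive in the $g$ direction.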


Rigorous RG Analysis 


We will give a brief introduction to rigorous RG 
analysis in the discrete setup in the section “The RG 
as a Discrete Semigroup,” concentrating on the 
principal problems encountered and how one 
attempts to solve them. Our approach is borrowed 
from Brydges et al. (2003). It is a simplification 
of the methods initiated by Brydges and Yau 
(1990) and developed further by Brydges et al. 
(1998). The reader will find other approaches to 
rigorous RG methods in the selected references, such 
as those of Balaban, Gawedzki and Kupiainen, 
Gallavotti, and others. We will take as a concrete 
example the scalar field model introduced earlier. 

At the core of the analysis is the choice of good 
coordinates for the partition function density, z, of 
the section “The RG as a Discrete Semigroup”. This 
is provided by a polymer representation (defined 
below) which parametrizes z by a couple (V,K), 
where V is a local potential and K is a set function 
also depending on the fields. Then the RG transfor- 
mation $T_L$ maps $(V, K)$ to a new $(V, K)$. $(V, K)$ 
remain good coordinates as the volume tends to $\infty$, 
whereas $z(\text{volume})$ diverges. There exist norms 
which are suited to the fixed-point analysis of the 
map from $(V, K)$ to the new $(V, K)$. Now comes the 
important point: $z$ 
does not uniquely specify the representation $(V, K)$. 
Therefore, we can take advantage of this nonunique- 
ness to keep $K$ small in norm and let most of the 
action of $T_L$ reside in $V$. This process is called 
extraction in Brydges et al. (2003). It makes sure that 
K is an irrelevant term, whereas the local flow of V 
gives rise to discrete flow equations in coupling 
constants. We will not discuss extraction any further. 
In the following, we introduce the polymer represen- 
tation and explain how the RG transformation acts 
on it. 
To proceed further, we first introduce a simplifi- 
cation in the setup used in the section “The RG as a 
Discrete Semigroup.” Recall that the function $u$ 
introduced in [3] was smooth, positive-definite, and 
of rapid decrease. We will simplify further by 
imposing the stronger property that it is actually of 
finite range: $u(x) = 0$ for $|x| \ge 1$. We say that $u$ is of 
finite range 1. It is easy to construct such functions. 
For example, if $g$ is any smooth function of finite 
range 1/2, then $u = g * g$ is a smooth positive-definite 
function of finite range 1. This implies that the 
fluctuation covariance $\Gamma_L$ of [7] has finite range $L$. 
As a result, $\Gamma_n$ in [10] has finite range $L^{n+1}$ and the 
corresponding fluctuation fields $\zeta_n(x)$ and $\zeta_n(y)$ are 
independent when $|x-y| \ge L^{n+1}$. 
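
The construction $u = g * g$ can be tried out on a one-dimensional grid. The sketch below is a discrete analogue only: the grid resolution, the particular bump function and all names are assumptions made for illustration, and $g$ is taken to be even so that the convolution coincides with the autocorrelation.

```python
import math

N = 20          # grid points per unit length (assumed resolution)
H = 1.0 / N     # grid spacing


def bump(k):
    """Smooth even bump g at x = k*H, vanishing for |x| >= 1/2."""
    if 2 * abs(k) >= N:
        return 0.0
    x = k * H
    return math.exp(-1.0 / (1.0 - 4.0 * x * x))


def u(k):
    """Discrete convolution (g * g) at x = k*H."""
    return H * sum(bump(j) * bump(k - j) for j in range(-N, N + 1))


def quadratic_form(coeffs, points):
    """sum_{i,j} c_i c_j u(k_i - k_j); non-negative for a positive-definite u."""
    return sum(ci * cj * u(ki - kj)
               for ci, ki in zip(coeffs, points)
               for cj, kj in zip(coeffs, points))


if __name__ == "__main__":
    print("u at x = 0, 0.5, 1.0:", u(0), u(N // 2), u(N))
    pts = list(range(-5, 6))
    print("quadratic form:", quadratic_form([(-1) ** k for k in pts], pts))
```

The finite range follows from the supports: a product $g(y)\,g(x-y)$ can be nonzero only if $|y| < 1/2$ and $|x-y| < 1/2$, hence only if $|x| < 1$. Positive-definiteness follows because the quadratic form is a sum of squares, $H\sum_y \bigl(\sum_i c_i\, g(k_i - y)\bigr)^2$.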


Polymer Representation 


Pave $\mathbb{R}^d$ with closed cubes of side length 1 called 
1-blocks or unit blocks denoted by $\Delta$, and suppose 
that $\Lambda$ is a large cube consisting of unit blocks. A 
connected polymer $X \subset \Lambda$ is a closed connected 
subset of these unit blocks. A polymer 
activity $K(X,\phi)$ is a map $X, \phi \to \mathbb{R}$ where the fields 
$\phi$ depend only on the points of $X$. We will set 
$K(X,\phi) = 0$ if $X$ is not connected. A generic form of 
the partition function density $z(\Lambda,\phi)$ after a certain 
number of RG iterations is 

$$z(\Lambda) = \sum_{N=0}^{\infty}\frac{1}{N!}\sum_{X_1,\dots,X_N} e^{-V(X_c)}\prod_{j=1}^{N}K(X_j) \qquad [49]$$

Here $X_j \subset \Lambda$ are disjoint polymers, $X = \bigcup X_j$, and 
$X_c = \Lambda\setminus X$. $V$ is a local potential of the form [23] with 
parameters $\xi, g, \mu$. We have suppressed the $\phi$-depen- 
dence. Initially, the activities $K_j = 0$, but they will arise 
under RG iterations and the form [49] remains stable, 
as we will see. The partition function density is thus 
parametrized as a couple $(V, K)$. 


Norms for Polymer Activities 


Polymer activities $K(X,\phi)$ are endowed with a norm 
$\|K(X)\|$, which must satisfy two properties: 

$$\begin{aligned}
\mathring X\cap\mathring Y = \emptyset \;&\Rightarrow\; \|K_1(X)K_2(Y)\| \le \|K_1(X)\|\,\|K_2(Y)\| \\
\|T_L K(X)\| &\le c^{|X|}\,\|K(X)\|
\end{aligned} \qquad [50]$$

where $\mathring X$ is the interior of $X$ and $|X|$ is the number 
of blocks in $X$. $c$ is a constant of order $O(1)$. The 
norm measures (Fréchet) differentiability proper- 
ties of the activity $K(X,\phi)$ with respect to the field 
$\phi$ as well as its admissible growth in $\phi$. The 
growth is admissible if it is $\mu_C$ integrable. The 
second property above ensures the stability of 
the norm under RG iteration. For a fixed polymer 
$X$, the norm is such that it gives rise to a Banach 
space of activities $K(X)$. The final norm $\|\cdot\|_{\mathcal{A}}$ 
incorporates the previous one and washes out the 
set dependence, 

$$\|K\|_{\mathcal{A}} = \sup_{\Delta}\sum_{X\supset\Delta}\mathcal{A}(X)\,\|K(X)\| \qquad [51]$$

where $\mathcal{A}(X) = L^{(d+2)|X|}$. This norm essentially ensures 
that large polymers have small activities. The details 
of the above norms can be found in Brydges et al. 
(2003). 

The RG operation map $f$ is a composition of two 
maps. The RG iteration map $z \to T_L z$ induces a map 
$V \to \tilde V_L$, and a nonlinear map $\tilde T_L: K \to \tilde K = \tilde T_L(K)$. 
We then compose this with a (nonlinear) extraction 
map $\mathcal{E}$ which takes out the expanding (relevant) 
parts of $\tilde K \to \mathcal{E}(\tilde K) = K'$ and compensates the local 
potential $\tilde V_L \to V'$ such that $T_L z$ remains invariant. 
We denote by $f$ the composition of these two maps 
with 

$$V \to V' = f_V(V, K), \qquad K \to K' = f_K(V, K) \qquad [52]$$


The Map $\tilde T_L$ 


Consider applying the RG map $T_L$ to [49]. The map 
consists of a convolution $\mu_{\Gamma_L} *$ followed by the 
rescaling $S_L$. In the integration over the fluctuation 
field $\zeta$, we will exploit the independence of $\zeta(x)$ and 
$\zeta(y)$ when $|x-y| \ge L$. To do this, we pave $\Lambda$ by 
closed blocks of side $L$, called $L$-blocks, so that 
each $L$-block is a union of 1-blocks. Let $\bar X^L$ be 
the $L$-closure of a set $X$, namely the smallest union 
of $L$-blocks containing $X$. The polymers will be 
combined into $L$-polymers which are, by definition, 
connected unions of $L$-blocks. The combination is 
performed in such a way that the new polymers are 
associated to independent functionals of $\zeta$. 

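Let $\tilde V(X, \phi)$, to be chosen later, be a local
potential independent of $\zeta$. For a coupling constant
sufficiently small, there is a bound

\[ \|e^{-V(Y)}\| \le 2^{|Y|} \qquad [53] \]

We assume that $\tilde V$ is so chosen that the same bound
holds when $V$ is replaced by $\tilde V$. Define

\[ P(\Delta, \zeta, \phi) = e^{-V(\Delta,\, \zeta+\phi)} - e^{-\tilde V(\Delta,\, \phi)} \qquad [54] \]

Then we have

\[ e^{-V(X_c,\, \zeta+\phi)} = \prod_{\Delta \subset \bar X_c} \left( e^{-\tilde V(\Delta,\, \phi)} + P(\Delta, \zeta, \phi) \right) \qquad [55] \]

where $\bar X_c$ is the closure of $X_c$. Expand out the
product and insert into the representation [49] for
$z(\Lambda, \zeta+\phi)$. We then rewrite the resulting sum in
terms of L-polymers. The sum splits into a sum over
connected components. Define, for every connected
L-polymer $Y$,

\[ \mathcal{B}K(Y) = \sum_{N+M \ge 1} \frac{1}{N!\, M!} \sum_{(X_j),(\Delta_i) \to Y} e^{-\tilde V(X_0)} \prod_{j=1}^{N} K(X_j) \prod_{i=1}^{M} P(\Delta_i) \qquad [56] \]

where $X_0 = Y \setminus \big( (\cup_j X_j) \cup (\cup_i \Delta_i) \big)$ and the sum over the
distinct $\Delta_i$ and disjoint 1-polymers $X_j$ is such that
their L-closure is $Y$. Equation [49] now becomes

\[ z(\Lambda, \zeta+\phi) = \sum_{N} \frac{1}{N!} \sum_{Y_1,\dots,Y_N} e^{-\tilde V(Y_c)} \prod_{j=1}^{N} \mathcal{B}K(Y_j) \qquad [57] \]

where the sum is over disjoint, connected closed
L-polymers. We now perform the fluctuation integration
over $\zeta$ followed by the rescaling. Now $\tilde V(Y_c)$
is independent of $\zeta$. The $\zeta$-integration sails through
and then factorizes because the $Y_j$, being disjoint
closed L-polymers, are separated from each other by
a distance $\ge L$. The rescaling brings us back to
1-polymers and reduces the volume from $\Lambda$ to $L^{-1}\Lambda$.
Therefore,

\[ z'(L^{-1}\Lambda) = (T_L z)(L^{-1}\Lambda) = \sum_{N} \frac{1}{N!} \sum_{X_1,\dots,X_N} e^{-\tilde V_L(X_c)} \prod_{j=1}^{N} (T_L \mathcal{B}K)(X_j) \qquad [58] \]

where the sum is over disjoint 1-polymers,
$X_c = L^{-1}\Lambda \setminus X$. By definition $\tilde V_L(\Delta) = S_L \tilde V(L\Delta)$ and
$(T_L \mathcal{B}K)(Z) = S_L\, \mu_{\Gamma_L} * \mathcal{B}K(LZ)$. This shows that the
representation [49] is stable under iteration and,
furthermore, gives us the map

\[ V \to \tilde V_L, \qquad K \to \tilde K = \mathcal{T}_L(K) = T_L \mathcal{B}K \qquad [59] \]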

The norm boundedness of $K$ implies that $\mathcal{T}_L(K)$ is
norm bounded. We see from the above that a
variation in the choice of $\tilde V$ is reflected in the
corresponding variation of $\tilde K$. The extraction map $\mathcal{E}$
now takes out from $\tilde K$ the expanding parts and then
compensates it by a change of $\tilde V_L$ in such a way that
the representation [58] is left invariant by the
simultaneous replacement $\tilde V_L \to V'$, $\tilde K \to K' = \mathcal{E}(\tilde K)$.
The extraction map is nonlinear. Its linearization is a
subtraction operation and this dominates in norm the
nonlinearities (Brydges et al. 1998).

The map $V \to \tilde V_L \to V'$ leads to a discrete flow of
the coupling constants in $V$. It is convenient to write
$K = K^{\mathrm{pert}} + R$, where $R$ is the remainder. Then the
coupling constant flow is a discrete version of the
continuous flows encountered in the last section,
together with remainders which are controlled by the
size of $R$. In addition, we have the flow of $K$. The
discrete flow of the pair (coupling constants, $K$) can be
studied in a Banach space norm. Once one proves that
the nonlinear parts satisfy a Lipschitz property, the
discrete flow can be analyzed by the methods of
stable-manifold theory of dynamical systems in a Banach
space context. The reader is referred to the article by
Brydges et al. (2003) for details of the extraction map
and the application of stable-manifold theory in the
construction of a global RG trajectory.


Further Topics 
Lattice RG Methods 


Statistical mechanical systems are often defined on a 
lattice. Moreover, the lattice provides an ultraviolet 
cutoff for Euclidean field theory compatible with 
Osterwalder-Schrader positivity. The standard lat- 
tice RG is based on Kadanoff-Wilson block spins. 
Its mathematical theory and applications have 
been developed by Balaban, and Gawedzki and 
Kupiainen (see Gawedzki and Kupiainen (1986) and 
references therein). This leads to multiscale decom- 
positions of the Gaussian lattice field as a sum of 
independent fluctuation fields on increasing length 
scales. Brydges et al. (2004) have shown that 
standard Gaussian lattice fields have multiscale 
decompositions as a sum of independent fluctuation 
fields with the finite-range property introduced in 
the last section. This permits the development of 
rigorous lattice RG theory in the spirit of the 
continuum framework of the previous section. 


Fermionic Field Theories 


Field theories of interacting fermions are often 
simpler to handle than bosonic field theories. Because 
of statistics, fermion fields are bounded and the perturbation
series converges in finite volume in the
presence of an ultraviolet cutoff. The notion of 
studying the RG flow at the level of effective 
potentials makes sense. At any given scale, there is 
always an ultraviolet cutoff and the fluctuation 
covariance being of fast decrease provides an infrared 
cutoff. This is illustrated by the work of Gawedzki 
and Kupiainen (1985), who gave a nonperturbative 
construction in the weak effective coupling regime of 
the RG trajectory for the Gross-Neveu model in two 
dimensions. This is an example of a model with an 
unstable Gaussian fixed point where the initial 
coupling has to be adjusted as a function of the
ultraviolet cutoff consistent with ultraviolet asymptotic
freedom so as to stabilize the flow.


A Brief History


The "Falicov-Kimball model" was first considered by
Hubbard and Gutzwiller during 1963-65 as a simpli- 
fication of the Hubbard model. In 1969, Falicov and 
Kimball introduced a model that included a few extra 
complications, in order to investigate metal-insulator 
phase transitions in rare-earth materials and transition- 
metal compounds (Falicov and Kimball 1969). Experi- 
mental data suggested that this transition is due to the 
interactions between electrons in two electronic states: 
nonlocalized states (itinerant electrons), and states that 
are localized around the sites corresponding to the 
metallic ions of the crystal (static electrons). 

A tight-binding approximation leads to a model 
defined on a lattice (the crystal) and two species of 
particles are considered. The first species consists of 
spinless quantum fermions (we refer to them as
"electrons"), and the second species consists of localized
holes or electrons ("classical particles"). Electrons hop
between nearest-neighbor sites but classical particles do
not. Both species obey Fermi statistics (in particular, the
Pauli exclusion principle prevents more than one
particle of a given species from occupying the same site).
Interactions are on-site and thus involve particles of 
different species; they can be repulsive or attractive. 

The very simplicity of the model allows for a 
broad range of applications. It was studied in the 
context of mixed valence systems, binary alloys, and 
crystal formation. Adding a magnetic field yields the 
flux phase problem. The Falicov-Kimball model can 
also be viewed as the simplest model where 
quantum particles interact with classical fields. 

The fifteen years following the introduction of the
model saw studies based on approximate methods,
such as Green's function techniques, that gave rise to
a lot of confusion. A breakthrough occurred in 1986
when Brandt and Schmidt, and Kennedy and Lieb,
obtained the first rigorous results. In particular,
Kennedy and Lieb showed in their beautiful paper
that the electrons create an effective interaction
between the classical particles and that a phase
transition takes place for any value of the coupling
constant, provided the temperature is low enough.
Many studies by mathematical physicists followed
and several results are presented in this
short survey. Recent years have seen an increasing
interest from condensed matter physicists. We
encourage interested readers to consult the reviews
by Freericks and Zlatić (2003), Gruber and Macris
(1996), and Jędrzejewski and Lemański (2001).


Mathematical Setting 
Definitions 


Let $\Lambda \subset \mathbb{Z}^d$ denote a finite cubic box. The configuration
space for the classical particles is

\[ \Omega_\Lambda = \{0,1\}^\Lambda = \{\omega = (\omega_x) : x \in \Lambda, \text{ and } \omega_x = 0, 1\} \]

where $\omega_x = 0$ or 1 denotes the absence or presence of
a classical particle at the site $x$. The total number of
classical particles is $N_c(\omega) = \sum_{x \in \Lambda} \omega_x$. The Hilbert
space for the spinless quantum particles ("electrons")
is the usual fermionic Fock space

\[ \mathcal{F}_\Lambda = \bigoplus_{N=0}^{|\Lambda|} \mathcal{H}_{\Lambda,N} \]

where $\mathcal{H}_{\Lambda,N}$ is the Hilbert space of square summable,
antisymmetric, complex functions $\Psi = \Psi(x_1,\dots,x_N)$
of $N$ variables $x_i \in \Lambda$. Let $a_x^\dagger$ and $a_x$ denote the
standard creation and annihilation operators of an
electron at $x$; recall that they satisfy the anticommutation
relations

\[ \{a_x, a_y\} = 0, \qquad \{a_x^\dagger, a_y^\dagger\} = 0, \qquad \{a_x, a_y^\dagger\} = \delta_{xy} \]

The Hamiltonian for the Falicov-Kimball model is an
operator on $\mathcal{F}_\Lambda$ that depends on the configurations of
classical particles. Namely, for $\omega \in \Omega_\Lambda$, we define

\[ H_\Lambda(\omega) = -\sum_{\substack{x,y \in \Lambda \\ |x-y|=1}} a_x^\dagger a_y - U \sum_{x \in \Lambda} \omega_x a_x^\dagger a_x \]

The first term represents the kinetic energy of the
electrons. The second term represents the on-site
attraction ($U > 0$) or repulsion ($U < 0$) between
electrons and classical particles.

The Falicov-Kimball Hamiltonian can be written
with the help of a one-body Hamiltonian $h_\Lambda$, which
is an operator on the Hilbert space for a single
electron $\ell^2(\Lambda)$. Indeed, we have

\[ H_\Lambda(\omega) = \sum_{x,y \in \Lambda} h_{xy}(\omega)\, a_x^\dagger a_y \]

The matrix $h_\Lambda(\omega) = (h_{xy}(\omega))$ is the sum of a hopping
matrix (adjacency matrix) $t_\Lambda$, and of a matrix $v_\Lambda(\omega)$
that represents an external potential due to the
classical particles. Namely, we have

\[ h_{xy}(\omega) = -t_{xy} - U \omega_x \delta_{xy} \]

where $t_{xy}$ is one if $x$ and $y$ are nearest neighbors, and is
zero otherwise. The spectrum of $t_\Lambda$ lies in $(-2d, 2d)$,
and the eigenvalues of $v_\Lambda(\omega)$ are $-U$ (with degeneracy
$N_c(\omega)$) and 0 (with degeneracy $|\Lambda| - N_c(\omega)$). Denoting
$\lambda_j(A)$ the eigenvalues of a matrix $A$, it follows from the
minimax principle that

\[ \lambda_j(A) - \|B\| \le \lambda_j(A + B) \le \lambda_j(A) + \|B\| \]

Let $\lambda_1(\omega) \le \lambda_2(\omega) \le \dots \le \lambda_{|\Lambda|}(\omega)$ be the eigenvalues
of $h_\Lambda(\omega)$. Choosing $A = v_\Lambda(\omega)$ and $B = t_\Lambda$ in the
inequality above, we find that for $U > 0$,

\[ -U - 2d < \lambda_j(\omega) < -U + 2d \quad \text{for } j = 1, \dots, N_c(\omega) \]
\[ -2d < \lambda_j(\omega) < 2d \quad \text{for } j = N_c(\omega) + 1, \dots, |\Lambda| \]

In particular, for any configuration $\omega$ and any $\Lambda$,

\[ \mathrm{Spec}\, h_\Lambda(\omega) \subset (-U - 2d, -U + 2d) \cup (-2d, 2d) \]

Thus, for $U > 4d$, the spectrum of $h_\Lambda(\omega)$ has the
"universal" gap $(-U + 2d, -2d)$. A similar property
holds for $U < -4d$.
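The band structure above can be checked numerically. The following is a minimal sketch, assuming an open (non-periodic) box and NumPy; the helper names `one_body_hamiltonian` and `bands_hold` are illustrative and do not come from the article. It builds the matrix $h_\Lambda(\omega) = -t_\Lambda - U\,\mathrm{diag}(\omega)$ and verifies that the lowest $N_c(\omega)$ eigenvalues lie in $(-U-2d, -U+2d)$ and the others in $(-2d, 2d)$.

```python
import numpy as np

def one_body_hamiltonian(omega, U):
    """Return h = -t - U*diag(omega) on an open box.

    omega: integer array of 0/1 occupation numbers with shape (L1, ..., Ld).
    """
    omega = np.asarray(omega)
    n = omega.size
    idx = np.arange(n).reshape(omega.shape)
    h = np.zeros((n, n))
    for axis in range(omega.ndim):
        m = omega.shape[axis]
        # pair every site with its neighbor one step further along `axis`
        a = np.take(idx, np.arange(m - 1), axis=axis).ravel()
        b = np.take(idx, np.arange(1, m), axis=axis).ravel()
        h[a, b] = -1.0
        h[b, a] = -1.0
    h[np.arange(n), np.arange(n)] = -U * omega.ravel()
    return h

def bands_hold(omega, U):
    """Check the two spectral bands stated in the text (for U > 0)."""
    omega = np.asarray(omega)
    d = omega.ndim
    lam = np.linalg.eigvalsh(one_body_hamiltonian(omega, U))  # ascending
    n_c = int(omega.sum())
    lower, upper = lam[:n_c], lam[n_c:]
    in_lower = np.all((lower > -U - 2 * d) & (lower < -U + 2 * d))
    in_upper = np.all((upper > -2 * d) & (upper < 2 * d))
    return bool(in_lower and in_upper)

rng = np.random.default_rng(0)
omega_2d = rng.integers(0, 2, size=(6, 6))     # random configuration, d = 2
ok_2d = bands_hold(omega_2d, U=9.0)            # U > 4d = 8, so the bands are disjoint
ok_1d = bands_hold(np.array([1, 0, 0, 1, 1, 0, 1]), U=5.0)
```

The bands are disjoint only when $U > 4d$; for smaller $U$ the same check still passes, but the two intervals overlap and there is no universal gap.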


Canonical Ensemble 


A fruitful approach towards understanding the
behavior of the Falicov-Kimball model is to first
fix the configuration of the classical particles, and
then to introduce the ground-state energy $E_\Lambda(N_e, \omega)$
as the lowest eigenvalue of $H_\Lambda(\omega)$ in the subspace
$\mathcal{H}_{\Lambda,N_e}$:

\[ E_\Lambda(N_e, \omega) = \inf_{\Psi \in \mathcal{H}_{\Lambda,N_e},\ \|\Psi\| = 1} \langle \Psi | H_\Lambda(\omega) | \Psi \rangle = \sum_{j=1}^{N_e} \lambda_j(\omega) \]

A typical problem is to find the set of ground-state
configurations, that is, the set of configurations
that minimize $E_\Lambda(N_e, \omega)$ for given $N_e$ and
$N_c = N_c(\omega)$.

In the case $U > 4d$ and $N_e = N_c(\omega)$, the ground-state
energy $E_\Lambda(N_c(\omega), \omega)$ has a convergent expansion
in powers of $U^{-1}$:

\[ E_\Lambda(N_c(\omega), \omega) = -U N_c(\omega) + \sum_{k \ge 2} \frac{1}{k\, U^{k-1}} \sum_{\substack{x_1,\dots,x_k \in \Lambda \\ |x_i - x_{i-1}| = 1 \\ 0 < m(x_1,\dots,x_k) < k}} (-1)^{m(x_1,\dots,x_k)} \binom{k-2}{m(x_1,\dots,x_k) - 1} \qquad [1] \]

where $m(x_1,\dots,x_k)$ is the number of sites $x_i$ with
$\omega_{x_i} = 0$. The last sum also includes the condition
$|x_k - x_1| = 1$. Simple estimates show that the series is
less than $(2d/(U - 4d))\, N_c(\omega)$. The lowest-order term
is a nearest-neighbor interaction,

\[ -\frac{1}{U} \sum_{\{x,y\} : |x-y| = 1} \delta_{\omega_x, 1 - \omega_y} \]

that favors pairs with different occupation numbers.
Formula [1] is the starting point for most studies of
the phase diagram for large $U$. A similar expansion
holds for $U < -4d$ and $N_e = |\Lambda| - N_c(\omega)$.

A simple derivation of expansion [1] using Cauchy
formula can be found in Gruber and Macris (1996). It
can be extended to positive temperatures with the
help of Lie-Schwinger series (Datta et al. 1999).
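The leading terms of the large-$U$ expansion can be compared with exact diagonalization. The sketch below assumes an open one-dimensional chain ($d = 1$) and $N_e = N_c(\omega)$; the function names are illustrative. It compares the exact energy with $-U N_c(\omega) - U^{-1} \cdot \#\{\text{nearest-neighbor pairs with different occupations}\}$. On a chain there are no closed nearest-neighbor loops of odd length, so the $k = 3$ term of the expansion vanishes and the discrepancy is expected to shrink like $U^{-3}$.

```python
import numpy as np

def chain_hamiltonian(omega, U):
    """One-body matrix h = -t - U*diag(omega) on an open chain."""
    omega = np.asarray(omega, dtype=float)
    n = len(omega)
    h = np.diag(-U * omega)
    for x in range(n - 1):
        h[x, x + 1] = h[x + 1, x] = -1.0
    return h

def exact_energy(omega, U):
    """Sum of the N_c(omega) lowest eigenvalues (N_e = N_c)."""
    lam = np.linalg.eigvalsh(chain_hamiltonian(omega, U))
    return float(np.sum(lam[: int(np.sum(omega))]))

def second_order_energy(omega, U):
    """-U N_c minus (1/U) times the number of unlike nearest-neighbor pairs."""
    omega = np.asarray(omega)
    unlike_pairs = int(np.sum(omega[:-1] != omega[1:]))
    return -U * float(np.sum(omega)) - unlike_pairs / U

omega = np.array([1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0])
errors = {U: abs(exact_energy(omega, U) - second_order_energy(omega, U))
          for U in (10.0, 20.0, 40.0)}
```

Doubling $U$ should reduce the error by roughly a factor of eight, which is a quick consistency check on the order of the first neglected term.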

Phase diagrams are better discussed in the limit of
infinite volumes where boundary effects can be
discarded. Let $\Omega^{\rm per}$ be the set of configurations on $\mathbb{Z}^d$
that are periodic in all $d$ directions, and $\Omega^{\rm per}(\rho_c) \subset \Omega^{\rm per}$
be the set of periodic configurations with density $\rho_c$.
For $\omega \in \Omega^{\rm per}$ and $\rho_e \in [0, 1]$, we introduce the energy
per site in the infinite volume limit by

\[ e(\rho_e, \omega) = \lim_{\Lambda \nearrow \mathbb{Z}^d} \frac{1}{|\Lambda|} E_\Lambda(N_e, \omega) \qquad [2] \]

Here, the limit is taken over any sequence of
increasing cubes, and $N_e = \lfloor \rho_e |\Lambda| \rfloor$ is the integer
part of $\rho_e |\Lambda|$. Existence of this limit follows from
standard arguments.

In the case of the empty configuration $\omega_x \equiv 0$,
we get the well-known energy per site of free lattice
electrons: for $k \in [-\pi, \pi]^d$, let $\varepsilon(k) = -2 \sum_{i=1}^{d} \cos k_i$;
then

\[ e(\rho_e, \omega \equiv 0) = \frac{1}{(2\pi)^d} \int_{\varepsilon(k) < \varepsilon_F(\rho_e)} \varepsilon(k)\, dk \]

where $\varepsilon_F(\rho_e)$ is the Fermi energy, defined by

\[ \rho_e = \frac{1}{(2\pi)^d} \int_{\varepsilon(k) < \varepsilon_F(\rho_e)} dk \]

The other simple situation is the full configuration
$\omega_x \equiv 1$, whose energy is $e(\rho_e, \omega \equiv 1) =
e(\rho_e, \omega \equiv 0) - U \rho_e$.
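For $d = 1$ the two integrals above can be evaluated in closed form: the Fermi sea is $|k| < \pi \rho_e$, which gives $e(\rho_e, \omega \equiv 0) = -(2/\pi) \sin \pi \rho_e$. The sketch below is an illustration under assumptions not made in the text (a periodic ring of `n_sites` sites, whose levels are $-2\cos(2\pi m / n)$); the function names are hypothetical. It compares the finite-ring energy per site with this closed form.

```python
import numpy as np

def ring_energy_density(rho_e, n_sites=4000):
    """Energy per site of free electrons on a periodic ring (empty configuration)."""
    k = 2 * np.pi * np.arange(n_sites) / n_sites
    levels = np.sort(-2.0 * np.cos(k))          # one-particle levels, ascending
    n_e = int(np.floor(rho_e * n_sites))        # N_e = integer part of rho_e * |Lambda|
    return float(np.sum(levels[:n_e]) / n_sites)

def closed_form_1d(rho_e):
    """Infinite-volume value -(2/pi) sin(pi rho_e) for d = 1."""
    return -2.0 / np.pi * np.sin(np.pi * rho_e)

def full_configuration_1d(rho_e, U):
    """Energy density of the full configuration: shift by -U rho_e."""
    return closed_form_1d(rho_e) - U * rho_e

densities = (0.25, 0.5, 0.8)
gaps_to_limit = [abs(ring_energy_density(r) - closed_form_1d(r)) for r in densities]
```

The finite-size deviation is of order $1/|\Lambda|$, so a few thousand sites already reproduce the limit to three decimal places.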

Let $e(\rho_e, \rho_c)$ denote the absolute ground-state
energy density, namely,

\[ e(\rho_e, \rho_c) = \inf_{\omega \in \Omega^{\rm per}(\rho_c)} e(\rho_e, \omega) \]

Notice that $e(\rho_e, \omega)$ is convex in $\rho_e$, and that $e(\rho_e, \rho_c)$
is the convex envelope of $\{e(\rho_e, \omega) : \omega \in \Omega^{\rm per}(\rho_c)\}$. It
may be locally linear around some $(\rho_e, \rho_c)$. This is
the case if the infimum is not realized by a periodic
configuration. The nonperiodic ground states can be
expressed as linear combinations of two or more
periodic ground states ("mixtures"). That is, for
$1 \le i \le n$ there are $\alpha_i \ge 0$ with $\sum_i \alpha_i = 1$, $\omega^{(i)} \in \Omega^{\rm per}$,
and $\rho_e^{(i)}$ such that

\[ \rho_e = \sum_i \alpha_i \rho_e^{(i)}, \qquad \rho_c = \sum_i \alpha_i \rho_c(\omega^{(i)}) \]

and

\[ e(\rho_e, \rho_c) = \sum_i \alpha_i\, e(\rho_e^{(i)}, \omega^{(i)}) \]

The simplest mixture is the "segregated state" for
densities $\rho_e < \rho_c$: take $\omega^{(1)}$ to be the empty
configuration, $\omega^{(2)}$ to be the full configuration,
$\rho_e^{(1)} = 0$, $\rho_e^{(2)} = \rho_e / \rho_c$, and $\alpha_2 = 1 - \alpha_1 = \rho_c$.

If $d \ge 2$, a mixture between configurations $\omega^{(i)}$
can be realized as follows. First, partition $\mathbb{Z}^d$ into
domains $D_1 \cup \dots \cup D_n$ such that $|D_i|/|\Lambda| \to \alpha_i$ and
$|\partial D_i|/|\Lambda| \to 0$ as $\Lambda \nearrow \mathbb{Z}^d$. Then, define a nonperiodic
configuration $\omega$ by setting $\omega_x = \omega_x^{(i)}$ for $x \in D_i$ (see the
illustration in Figure 1). The canonical energy can be
computed from [2], and it is equal to

\[ e(\rho_e, \omega) = \inf_{(\rho_e^{(i)}) :\ \sum_i \alpha_i \rho_e^{(i)} = \rho_e}\ \sum_{i=1}^{n} \alpha_i\, e(\rho_e^{(i)}, \omega^{(i)}) \]

Furthermore, the infimum is realized by densities $\rho_e^{(i)}$
such that there exists $\mu_e$ with $\rho_e(\mu_e, \omega^{(i)}) = \rho_e^{(i)}$ for all
$i$ (see [4] below for the definition of $\rho_e(\mu_e, \omega)$).

Figure 1 A two-dimensional mixed configuration formed by
periodic configurations of densities 0, 1/5, and 1/2.

We define the canonical ground-state phase
diagram as the set of ground states $\omega$ (either a
periodic configuration or a mixture) that minimize
the ground-state energy for given densities $\rho_e$, $\rho_c$:

\[ G_{\rm can}(\rho_e, \rho_c) = \{\omega : e(\rho_e, \omega) = e(\rho_e, \rho_c) \text{ and } \rho_c(\omega) = \rho_c\} \]


Grand-Canonical Ensemble 


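Properties of the system at finite temperatures
are usually investigated within the grand-canonical
formalism. The equilibrium state is characterized by
an inverse temperature $\beta = 1/k_B T$, and by chemical
potentials $\mu_e$, $\mu_c$, for the electrons and for the
classical particles, respectively. In this formalism,
the thermodynamic properties are derived from the
partition functions

\[ Z_\Lambda(\beta, \mu_e, \omega) = \mathrm{Tr}_{\mathcal{F}_\Lambda}\, e^{-\beta [H_\Lambda(\omega) - \mu_e N_\Lambda]} \]
\[ Z_\Lambda(\beta, \mu_e, \mu_c) = \sum_{\omega \in \Omega_\Lambda} e^{\beta \mu_c N_c(\omega)}\, Z_\Lambda(\beta, \mu_e, \omega) \qquad [3] \]

Here, $N_\Lambda = \sum_{x \in \Lambda} a_x^\dagger a_x$ is the operator for the total
number of electrons. We then define the free energy by

\[ F_\Lambda(\beta, \mu_e, \mu_c) = -\frac{1}{\beta} \log Z_\Lambda(\beta, \mu_e, \mu_c) \]

The first partition function in [3] allows us to
introduce an effective interaction for the classical
particles, mediated by the electrons, by

\[ F_\Lambda(\beta, \mu_e, \mu_c, \omega) = -\mu_c N_c(\omega) - \frac{1}{\beta} \log Z_\Lambda(\beta, \mu_e, \omega) \]

It depends on the inverse temperature $\beta$. Taking the
limit of zero temperature gives the corresponding
ground-state energy of the electrons in the classical
configuration $\omega$:

\[ E_\Lambda(\mu_e, \mu_c, \omega) = \lim_{\beta \to \infty} F_\Lambda(\beta, \mu_e, \mu_c, \omega) = -\mu_c N_c(\omega) + \sum_{j : \lambda_j(\omega) < \mu_e} (\lambda_j(\omega) - \mu_e) \]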
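The zero-temperature limit can be illustrated numerically. The sketch below relies on a standard identity that is not spelled out in the text: for a Hamiltonian quadratic in the fermion operators, $\mathrm{Tr}\, e^{-\beta[H_\Lambda(\omega) - \mu_e N_\Lambda]} = \prod_j (1 + e^{-\beta(\lambda_j(\omega) - \mu_e)})$. The open chain and the function names are assumptions made for the illustration.

```python
import numpy as np

def chain_levels(omega, U):
    """Eigenvalues lambda_j(omega) of h = -t - U*diag(omega) on an open chain."""
    omega = np.asarray(omega, dtype=float)
    n = len(omega)
    h = np.diag(-U * omega)
    for x in range(n - 1):
        h[x, x + 1] = h[x + 1, x] = -1.0
    return np.linalg.eigvalsh(h)

def effective_free_energy(lam, n_c, beta, mu_e, mu_c):
    """F = -mu_c N_c - (1/beta) log Z, using log Z = sum_j log(1 + exp(-beta (lam_j - mu_e)))."""
    log_z = np.sum(np.logaddexp(0.0, -beta * (lam - mu_e)))  # numerically stable
    return float(-mu_c * n_c - log_z / beta)

def ground_state_energy(lam, n_c, mu_e, mu_c):
    """E = -mu_c N_c + sum over levels below mu_e of (lam_j - mu_e)."""
    filled = lam[lam < mu_e]
    return float(-mu_c * n_c + np.sum(filled - mu_e))

omega = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1])
U, mu_e, mu_c = 10.0, -5.0, -4.0          # mu_e sits in the universal gap (-8, -2)
lam = chain_levels(omega, U)
n_c = int(omega.sum())
E = ground_state_energy(lam, n_c, mu_e, mu_c)
F_values = [effective_free_energy(lam, n_c, beta, mu_e, mu_c) for beta in (1.0, 10.0, 200.0)]
```

Since $\log(1 + e^{-x}) \ge \max(0, -x)$, the free energy lies below $E_\Lambda$ at every temperature and approaches it from below as $\beta \to \infty$.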


Notice that $F_\Lambda$ and $E_\Lambda$ are strictly decreasing and
concave in $\mu_e$, $\mu_c$ ($E_\Lambda$ is actually linear in $\mu_c$). We
also define the energy density in the infinite volume
limit by considering a sequence of increasing cubes.
For $\omega \in \Omega^{\rm per}$,

\[ e(\mu_e, \mu_c, \omega) = \lim_{\Lambda \nearrow \mathbb{Z}^d} \frac{1}{|\Lambda|} E_\Lambda(\mu_e, \mu_c, \omega) \]

The corresponding electronic density is

\[ \rho_e(\mu_e, \omega) = \lim_{\Lambda \nearrow \mathbb{Z}^d} \frac{1}{|\Lambda|}\, \#\{j : \lambda_j(\omega) \le \mu_e\} = -\frac{\partial}{\partial \mu_e} e(\mu_e, \mu_c, \omega) \qquad [4] \]

and the density of classical particles is
$\rho_c(\omega) = \lim_\Lambda N_c(\omega)/|\Lambda|$. One can check that canonical
and grand-canonical energies are related by

\[ e(\mu_e, \mu_c, \omega) = e(\rho_e(\mu_e, \omega), \omega) - \mu_e\, \rho_e(\mu_e, \omega) - \mu_c\, \rho_c(\omega) \qquad [5] \]

Given $(\mu_e, \mu_c)$, the ground-state energy density
$e(\mu_e, \mu_c)$ is defined by

\[ e(\mu_e, \mu_c) = \inf_{\omega \in \Omega^{\rm per}} e(\mu_e, \mu_c, \omega) \]

The set of periodic ground-state configurations
for given chemical potentials $\mu_e$, $\mu_c$ is the
grand-canonical ground-state phase diagram:

\[ G_{\rm gc}(\mu_e, \mu_c) = \{\omega \in \Omega^{\rm per} : e(\mu_e, \mu_c, \omega) = e(\mu_e, \mu_c)\} \]

It may happen that no periodic configuration
minimizes $e(\mu_e, \mu_c, \omega)$ and that $G_{\rm gc}(\mu_e, \mu_c) = \emptyset$.
However, results suggest that $G_{\rm gc}(\mu_e, \mu_c)$ is
nonempty for almost all $\mu_e$, $\mu_c$.

The situation simplifies for $U > 4d$ and $\mu_e \in
(-U + 2d, -2d)$. Since $\mu_e$ belongs to the gap of
$h_\Lambda(\omega)$, we have $\rho_e(\mu_e, \omega) = \rho_c(\omega)$, and

\[ e(\mu_e, \mu_c, \omega) = e(\rho_c(\omega), \omega) - (\mu_e + \mu_c)\, \rho_c(\omega) \]

Thus, $G_{\rm gc}(\mu_e, \mu_c)$ is invariant along the line $\mu_e + \mu_c =$
const. (for $\mu_e$ in the gap).


Symmetries of the Model 


The Hamiltonian $H_\Lambda$ clearly has the symmetries of
the lattice (for a box with periodic boundary
conditions, there is invariance under translations,
rotations by 90°, and reflections through an axis).
More important, it also possesses particle-hole
symmetries and these are useful since they allow us
to restrict investigations to positive $U$ and to certain
domains of densities or chemical potentials
(see below).

• The classical particle-hole transformation
$\omega_x \mapsto \bar\omega_x = 1 - \omega_x$ results in

\[ H_\Lambda^{U}(\bar\omega) = H_\Lambda^{-U}(\omega) - U N_\Lambda \]

and $N_c(\bar\omega) = |\Lambda| - N_c(\omega)$. It follows that
$E_\Lambda^{U}(N_e, \bar\omega) = E_\Lambda^{-U}(N_e, \omega) - U N_e$, and

\[ G_{\rm can}^{U}(\rho_e, \rho_c) = \{\bar\omega : \omega \in G_{\rm can}^{-U}(\rho_e, 1 - \rho_c)\} \]
\[ G_{\rm gc}^{U}(\mu_e, \mu_c) = \{\bar\omega : \omega \in G_{\rm gc}^{-U}(\mu_e + U, -\mu_c)\} \]

• An electron-hole transformation can be defined
via the unitary transformation $a_x \mapsto \varepsilon_x a_x^\dagger$ and
$a_x^\dagger \mapsto \varepsilon_x a_x$, where $\varepsilon_x = 1$ on a sublattice, and $= -1$
on the other sublattice. Then,

\[ H_\Lambda^{U}(\omega) \mapsto H_\Lambda^{-U}(\omega) - U N_c(\omega) \]

and $N_\Lambda \mapsto |\Lambda| - N_\Lambda$. It follows that
$E_\Lambda^{U}(|\Lambda| - N_e, \omega) = E_\Lambda^{-U}(N_e, \omega) - U N_c(\omega)$, and

\[ G_{\rm can}^{U}(\rho_e, \rho_c) = G_{\rm can}^{-U}(1 - \rho_e, \rho_c) \]
\[ G_{\rm gc}^{U}(\mu_e, \mu_c) = G_{\rm gc}^{-U}(-\mu_e, \mu_c + U) \]

• Finally, the particle-hole transformation for
both the classical particles and the electrons
gives

\[ H_\Lambda^{U}(\bar\omega) \mapsto H_\Lambda^{U}(\omega) + U N_\Lambda + U N_c(\omega) - U |\Lambda| \]

It follows that

\[ E_\Lambda^{U}(|\Lambda| - N_e, \bar\omega) = E_\Lambda^{U}(N_e, \omega) + U (N_e + N_c(\omega) - |\Lambda|) \]

and

\[ G_{\rm can}^{U}(\rho_e, \rho_c) = \{\bar\omega : \omega \in G_{\rm can}^{U}(1 - \rho_e, 1 - \rho_c)\} \]
\[ G_{\rm gc}^{U}(\mu_e, \mu_c) = \{\bar\omega : \omega \in G_{\rm gc}^{U}(-\mu_e - U, -\mu_c - U)\} \]

Either of the first two symmetries allows us to choose
the sign of $U$. We assume from now on that $U > 0$. The
third symmetry indicates that the phase diagrams have
a point of central symmetry, given by $\rho_e = \rho_c = 1/2$ in
the canonical ensemble and $\mu_e = \mu_c = -U/2$ in the
grand-canonical ensemble. Consequently, it is enough
to study densities satisfying $\rho_e \le 1/2$ and chemical
potentials satisfying $\mu_e \le -U/2$.

These symmetries also have useful consequences at
positive temperatures. In particular, both species of
particles have average density 1/2 at $\mu_e = \mu_c = -U/2$,
for all $\beta$.
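The three energy identities above can be verified directly from the one-body spectrum, since $E_\Lambda(N_e, \omega)$ is the sum of the $N_e$ lowest eigenvalues of $h_\Lambda(\omega)$. The following is a small sketch on an open chain (a bipartite lattice, as the electron-hole transformation requires); the names `chain_h` and `energy` are illustrative.

```python
import numpy as np

def chain_h(omega, U):
    """One-body matrix h = -t - U*diag(omega) on an open chain."""
    omega = np.asarray(omega, dtype=float)
    n = len(omega)
    h = np.diag(-U * omega)
    for x in range(n - 1):
        h[x, x + 1] = h[x + 1, x] = -1.0
    return h

def energy(omega, U, n_e):
    """Canonical ground-state energy: sum of the n_e lowest eigenvalues."""
    lam = np.linalg.eigvalsh(chain_h(omega, U))
    return float(np.sum(lam[:n_e]))

omega = np.array([1, 0, 0, 1, 1, 0, 1, 0, 0])
omega_bar = 1 - omega                      # classical particle-hole transform
n, n_c, U = len(omega), int(omega.sum()), 3.0

# E^U(N_e, omega_bar) = E^{-U}(N_e, omega) - U N_e
classical_ok = all(
    np.isclose(energy(omega_bar, U, n_e), energy(omega, -U, n_e) - U * n_e)
    for n_e in range(n + 1))

# E^U(|Lambda| - N_e, omega) = E^{-U}(N_e, omega) - U N_c(omega)
electron_ok = all(
    np.isclose(energy(omega, U, n - n_e), energy(omega, -U, n_e) - U * n_c)
    for n_e in range(n + 1))

# E^U(|Lambda| - N_e, omega_bar) = E^U(N_e, omega) + U (N_e + N_c(omega) - |Lambda|)
both_ok = all(
    np.isclose(energy(omega_bar, U, n - n_e),
               energy(omega, U, n_e) + U * (n_e + n_c - n))
    for n_e in range(n + 1))
```

The second identity rests on the sublattice sign change: conjugating $h^{U}_\Lambda(\omega)$ with $\mathrm{diag}(\varepsilon_x)$ flips the sign of the hopping matrix, so the spectrum of $h^{U}_\Lambda(\omega)$ is the negative of that of $h^{-U}_\Lambda(\omega)$.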


The Ground State - Arbitrary Dimensions 
The Segregated State 


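What follows is best understood in the limit
$U \to \infty$ and when $\rho_e < \rho_c$. In this case, the electrons
become localized in the domain $D_\Lambda(\omega) = \{x \in
\Lambda : \omega_x = 1\}$ and their energy per site is that of the
full configuration, $e(\rho, \omega \equiv 1)$ (see the section
"Canonical Ensemble"), where $\rho = \rho_e/\rho_c$ is the
effective electronic density. The presence of a
boundary for $D_\Lambda(\omega)$ raises the energy and the
correction is roughly proportional to the number of
nearest-neighbor pairs across this boundary,

\[ B_\Lambda(\omega) = \#\{(x, y) : x \in D_\Lambda(\omega),\ y \in \mathbb{Z}^d \setminus D_\Lambda(\omega),\ |x - y| = 1\} \]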


The following theorem was proposed by Freericks 
et al. (2002). 


Theorem 1 


(i) Let $\Lambda \subset \mathbb{Z}^d$ be a finite box, and $U > 4d$. Then
for all $\omega \in \Omega_\Lambda$, and all $N_e \le N_c(\omega)$, we
have the following upper and lower bounds:


des w=0) 
-人 [的 -al 


Here, $\alpha(\rho) = \alpha(1 - \rho)$ is strictly positive for $0 <
\rho < 1$. $\gamma(U)$ behaves as $8d^2/U$ for large $U$, in
the sense that $U \gamma(U) \to 8d^2$ as $U \to \infty$.

(ii) For any $\rho_e \ne \rho_c$ that differ from zero, the
segregated state is the unique ground state if
$\alpha(\rho_e/\rho_c) > \gamma(U)$, that is, if $U$ is large enough.

BA(w) > Ea(Ne,w) 


The proof of (i) is rather lengthy and we only
show here that it implies (ii). Let $b(\omega) =
\lim_\Lambda (B_\Lambda(\omega)/|\Lambda|)$, and notice that $b(\omega) = 0$ for the
empty, the full, and the segregated configurations;
$0 < b(\omega) \le d$ for all other periodic configurations
or mixtures. Recall that $\rho_c\, e(\rho_e/\rho_c, \omega \equiv 1)$ is the
energy density of the segregated state. For all
densities such that $\alpha(\rho_e/\rho_c) > \gamma(U)$, and all
configurations such that $\rho_c(\omega) = \rho_c$, we have

\[ e(\rho_e, \omega) \ge \rho_c\, e(\rho_e/\rho_c, \omega \equiv 1) \]

and the inequality is strict for any periodic
configuration. This shows that the segregated
configuration is the unique ground state.


General Properties of the Grand-Canonical 
Phase Diagram 


We have already seen that the grand-canonical
phase diagram is symmetric with respect to
$(-U/2, -U/2)$. Other properties follow from
concavity of $e(\mu_e, \mu_c)$.
Let $\omega \in G_{\rm gc}(\mu_e, \mu_c)$ and $\omega' \in G_{\rm gc}(\mu_e', \mu_c')$. Then,

(a) $\mu_e = \mu_e'$ and $\mu_c' > \mu_c$ imply $\rho_c(\omega') \ge \rho_c(\omega)$;

(b) $\mu_c = \mu_c'$ and $\mu_e' > \mu_e$ imply $\rho_e(\mu_e', \omega') \ge \rho_e(\mu_e, \omega)$,
and $\omega$ cannot be obtained by adding some
classical particles to the configuration $\omega'$.

It follows from (b) that if $\omega \equiv 1 \in G_{\rm gc}(\mu_e, \mu_c)$, then
$\omega \equiv 1 \in G_{\rm gc}(\mu_e', \mu_c')$ for all $\mu_e' \ge \mu_e$, $\mu_c' \ge \mu_c$. A similar
property holds for the empty configuration. To
establish these properties, we can start from

\[ e(\mu_e, \mu_c, \omega) + e(\mu_e', \mu_c', \omega') \le e(\mu_e', \mu_c', \omega) + e(\mu_e, \mu_c, \omega') \qquad [6] \]

Since $e(\mu_e, \mu_c, \omega)$ is concave with respect to $\mu_e$ and
linear with respect to $\mu_c$, we have

\[ e(\mu_e', \mu_c', \omega) \le e(\mu_e, \mu_c, \omega) + (\mu_e - \mu_e')\, \rho_e(\mu_e, \omega) + (\mu_c - \mu_c')\, \rho_c(\omega) \qquad [7] \]

Using this inequality for both terms on the right-hand
side of [6], we obtain the inequality

\[ (\mu_e' - \mu_e) \left[ \rho_e(\mu_e', \omega') - \rho_e(\mu_e, \omega) \right] + (\mu_c' - \mu_c) \left[ \rho_c(\omega') - \rho_c(\omega) \right] \ge 0 \]

which proves (a) and the first part of (b). The second
part of (b) follows from

\[ e(\mu_e, \mu_c, \omega) = -\int_{-\infty}^{\mu_e} d\mu\, \rho_e(\mu, \omega) - \mu_c\, \rho_c(\omega) \]

Indeed, the minimax principle implies that eigenvalues
$\lambda_j(\omega)$ are decreasing with respect to $\omega$ (if $U > 0$), so
that $\rho_e(\mu_e, \omega)$ is increasing (with respect to $\omega$). Then for
any $\omega'' > \omega$ and $\mu_e' > \mu_e$,

\[ e(\mu_e', \mu_c, \omega) - e(\mu_e', \mu_c, \omega'') > e(\mu_e, \mu_c, \omega) - e(\mu_e, \mu_c, \omega'') \]

and $\omega'' \in G_{\rm gc}(\mu_e, \mu_c)$ implies $\omega \notin G_{\rm gc}(\mu_e', \mu_c)$.

Next, we discuss domains in the plane of
chemical potentials where the empty, full, and
chessboard configurations have minimum energy
(see, e.g., Gruber and Macris (1996), and references
therein). One easily sees that $\omega \equiv 1$ is the unique
ground-state configuration if $\mu_c > 0$, or if $\mu_e > 2d$
and $\mu_c > -U$. Similarly, $\omega \equiv 0$ is the unique ground
state if $\mu_c < -U$, or if $\mu_e < -U - 2d$ and $\mu_c < 0$.
For $U > 4d$, it follows from the expansion [1] that
the full configuration is also ground state if $-U +
2d < \mu_e < -2d$ and $\mu_e + \mu_c + U > 4d/(U - 4d)$.

These domains can be rigorously extended using 
energy estimates that involve correlation functions 
of classical particles. The results are illustrated in 
Figures 2 (U < 4d) and 3 (U > 4d). 


Figure 2 Grand-canonical ground-state phase diagram for
U < 4d. Domains for the empty, chessboard, and full configurations
are denoted in light gray, black, and dark gray, respectively.

Figure 3 Grand-canonical ground-state phase diagram
for U > 4d. Domains for the empty, chessboard, and full
configurations are denoted in light gray, black, and dark gray,
respectively.


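Finally, canonical and grand-canonical phase
diagrams are related by the following properties:

(c) If $\omega \in G_{\rm gc}(\mu_e, \mu_c)$, then $\omega \in G_{\rm can}(\rho_e(\mu_e, \omega), \rho_c(\omega))$.

(d) More generally, suppose that $\omega^{(1)}, \dots, \omega^{(n)} \in
G_{\rm gc}(\mu_e, \mu_c)$, and consider a mixture with coefficients
$\alpha_1, \dots, \alpha_n$. The mixture belongs to
$G_{\rm can}(\rho_e, \rho_c)$, with $\rho_e = \sum_i \alpha_i \rho_e(\mu_e, \omega^{(i)})$ and
$\rho_c = \sum_i \alpha_i \rho_c(\omega^{(i)})$.

To establish (c), observe that any $\omega'$ satisfies
$e(\mu_e, \mu_c, \omega') \ge e(\mu_e, \mu_c, \omega)$ if $\omega \in G_{\rm gc}(\mu_e, \mu_c)$. Let
$\rho_e = \rho_e(\mu_e, \omega)$ and $\rho_c = \rho_c(\omega)$, and let $\mu_e'$ be such that
$\rho_e(\mu_e', \omega') = \rho_e$. By eqns [5] and [7],

\[ e(\rho_e(\mu_e', \omega'), \omega') - \mu_e\, \rho_e(\mu_e', \omega') - \mu_c\, \rho_c(\omega') \ge e(\rho_e(\mu_e, \omega), \omega) - \mu_e\, \rho_e(\mu_e, \omega) - \mu_c\, \rho_c(\omega) \]

Then, $e(\rho_e, \omega') \ge e(\rho_e, \omega)$ for any configuration $\omega'$
such that $\rho_c(\omega') = \rho_c$. Property (d) follows from (c) by
a limiting argument, because a mixture can be
approximated by a sequence of periodic
configurations.

Next we describe further properties of the phase
diagrams that are specific to dimensions 1 and 2.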


Ground-State Configurations - Dimension 1


A large number of investigations, either analytical or 
numerical, have been devoted to the study of the 
ground-state configurations in one dimension. One- 
dimensional results also serve as guide to higher 
dimensions. Recall that symmetries allow us to 
restrict to U > 0 and p, < 1/2. 

Most ground-state configurations that appear in
the canonical phase diagram seem to be given by an
intriguing formula, which we now describe. Let
$\rho_e = p/q$ with $p$ relatively prime to $q$. Then
corresponding periodic ground-state configurations
have period $q$ and density $\rho_c = r/q$ ($r$ is an integer).
The occupied sites in the cell $\{0, 1, \dots, q - 1\}$ are
given by the solutions $k_0, \dots, k_{r-1}$ of

\[ (p k_j) = j \bmod q, \qquad 0 \le j \le r - 1 \qquad [8] \]

Note that the first classical particle is located at
$k_0 = 0$, and $k_0, \dots, k_{r-1}$ are not in increasing order. In
order to discuss the solutions of [8], we introduce
$\ell = \lfloor q/p \rfloor$ (the integer part of $q/p$), and we write

\[ q = (\ell + 1) p - s \qquad [9] \]

where $1 \le s \le p - 1$, and $s$ is relatively prime to $p$.
Next, let $L(x)$ denote the distance between the particle
at $x$ and the one immediately preceding it (to the left).

Let us observe that if $\rho_c = \rho_e$, that is, if $r = p$, then

(a) $L(k_j) = \ell$ for $0 \le j \le s - 1$ and $k_j - \ell = k_{j+p-s}$
(b) $L(k_j) = \ell + 1$ for $s \le j \le p - 1$ and $k_j - (\ell + 1) = k_{j-s}$.

Indeed, for $p k_j = j + nq$, eqn [9] implies

\[ p(k_j - \ell) = j + (n - 1) q + (p - s) = j + p - s \bmod q \]

and

\[ p(k_j - (\ell + 1)) = j - s \bmod q \]

Therefore, $k_j - \ell$ is a solution of [8] if $j + p - s \le
p - 1$, while $k_j - (\ell + 1)$ is a solution of [8] if
$j - s \ge 0$.

These two properties show that the configuration
defined by [8] is such that $L(x) \in \{\ell, \ell + 1\}$ for all
occupied $x$. A periodic configuration such that all
distances between consecutive particles are either $\ell$
or $\ell + 1$ is called homogeneous. Let $\omega$ be a
homogeneous configuration with period $q$ and
density $\rho_c = r/q$, and let $x_0 < \dots < x_{r-1}$ be the
occupied sites in $\{0, 1, \dots, q - 1\}$. We introduce the
derivative $\omega'$ of $\omega$ as the periodic configuration with
period $r$ defined by (see Figure 4)

\[ \omega'_i = \begin{cases} 1 & \text{if } L(x_i) = \ell \\ 0 & \text{if } L(x_i) = \ell + 1 \end{cases} \]

A configuration is most homogeneous if it can be
"differentiated" repeatedly until the empty or the
full configuration is obtained.

Figure 4 The configuration $\omega$ given by the formula [8] with q = 24 and p = 7, and its derivative $\omega'$. Notice that $\ell = 3$ and s = 4.

Let $\omega$ be the homogeneous configuration from [8]
and $\omega'$ be its derivative. Using the same arguments as
for properties (a) and (b) above, and the fact that $s$ is
relatively prime to $p$, we obtain

(c) Let $k'_0, \dots, k'_{p-1}$ be the solutions of

\[ (s k'_j) = j \bmod p \]

Then $(k'_0, \dots, k'_{p-1})$ is a permutation of
$(0, 1, \dots, p - 1)$. Further, $k'_j - 1 = k'_{j+p-s}$ for $0 \le
j \le s - 1$, and $k'_j - 1 = k'_{j-s}$ for $s \le j \le p - 1$.

Consider the periodic configuration with period $p$
where sites $k'_0, \dots, k'_{s-1}$ are occupied and sites
$k'_s, \dots, k'_{p-1}$ are empty. Since $k'_0 = 0$, this configuration
is precisely the derivative $\omega'$ of $\omega$. Iterating,
these properties prove that the solutions of [8] are
most homogeneous.
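Formula [8] and the notion of derivative are easy to experiment with. The sketch below is an illustration only: the function names are hypothetical, and the derivative follows the rule that a particle whose distance to its predecessor is $\ell$ becomes an occupied site (distance $\ell + 1$ gives an empty site), with $\ell$ the integer part of period divided by particle number. For q = 24 and p = 7 it reproduces the occupied sites 0, 4, 7, 11, 14, 18, 21 of Figure 4.

```python
from math import gcd

def formula_8(p, q, r=None):
    """Occupied sites k_j in {0,...,q-1} solving p*k_j = j (mod q), j = 0..r-1."""
    if gcd(p, q) != 1:
        raise ValueError("p and q must be relatively prime")
    r = p if r is None else r
    p_inv = pow(p, -1, q)                      # modular inverse, Python 3.8+
    return sorted((j * p_inv) % q for j in range(r))

def distances(sites, period):
    """L(x) for each occupied site: cyclic distance to the preceding particle."""
    return [(x - sites[i - 1]) % period or period for i, x in enumerate(sites)]

def derivative(sites, period):
    """Occupied sites of the derivative (its period is len(sites)).

    Returns None when the configuration is not homogeneous.
    """
    ell = period // len(sites)
    L = distances(sites, period)
    if any(g not in (ell, ell + 1) for g in L):
        return None
    return [i for i, g in enumerate(L) if g == ell]

def is_most_homogeneous(sites, period):
    """Differentiate until the empty or the full configuration is reached."""
    while 0 < len(sites) < period:
        next_sites = derivative(sites, period)
        if next_sites is None:
            return False
        sites, period = next_sites, len(sites)
    return True

omega = formula_8(7, 24)                 # q = 24, p = 7 as in Figure 4
omega_prime = derivative(omega, 24)      # period 7, sites with L = 3
```

Here `omega_prime` has density 4/7, that is $s/p$ with $s = 4$, in agreement with property (c).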

One of the most important results in one dimension
is that only most homogeneous configurations
are present in the canonical phase diagram, for $U$
large enough and for equal densities $\rho_e = \rho_c$.

Theorem 2 Suppose that $\rho_e = \rho_c = p/q$. There
exists a constant $c$ such that for $U > c\, 4^q$, the only
ground-state configuration is the most homogeneous
configuration, given by [8] (together with translations
and reflections).

This theorem was established using the expansion [1]
of $E_\Lambda(N_e, \omega)$ in powers of $U^{-1}$. It suggests a devil's
staircase structure with infinitely many domains.
However, the number of domains for fixed $U$ could
still be finite. Results from Theorem 2 are illustrated
in Figure 5. Notice that $\rho_e = \rho_c$ when $\mu_e$ is in
the universal gap. These results have been extended
to positive temperatures by using "quantum
Pirogov-Sinai theory" (Datta et al. 1999).

Figure 5 Grand-canonical ground-state phase diagram in one
dimension for U > 4 and $\mu_e$ in the universal gap. Chessboard
configurations occur in the black domain. Dark gray oblique
domains correspond to densities 1/5, 1/4, 1/3, 2/3, 3/4, 4/5. Total
width of these domains is of order $U^{-1}$.

For small $U$, on the other hand, one can use a
(nonrigorous) Wigner-Brillouin degenerate perturbation
theory (a standard tool in band theory).
Let $\rho_e = p/q$ with $p$ relatively prime to $q$, and $\omega$ be
a periodic configuration with period $nq$, $n \in \mathbb{N}$.
Then for $U$ small enough ($U < 1/q$), we obtain
the following expansion for the ground-state energy
(Freericks et al. 1996):

\[ e(\rho_e, \omega) = -\frac{2}{\pi} \sin \pi \rho_e - U \rho_e \rho_c(\omega) - \frac{|\hat\omega(\rho_e)|^2}{4 \pi \sin \pi \rho_e}\, U^2 |\log U| + O(U^2) \qquad [10] \]

where $\hat\omega(\rho_e)$ is the "structure factor" of the periodic
configuration $\omega$, namely

\[ \hat\omega(\rho_e) = \frac{1}{nq} \sum_{j=0}^{nq-1} e^{2\pi i \rho_e j}\, \omega_j \]

This expansion suggests that the ground-state
configuration can be found by maximizing the
structure factor. The following theorem holds
independently of $U$.
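A brute-force check gives some feeling for why formula [8] maximizes the structure factor. The sketch below is restricted, as an assumption of the illustration, to configurations of period $q$ with exactly $r$ particles per cell (it does not address longer periods or mixtures); the function names are hypothetical. For sites solving [8], the phases $e^{2\pi i p k_j / q}$ are $r$ consecutive $q$-th roots of unity, whose sum has modulus $\sin(\pi r/q) / \sin(\pi/q)$.

```python
import numpy as np
from itertools import combinations

def formula_8(p, q, r):
    """Occupied sites solving p*k_j = j (mod q) for j = 0..r-1 (p, q coprime)."""
    p_inv = pow(p, -1, q)                      # modular inverse, Python 3.8+
    return sorted((j * p_inv) % q for j in range(r))

def structure_factor(sites, period, rho_e):
    """|omega_hat(rho_e)| for a configuration with the given occupied sites."""
    x = np.asarray(sites)
    return abs(np.sum(np.exp(2j * np.pi * rho_e * x))) / period

p, q, r = 3, 11, 4
rho_e = p / q

# largest modulus over all period-q configurations with r particles
best = max(structure_factor(c, q, rho_e) for c in combinations(range(q), r))
from_formula = structure_factor(formula_8(p, q, r), q, rho_e)
closed_form = np.sin(np.pi * r / q) / (q * np.sin(np.pi / q))
```

The maximizer is not unique: translates and reflections of the configuration from [8] give the same modulus.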


Theorem 3 Let $\rho_e = p/q$. There exist $r_1 > q/4$ and
$r_2 < 3q/4$ such that the configurations maximizing
the structure factor are given as follows:

(i) for $\rho_c = r/q$ with $r_1 \le r \le r_2$, use the
formula [8];

(ii) for $\rho_c \in (r/q, (r + 1)/q)$ with $r_1 \le r \le r_2 - 1$, the
configuration is a mixture of those for $\rho_c = r/q$ and
$\rho_c = (r + 1)/q$; and

(iii) for $\rho_c \in (0, r_1/q)$, the configurations are mixtures
of $\omega \equiv 0$ and that for $\rho_c = r_1/q$. For $\rho_c \in
(r_2/q, 1)$, the configurations are mixtures of $\omega \equiv
1$ and that for $\rho_c = r_2/q$.


Some insight for low densities is provided by
computing the energy of just one classical particle
and one electron on the infinite line, and comparing
it with two consecutive classical particles and two
electrons. It turns out that the former is more
favorable than the latter for $U > 2/\sqrt{3} \approx 1.15$,
while "molecules" of two particles are forming
when $U < 2/\sqrt{3}$. Smaller $U$ shows even bigger
molecules for $\rho_c = n \rho_e$, and n-molecules are most
homogeneously distributed according to the formula [8].
It should be stressed that the canonical
ground state cannot be periodic if $U$ is small and
$\rho_c \notin [1/4, 3/4]$, which is different from the case of
large $U$.

Figure 6 Grand-canonical ground-state phase diagram for
U = 0.4. Enlarged are domains for $\rho_e = 1/7$ and 2/7, with the
same densities $\rho_c = 2/7, 3/7, 4/7$.

Only numerical results are available for
intermediate $U$. They suggest that configurations
occurring in the phase diagram are essentially given
by Theorem 3 (together with the segregated
configuration). This is sketched in Figure 6, where bold
coexistence lines for $\mu_e > -U - 2$ and $\mu_e < 2$ represent
segregated states.


Ground-State Configurations - Dimension 2 


We discuss the canonical ensemble only, but many
results extend to the grand-canonical ensemble.
Recall that $G_{\rm can}(1/2, 1/2)$ consists of the two
chessboard configurations for any $U > 0$, and
that segregation takes place when $\rho_e \ne \rho_c$, provided
$U$ is large enough (Theorem 1). Other results
deal with the case of equal densities, and for $U$
large enough (see Haller and Kennedy (2001), and
references therein).


Theorem 4 Let $\rho_e = \rho_c \equiv \rho \le 1/2$.

(i) If $\rho \in \{\tfrac{1}{2}, \tfrac{2}{5}, \tfrac{1}{3}, \tfrac{1}{4}, \tfrac{2}{9}, \tfrac{1}{5}, \tfrac{2}{11}, \tfrac{1}{6}\}$, then for $U$ large enough, the ground-state configurations are those displayed in Figure 7. If $\rho = 1/(n^2 + (n+1)^2)$ with integer $n$, then for $U$ large enough (depending on $\rho$), the ground-state configurations are periodic.

(ii) If $\rho$ is a rational number between 1/3 and 2/5, then for $U$ large enough (depending on the denominator of $\rho$), the ground-state configurations are periodic. Further, the restriction to any horizontal line is a one-dimensional periodic configuration given by [8], and the configuration is constant in either the direction (1) or (3).

(iii) Suppose that $U$ is large enough. If $\rho \in (1/6, 2/11)$, the ground-state configurations are mixtures of the configurations $\rho = 1/6$ and $\rho = 2/11$ of Figure 7. If $\rho \in (1/5, 2/9)$, the ground-state configurations are mixtures of the configurations $\rho = 1/5$ and $\rho = 2/9$. If $\rho \in (2/9, 1/4)$, the ground-state configurations are mixtures of the configurations $\rho = 2/9$ and $\rho = 1/4$.

Figure 7 Ground-state configurations for several densities. Occupied sites are denoted by black circles, empty sites by white circles. Lines are present only to clarify the patterns.

Figure 8 Canonical ground-state phase diagram in two dimensions for $U > 8$.


The canonical phase diagram for $\rho_e = \rho_c$ is presented in Figure 8.

The situation for densities $\rho < 1/2$ that are not mentioned in Theorem 4 is unknown. All these periodic configurations are present in the grand-canonical phase diagram as well. Theorem 4(ii) suggests that the two-dimensional situation is similar to the one-dimensional one, where a devil's staircase structure may occur. Let us stress that no periodic configurations occur for large $U$ and densities $\rho_e = \rho_c$ in the intervals (1/6, 2/11), (1/5, 2/9), and (2/9, 1/4). This resembles the one-dimensional situation, although there it occurs for small $U$.


See also: Quantum Spin Systems; Quantum Statistical Mechanics: Overview; Fermionic Systems; Hubbard Model; Pirogov-Sinai Theory.

Freericks JK and Zlatić V (2003) Exact dynamical mean-field theory of the Falicov-Kimball model. Reviews of Modern Physics 75: 1333-1382.

Gruber Ch and Macris N (1996) The Falicov-Kimball model: a review of exact results and extensions. Helvetica Physica Acta 69: 850-907.

Haller K and Kennedy T (2001) Periodic ground states in the neutral Falicov-Kimball model in two dimensions. Journal of Statistical Physics 102: 15-34.

Jedrzejewski J and Lemański R (2001) Falicov-Kimball models of collective phenomena in solids (a concise guide). Acta Physica Polonica 32: 3243-3251.

Kennedy T and Lieb EH (1986) An itinerant electron model with crystalline or magnetic long range order. Physica A 138: 320-358.


Fedosov Quantization


Introduction


On the one hand, quantum mechanics and classical mechanics appear to be formulated within quite different mathematical frameworks: in terms of Hilbert spaces and operators on Hilbert spaces on the quantum side, and in terms of phase spaces (that is, symplectic or, more generally, Poisson manifolds) and functions on these phase spaces on the classical side. On the other hand, there is a strong structural similarity between the algebras of observable quantities in both theories, which are associative *-algebras over C. In the classical case, the algebra is commutative, the product being the pointwise product of functions on the phase space, and is endowed with the additional structure of a Poisson bracket by means of which the dynamics of the system can be formulated. In the quantum case, the algebra is noncommutative, the product being the composition of operators on a Hilbert space, and the dynamics is determined by the corresponding commutator. The difference between functions on a phase space and operators on a Hilbert space constitutes the main difficulty for the passage from a classical theory to the corresponding quantum theory, which would be desirable since a formulation of the more fundamental but much less intuitive quantum theory is often impossible. Even the consideration of the classical limit leads to the same problem of comparing quite different mathematical objects. One possibility to avoid these problems, which is the basic idea of deformation quantization, is to pass from classical observables to quantum observables not by changing the underlying vector space, but only by deforming the algebraic structures, namely the associative product and possibly the *-involution.

This idea motivates the following definition of a star product by Bayen et al. (1978), which collects the minimal demands made on a suitable quantization:


Definition 1 A star product on a Poisson manifold $(M, \Pi)$ is an associative $\mathbb{C}[[\nu]]$-bilinear product $\star$ on $C^\infty(M)[[\nu]]$ such that, writing $f \star g = \sum_{r=0}^{\infty} \nu^r C_r(f, g)$ for $f, g \in C^\infty(M)$ with $\mathbb{C}$-bilinear maps $C_r$ with values in $C^\infty(M)$, the following properties hold:


(i) $C_0(f, g) = fg$,
(ii) $C_1(f, g) - C_1(g, f) = \{f, g\}$, and
(iii) $1 \star f = f = f \star 1$.


In case the $\mathbb{C}$-bilinear maps $C_r$ are differential operators, the star product is called differential. If $\overline{f \star g} = \bar{g} \star \bar{f}$, then $\star$ is called Hermitian.


The conditions (i) and (ii) express the correspondence principle in deformation quantization, and in case the star product converges the formal parameter is to be identified with $i\hbar$, whence we set $\bar{\nu} := -\nu$, considering the formal parameter as purely imaginary. Since the Fedosov star products we are going to study in the sequel are differential, we shall no longer stress this property explicitly and refer to differential star products simply as star products.

One main advantage of deformation quantization 
is that one has the following very general existence 
result: 


Theorem 1 On every Poisson manifold $(M, \Pi)$ there exist (even differential) star products.


This theorem was first shown by DeWilde and Lecomte (1983) for the symplectic case and independently by Fedosov (1985), who gave a beautiful explicit construction using geometrical structures on $(M, \omega)$ to build a star product recursively. Omori et al. (1991) gave yet another existence proof of star products on a symplectic manifold $(M, \omega)$ that appears to combine the methods of DeWilde and Lecomte (1983) and Fedosov (1985). The proof of existence on general Poisson manifolds is due to Kontsevich (2003) and is a consequence of Kontsevich's formality theorem.

If $S = \mathrm{id} + \sum_{r=1}^{\infty} \nu^r S_r$ is a formal series of differential operators on $C^\infty(M)$ with $S_r 1 = 0$ for $r \ge 1$, then

$$f \star' g := S^{-1}\big((Sf) \star (Sg)\big) \qquad [1]$$

again defines a star product. Clearly, $\star'$ is Hermitian in case $\star$ is Hermitian and $\overline{Sf} = S\bar{f}$ for all $f \in C^\infty(M)[[\nu]]$. The above observation of the shape of certain isomorphisms between star product algebras gives rise to the notion of equivalence of star products:


Definition 2 Two star products $\star$ and $\star'$ on $(M, \Pi)$ are called equivalent in case there is a formal series $S = \mathrm{id} + \sum_{r=1}^{\infty} \nu^r S_r$ of differential operators on $C^\infty(M)$ with $S_r 1 = 0$ for $r \ge 1$ such that eqn [1] is satisfied for all $f, g \in C^\infty(M)[[\nu]]$.
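
Definitions 1 and 2 can be made concrete in the simplest case. The following is a minimal sketch, assuming the flat phase space $\mathbb{R}^2$ with coordinates $(q, p)$ and Poisson bracket $\{q, p\} = 1$ (an illustrative choice, not an example taken from the text). Polynomials in $q$, $p$, $\nu$ are stored as dictionaries, and all helper names (`weyl`, `std`, `S`, ...) are hypothetical. It implements the Weyl-Moyal product, a standard-ordered product, and the operator $S = \exp((\nu/2)\partial_q\partial_p)$, which is expected to relate the two in the sense of eqn [1].

```python
from fractions import Fraction
from math import comb, factorial

# A polynomial in (q, p, nu) is a dict {(a, b, n): c} meaning c * q**a * p**b * nu**n.
def add(f, g):
    out = dict(f)
    for k, c in g.items():
        out[k] = out.get(k, 0) + c
    return {k: c for k, c in out.items() if c != 0}

def scale(f, c, nu_power=0):
    """Multiply f by the constant c and by nu**nu_power."""
    return {(a, b, n + nu_power): c * v for (a, b, n), v in f.items()}

def mul(f, g):
    """Pointwise (undeformed) product."""
    out = {}
    for (a1, b1, n1), c1 in f.items():
        for (a2, b2, n2), c2 in g.items():
            k = (a1 + a2, b1 + b2, n1 + n2)
            out[k] = out.get(k, 0) + c1 * c2
    return {k: c for k, c in out.items() if c != 0}

def diff(f, dq=0, dp=0):
    """Apply (d/dq)**dq (d/dp)**dp."""
    out = {}
    for (a, b, n), c in f.items():
        if a >= dq and b >= dp:
            factor = (factorial(a) // factorial(a - dq)) * (factorial(b) // factorial(b - dp))
            out[(a - dq, b - dp, n)] = c * factor
    return out

MAX_ORDER = 8  # all series terminate earlier for the low-degree polynomials used here

def weyl(f, g):
    """Weyl-Moyal product: mu o exp((nu/2)(d_q (x) d_p - d_p (x) d_q))."""
    out = {}
    for n in range(MAX_ORDER + 1):
        for k in range(n + 1):
            term = mul(diff(f, dq=n - k, dp=k), diff(g, dq=k, dp=n - k))
            c = Fraction((-1) ** k * comb(n, k), 2 ** n * factorial(n))
            out = add(out, scale(term, c, nu_power=n))
    return out

def std(f, g):
    """Standard-ordered product: mu o exp(-nu d_p (x) d_q)."""
    out = {}
    for n in range(MAX_ORDER + 1):
        term = mul(diff(f, dp=n), diff(g, dq=n))
        out = add(out, scale(term, Fraction((-1) ** n, factorial(n)), nu_power=n))
    return out

def S(f, sign=1):
    """S = exp(sign * (nu/2) d_q d_p); sign=-1 gives the inverse operator."""
    out = {}
    for n in range(MAX_ORDER + 1):
        c = Fraction(sign ** n, 2 ** n * factorial(n))
        out = add(out, scale(diff(f, dq=n, dp=n), c, nu_power=n))
    return out

one = {(0, 0, 0): Fraction(1)}
q = {(1, 0, 0): Fraction(1)}
p = {(0, 1, 0): Fraction(1)}
nu = {(0, 0, 1): Fraction(1)}
f = mul(mul(q, q), p)            # q^2 p
g = add(mul(q, mul(p, p)), q)    # q p^2 + q
h = add(mul(p, p), mul(q, q))    # p^2 + q^2

# Condition (ii) of Definition 1 on the generators: q * p - p * q = nu {q, p} = nu.
assert add(weyl(q, p), scale(weyl(p, q), -1)) == nu
# Eqn [1]: the standard-ordered product is S^{-1}((S f) * (S g)) with * the Weyl-Moyal product.
assert S(weyl(S(f), S(g)), sign=-1) == std(f, g)
```

Exact rational arithmetic is used so that the comparisons are equalities of polynomials rather than floating-point approximations; for non-polynomial functions the series would have to be read as formal power series in $\nu$.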


The full classification of star products up to equivalence was first obtained in the symplectic case by Nest and Tsygan (1995) and independently by Deligne (1995) and Bertelson et al. (1997). The general Poisson case again follows from Kontsevich's formality theorem. In particular, in the symplectic case, star products are classified in a functorial way by the characteristic class

$$c(\star) \in \frac{[\omega]}{\nu} + H^2_{\mathrm{dR}}(M)[[\nu]] \qquad [2]$$

defined by Deligne, which induces a bijection between the equivalence classes of star products and $[\omega]/\nu + H^2_{\mathrm{dR}}(M)[[\nu]]$. Moreover, it has been shown (Bertelson et al. 1997, Deligne 1995, Nest and Tsygan 1995) that every star product on a symplectic manifold is equivalent to a Fedosov star product. This fact can also be seen as a direct consequence of the explicit computation of the characteristic class of a Fedosov star product (cf. Neumaier (2002)). The importance of Fedosov's construction for the general theory of deformation quantization in the symplectic case is also shown by the fact that in many proofs Fedosov star products were used as reference star products to compare with a given star product. Moreover, there is a great variety of modifications and generalizations of Fedosov's method, and there are many examples where additional structures on the symplectic manifold suggest looking for star products adapted to them, where modified Fedosov constructions can be applied successfully.


Fedosov Star Products on (M, ω)


The attempt to construct a star product step by step in fact leads to a cohomological problem, where a priori an obstruction in the third Hochschild cohomology of $C^\infty(M)$ occurs. This problem results from the demand for associativity, which is the most restrictive condition on a star product. Therefore, additional arguments are necessary to show that these obstructions can be circumvented, since the cohomology in question is isomorphic to $\Gamma^\infty(\bigwedge^3 TM)$ and hence, for $\dim(M) \ge 3$, is not trivial at all.

The basic strategy of Fedosov's construction for building associativity into the resulting product is to begin with a "very large" associative algebra $(\mathcal{W}\otimes\Lambda, \circ)$, where $\circ$ mimics the well-known Weyl-Moyal star product on a vector space with a constant symplectic Poisson tensor, and to specify a suitable subalgebra which is in bijection to $C^\infty(M)[[\nu]]$. Pulling back the product to the subalgebra then clearly results in an associative product on $C^\infty(M)[[\nu]]$, but as we shall see later on, one has to ensure that the bijection is sufficiently nontrivial in order to obtain in fact a nontrivial deformation of the usual pointwise product on $C^\infty(M)[[\nu]]$.

Defining

$$\mathcal{W}\otimes\Lambda := \Big(\prod_{s=0}^{\infty} \mathbb{C}\big(\Gamma^\infty(\textstyle\bigvee^s T^*M \otimes \bigwedge T^*M)\big)\Big)[[\nu]] \qquad [3]$$

$\mathcal{W}\otimes\Lambda$ becomes in a natural way an associative, supercommutative algebra using the symmetric $\vee$-product in the first factor and the antisymmetric $\wedge$-product in the second factor. This product is denoted by $\mu(a \otimes b) = ab$ for $a, b \in \mathcal{W}\otimes\Lambda$. By $\mathcal{W}\otimes\Lambda^k$ we denote the elements of antisymmetric degree $k$ and set $\mathcal{W} := \mathcal{W}\otimes\Lambda^0$. Besides this pointwise product, the Poisson tensor $\Pi$ corresponding to $\omega$ gives rise to another associative product $\circ$ on $\mathcal{W}\otimes\Lambda$ by

$$a \circ b = \mu\Big(\exp\Big(\frac{\nu}{2}\,\Pi^{ij}\, i_s(\partial_i) \otimes i_s(\partial_j)\Big)(a \otimes b)\Big) \qquad [4]$$

which is a deformation of $\mu$. Here $i_s(Y)$ denotes the symmetric insertion of a vector field $Y \in \Gamma^\infty(TM)$, and similarly $i_a(Y)$ shall be used to denote the antisymmetric insertion of a vector field. We set $\mathrm{ad}(a)b := [a, b]$, where the latter denotes the $\deg_a$-graded supercommutator with respect to $\circ$. Denoting the obvious degree maps by $\deg_s$, $\deg_a$, and $\deg_\nu = \nu\partial_\nu$, one observes that they all are derivations with respect to $\mu$, but $\deg_s$ and $\deg_\nu$ fail to be derivations with respect to $\circ$. Instead, $\mathrm{Deg} := \deg_s + 2\deg_\nu$ is a derivation of $\circ$, and hence $(\mathcal{W}\otimes\Lambda, \circ)$ is formally Deg-graded; the corresponding degree is referred to as the total degree. Sometimes we write $\mathcal{W}_k\otimes\Lambda$ to denote the elements of total degree $\ge k$. The total degree can be used to define an ultrametric $d$ on $\mathcal{W}\otimes\Lambda$, and it is known that $(\mathcal{W}\otimes\Lambda, d)$ is complete, which implies that Banach's fixed-point theorem can be applied in this setting. This observation is important since all the proofs of existence and uniqueness of certain elements in $\mathcal{W}\otimes\Lambda$ we shall construct in the sequel can be reduced to the application of this theorem.

In local coordinates, we define the differential

$$\delta := (1 \otimes dx^i)\, i_s(\partial_i) \qquad [5]$$

which satisfies $\delta^2 = 0$ and is a superderivation of $\circ$. Evaluated at a point $m \in M$, the product $a(m)b(m)$ of two elements $a, b \in \mathcal{W}\otimes\Lambda$ can be considered as the $\wedge$-product of two differential forms with polynomial coefficient functions on the vector space $T_mM$. Interpreted this way, the restriction of $\delta$ to the fiber at $m$ is nothing but the exterior derivative of differential forms with polynomial coefficients. Hence, it is clear that there is a homotopy operator $\delta^{-1}$ satisfying

$$\delta\delta^{-1} + \delta^{-1}\delta + \sigma = \mathrm{id} \qquad [6]$$

where $\sigma: \mathcal{W}\otimes\Lambda \to C^\infty(M)[[\nu]]$ denotes the projection onto the part of symmetric and antisymmetric degree 0. With the above view of $\delta$, this is just the Poincaré Lemma for differential forms with polynomial coefficients, which says that all the cohomology spaces vanish except for the one of degree 0, and the cohomology in degree 0 is just given by the constant functions on the vector space $T_mM$. This means that the $\delta$-cohomology on $\mathcal{W}\otimes\Lambda$ is trivial except for the space of degree 0, which is given by the formal functions on $M$. For computational purposes, it is useful to have a concrete formula for the homotopy operator $\delta^{-1}$, which is given by

$$\delta^{-1}a := \begin{cases} \dfrac{1}{k+l}\,(dx^i \otimes 1)\, i_a(\partial_i)\, a & \text{for } \deg_s a = ka,\ \deg_a a = la \text{ with } k + l \neq 0, \\ 0 & \text{else.} \end{cases} \qquad [7]$$
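
As a quick sanity check of the homotopy identity [6], with $\delta = (1 \otimes dx^i)\, i_s(\partial_i)$ and $\delta^{-1}a = \frac{1}{k+l}(dx^i \otimes 1)\, i_a(\partial_i)a$ on elements of symmetric degree $k$ and antisymmetric degree $l$, one may work in a single chart with the element $a = dx^1 \otimes dx^2$ (so $k = l = 1$). This element is chosen here purely for illustration and does not appear in the text.

```latex
\begin{align*}
\delta a &= (1\otimes dx^i)\, i_s(\partial_i)\,(dx^1\otimes dx^2) = 1\otimes dx^1\wedge dx^2,\\
\delta^{-1} a &= \tfrac{1}{2}\,(dx^i\otimes 1)\, i_a(\partial_i)\,(dx^1\otimes dx^2)
              = \tfrac{1}{2}\, dx^1\vee dx^2\otimes 1,\\
\delta\delta^{-1}a &= \tfrac{1}{2}\big(dx^2\otimes dx^1 + dx^1\otimes dx^2\big),\\
\delta^{-1}\delta a &= \tfrac{1}{2}\,(dx^i\otimes 1)\, i_a(\partial_i)\,(1\otimes dx^1\wedge dx^2)
              = \tfrac{1}{2}\big(dx^1\otimes dx^2 - dx^2\otimes dx^1\big),\\
\delta\delta^{-1}a+\delta^{-1}\delta a &= dx^1\otimes dx^2 = a, \qquad \sigma(a)=0 .
\end{align*}
```

The two mixed terms $dx^2 \otimes dx^1$ cancel, which is the fiberwise Poincaré Lemma at work on this particular element.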


Now $\ker(\delta) \cap \mathcal{W} = C^\infty(M)[[\nu]]$, and one might wonder whether this subalgebra of $(\mathcal{W}\otimes\Lambda, \circ)$ is already suitable to induce a deformed product on $C^\infty(M)[[\nu]]$ by pulling back the product $\circ$ from $\mathcal{W}\otimes\Lambda$. Evidently, the answer to the question is negative, since the resulting product just gives back the undeformed pointwise product of formal functions on $M$. Hence one has to find a less trivial superderivation of the product $\circ$ whose kernel is still in bijection to $C^\infty(M)[[\nu]]$. The essential new component of Fedosov's construction is a superderivation of $(\mathcal{W}\otimes\Lambda, \circ)$ that is not $C^\infty(M)[[\nu]]$-linear and hence in a certain sense generates derivatives along the base manifold $M$. Using a torsion-free symplectic connection $\nabla$ on $M$ we define an endomorphism, also denoted by $\nabla$, of $\mathcal{W}\otimes\Lambda$ by

$$\nabla := (1 \otimes dx^i)\,\nabla_{\partial_i} \qquad [8]$$

which turns out to be a superderivation of $\circ$ due to the fact that $\nabla\omega = \nabla\Pi = 0$. The map $\nabla$ satisfies the identities

$$[\delta, \nabla] = 0, \quad \text{since the connection is torsion-free} \qquad [9]$$

$$\nabla^2 = -\frac{1}{\nu}\,\mathrm{ad}(R), \quad \text{where } R := \frac{1}{4}\,\omega_{it} R^t_{jkl}\, dx^i \vee dx^j \otimes dx^k \wedge dx^l \in \mathcal{W}\otimes\Lambda^2 \qquad [10]$$

involves the curvature of the connection. Moreover, we have

$$\delta R = 0 = \nabla R \qquad [11]$$

by the Bianchi identities.

Now one could consider the superderivation $-\delta + \nabla$ of $(\mathcal{W}\otimes\Lambda, \circ)$ and try to define a mapping $\tau$ from $C^\infty(M)[[\nu]]$ to $\ker(-\delta + \nabla) \cap \mathcal{W}$ such that $\sigma(\tau(f)) = f$ for all $f \in C^\infty(M)[[\nu]]$. But in case the curvature of the connection does not vanish, the necessary condition for the solvability of the equation $(-\delta + \nabla)\tau(f) = 0$ subject to the additional condition $\sigma(\tau(f)) = f$ is not satisfied. Only in case there is a torsion-free symplectic connection on $M$ with vanishing curvature can this procedure be carried through; it then yields again the Weyl-Moyal star product, since the fact that $\nabla$ is symplectic implies in this case that the components of the Poisson tensor are constant. However, in general, the kernel of $-\delta + \nabla$ does not have the desired properties to specify a suitable subalgebra of $(\mathcal{W}\otimes\Lambda, \circ)$, and one makes the ansatz

$$\mathfrak{D} = -\delta + \nabla - \frac{1}{\nu}\,\mathrm{ad}(r) \qquad [12]$$

with an element $r \in \mathcal{W}_3\otimes\Lambda^1$ for a suitable superderivation. Now a direct computation yields that

$$\mathfrak{D}^2 = \frac{1}{\nu}\,\mathrm{ad}\Big(\delta r - \nabla r + \frac{1}{\nu}\, r\circ r - R\Big) \qquad [13]$$

which vanishes iff $\delta r - \nabla r + (1/\nu)\, r \circ r - R$ is a central element in $\mathcal{W}\otimes\Lambda^2$. This is the case iff there is a formal series of 2-forms $\Omega \in \nu\Gamma^\infty(\bigwedge^2 T^*M)[[\nu]]$ with

$$\delta r - \nabla r + \frac{1}{\nu}\, r\circ r - R = 1\otimes\Omega \qquad [14]$$

After these preparations, one is in the position to prove the following theorem:


Theorem 2 (Fedosov 1994, theorem 3.2; Fedosov 1996, theorem 5.2.2). For every formal series $\Omega \in \nu\Gamma^\infty(\bigwedge^2 T^*M)[[\nu]]$ of closed 2-forms there exists a unique element $r \in \mathcal{W}_3\otimes\Lambda^1$ such that

$$\delta r = \nabla r - \frac{1}{\nu}\, r\circ r + R + 1\otimes\Omega \quad\text{and}\quad \delta^{-1} r = 0 \qquad [15]$$

Moreover, $r$ satisfies

$$r = \delta^{-1}\Big(\nabla r - \frac{1}{\nu}\, r\circ r + R + 1\otimes\Omega\Big) \qquad [16]$$

from which $r$ can be determined recursively. In this case the Fedosov derivation

$$\mathfrak{D} = -\delta + \nabla - \frac{1}{\nu}\,\mathrm{ad}(r) \qquad [17]$$

is a superderivation of antisymmetric degree 1 and has square zero: $\mathfrak{D}^2 = 0$.
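
To see how the recursion for $r$ unfolds, here is a sketch of its lowest orders, assuming the fixed-point form $r = \delta^{-1}(\nabla r - \frac{1}{\nu} r \circ r + R + 1 \otimes \Omega)$ of Theorem 2. The decomposition $r = \sum_k r^{(k)}$ by total degree and the coefficients $\Omega_j$ are notation introduced here for illustration. The bookkeeping uses only that $\nabla$ preserves the total degree, that $\delta^{-1}$ raises it by 1, that $\circ$ is Deg-graded, and that $\nu$ has total degree 2.

```latex
\begin{align*}
r &= \sum_{k\ge 3} r^{(k)}, \qquad \Omega = \sum_{j\ge 1}\nu^j\,\Omega_j,\\
r^{(3)} &= \delta^{-1}\big(R + \nu\, 1\otimes\Omega_1\big),\\
r^{(4)} &= \delta^{-1}\,\nabla r^{(3)},\\
r^{(5)} &= \delta^{-1}\Big(\nabla r^{(4)} - \tfrac{1}{\nu}\, r^{(3)}\circ r^{(3)} + \nu^2\, 1\otimes\Omega_2\Big).
\end{align*}
```

Each $r^{(k)}$ depends only on terms of lower total degree, which is the contraction property that allows Banach's fixed-point theorem to be applied with respect to the ultrametric defined by the total degree.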


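For obvious reasons Fedosov calls $\mathfrak{D}$ a flat or abelian connection for the bundle $\mathcal{W}\otimes\Lambda$, and $\omega + \Omega$ is referred to as the central Weyl curvature of the connection $\mathfrak{D}$. In some sense, the flatness property $\mathfrak{D}^2 = 0$ guarantees that there are sufficiently many flat sections. Before investigating the structure of $\ker(\mathfrak{D}) \cap \mathcal{W}$ we note that the $\mathfrak{D}$-cohomology is trivial on elements $a$ with positive antisymmetric degree, since one has the following homotopy formula:

$$\mathfrak{D}\mathfrak{D}^{-1}a + \mathfrak{D}^{-1}\mathfrak{D}a = a$$

where

$$\mathfrak{D}^{-1}a := -\delta^{-1}\left(\frac{1}{\mathrm{id} - [\delta^{-1}, \nabla - \frac{1}{\nu}\mathrm{ad}(r)]}\, a\right) \qquad [18]$$

(cf. Fedosov (1996, theorem 5.2.5)). The reason for this fact, which is also the crucial point for the proof of Theorem 1, is the property of the $\delta$-cohomology to vanish except for the cohomology space of degree 0.

The next step in Fedosov's construction now consists in establishing a bijection between the flat sections $a \in \mathcal{W}$, that is, those elements of $\mathcal{W}$ with $\mathfrak{D}a = 0$, and $C^\infty(M)[[\nu]]$.


Theorem 3 (Fedosov 1994, theorem 3.3; Fedosov 1996, theorem 5.2.4). Let $\mathfrak{D} = -\delta + \nabla - (1/\nu)\,\mathrm{ad}(r): \mathcal{W}\otimes\Lambda \to \mathcal{W}\otimes\Lambda$ be given as in [17] with $r$ as in [15].


(i) Then for any $f \in C^\infty(M)[[\nu]]$ there exists a unique element $\tau(f) \in \ker(\mathfrak{D}) \cap \mathcal{W}$ such that

$$\sigma(\tau(f)) = f \qquad [19]$$

and $\tau: C^\infty(M)[[\nu]] \to \ker(\mathfrak{D}) \cap \mathcal{W}$ is $\mathbb{C}[[\nu]]$-linear and referred to as the Fedosov-Taylor series corresponding to $\mathfrak{D}$.

(ii) In addition, $\tau(f)$ can be obtained recursively for $f \in C^\infty(M)$ from

$$\tau(f) = f + \delta^{-1}\Big(\nabla\tau(f) - \frac{1}{\nu}\,\mathrm{ad}(r)\tau(f)\Big) \qquad [20]$$

Using $\mathfrak{D}^{-1}$ according to [18] one can also write

$$\tau(f) = f - \mathfrak{D}^{-1}(1 \otimes df) \quad \text{for all } f \in C^\infty(M)[[\nu]] \qquad [21]$$

(iii) Since $\mathfrak{D}$ as constructed above is a $\circ$-superderivation, $\ker(\mathfrak{D}) \cap \mathcal{W}$ is a $\circ$-subalgebra, and a new associative product $\star$ for $C^\infty(M)[[\nu]]$, which turns out to be a star product, is defined by pullback of $\circ$ via $\tau$:

$$f \star g := \sigma(\tau(f) \circ \tau(g)) \qquad [22]$$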
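
The name "Fedosov-Taylor series" becomes plausible from the lowest orders of the recursion $\tau(f) = f + \delta^{-1}(\nabla\tau(f) - \frac{1}{\nu}\mathrm{ad}(r)\tau(f))$. The following is a sketch for $f \in C^\infty(M)$; the superscript $(k)$ for the part of total degree $k$ is notation introduced here, and the flat-space line assumes $M = \mathbb{R}^{2n}$ with the flat connection and $\Omega = 0$, so that $r = 0$.

```latex
\begin{align*}
\tau(f)^{(0)} &= f,\\
\tau(f)^{(1)} &= \delta^{-1}(1\otimes df) = df\otimes 1,\\
\tau(f)^{(2)} &= \delta^{-1}\,\nabla(df\otimes 1)
              = \tfrac{1}{2}\, dx^i\vee\nabla_{\partial_i} df\otimes 1,\\[4pt]
\text{flat case:}\quad
\tau(f) &= \sum_{k\ge 0}\frac{1}{k!}\,
  \frac{\partial^k f}{\partial x^{i_1}\cdots\partial x^{i_k}}\,
  dx^{i_1}\vee\cdots\vee dx^{i_k}\otimes 1 .
\end{align*}
```

The term $\frac{1}{\nu}\mathrm{ad}(r)\tau(f)$ first contributes at total degree 3, because $r$ starts at total degree 3 and $\mathrm{ad}(r)f = 0$. In the flat case $\tau(f)$ is the formal Taylor expansion of $f$ along the fiber, and inserting it into $\sigma(\tau(f) \circ \tau(g))$ reproduces the Weyl-Moyal product, in line with the remark on connections with vanishing curvature.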


In the following, we shall refer to the associative product $\star$ defined above as the Fedosov star product corresponding to $(\nabla, \Omega)$. The choice of the formal series of closed 2-forms $\Omega$ in fact has a crucial effect on the equivalence class of the resulting star product, whereas the choice of the torsion-free symplectic connection, which in contrast to a Riemannian connection is not unique, does not affect this class. This observation has been the main step in all the proofs of the classification results in deformation quantization of symplectic manifolds. Another way to prove this fact is to compute the characteristic class $c(\star)$ introduced by Deligne (1995) directly, using the methods developed in Gutt and Rawnsley (1999), which yields:


Theorem 4 (Neumaier 2002, theorem 2). Deligne's characteristic class $c(\star)$ of a Fedosov star product $\star$ as constructed above is given by

$$c(\star) = \frac{1}{\nu}[\omega] + \frac{1}{\nu}[\Omega] \qquad [23]$$


The properties of $\Omega$ with respect to complex conjugation also decide whether $\star$ is Hermitian or not. In case $\Omega$ is real, that is, satisfies $\overline{\Omega} = \Omega$, it is easy to show, observing that $\overline{a \circ b} = (-1)^{kl}\, \bar{b} \circ \bar{a}$ for $a \in \mathcal{W}\otimes\Lambda^k$, $b \in \mathcal{W}\otimes\Lambda^l$, that $\bar{r}$ solves the equations that uniquely determine $r$ and hence $\bar{r} = r$. But then $\mathfrak{D}$ commutes with complex conjugation, and therefore the unique characterization of the Fedosov-Taylor series yields $\overline{\tau(f)} = \tau(\bar{f})$ for all $f \in C^\infty(M)[[\nu]]$, implying that $\star$ is Hermitian.


Derivations, Automorphisms, 
and Equivalence Transformations 


Having defined the Fedosov star product $\star$ corresponding to $(\nabla, \Omega)$, the next logical step is to investigate the structure of its derivations and automorphisms and to find out how they can be described in the framework of Fedosov's construction. In addition, one can ask for an explicit construction of equivalence transformations between two Fedosov star products $\star$ and $\star'$ obtained from $(\nabla, \Omega)$ and $(\nabla', \Omega')$, which exist according to Theorem 4 iff $[\Omega] = [\Omega']$.

Since the basic philosophy of Fedosov's construction is to consider suitable operations on the algebra $(\mathcal{W}\otimes\Lambda, \circ)$ in order to obtain induced mappings on the level of $(C^\infty(M)[[\nu]], \star)$, one may expect to be able to define derivations of $(C^\infty(M)[[\nu]], \star)$ by considering appropriate fiberwise quasi-inner derivations of $(\mathcal{W}\otimes\Lambda, \circ)$ of the shape

$$D_h = -\frac{1}{\nu}\,\mathrm{ad}(h) \qquad [24]$$

where $h \in \mathcal{W}$ and without loss of generality we assume $\sigma(h) = 0$. Our aim is to define $\mathbb{C}[[\nu]]$-linear derivations of $\star$ by $C^\infty(M)[[\nu]] \ni f \mapsto \sigma(D_h\tau(f))$, but for an arbitrary element $h \in \mathcal{W}$ with $\sigma(h) = 0$ this mapping fails to be a derivation, as $D_h$ does not map elements of $\ker(\mathfrak{D}) \cap \mathcal{W}$ to elements of $\ker(\mathfrak{D}) \cap \mathcal{W}$. In order to achieve this, the supercommutator of $\mathfrak{D}$ and $D_h$ has to vanish. As $\mathfrak{D}$ is a $\mathbb{C}[[\nu]]$-linear $\circ$-superderivation, we obviously have

$$[\mathfrak{D}, D_h] = -\frac{1}{\nu}\,\mathrm{ad}(\mathfrak{D}h) \qquad [25]$$

and hence $\mathfrak{D}h$ must be central, that is, $\mathfrak{D}h$ has to be of the shape $1 \otimes B$ with $B \in \Gamma^\infty(T^*M)[[\nu]]$, to have $[\mathfrak{D}, D_h] = 0$. From $\mathfrak{D}^2 = 0$, we get that the necessary condition for the solvability of the equation $\mathfrak{D}h = 1 \otimes B$ is the closedness of $B$, since $\mathfrak{D}(1 \otimes B) = 1 \otimes dB$. But as the $\mathfrak{D}$-cohomology is trivial on elements with positive antisymmetric degree, this condition is also sufficient for the solvability of the equation $\mathfrak{D}h = 1 \otimes B$, and we get the following statement.


Lemma 1 (Müller-Bahns and Neumaier 2004, lemma 2.1).


(i) For all formal series $B \in \Gamma^\infty(T^*M)[[\nu]]$ of closed 1-forms on $M$ there is a uniquely determined element $h_B \in \mathcal{W}$ such that $\mathfrak{D}h_B = 1 \otimes B$ and $\sigma(h_B) = 0$. Moreover, $h_B$ is explicitly given by

$$h_B = \mathfrak{D}^{-1}(1 \otimes B) \qquad [26]$$

(ii) For all $B \in Z^1_{\mathrm{dR}}(M)[[\nu]]$ the mapping $D_B: C^\infty(M)[[\nu]] \to C^\infty(M)[[\nu]]$, where

$$D_B f := \sigma(D_{h_B}\tau(f)) = \sigma\Big(-\frac{1}{\nu}\,\mathrm{ad}(h_B)\tau(f)\Big) \qquad [27]$$

for $f \in C^\infty(M)[[\nu]]$, defines a $\mathbb{C}[[\nu]]$-linear derivation of $\star$, and hence this construction yields a mapping $Z^1_{\mathrm{dR}}(M)[[\nu]] \ni B \mapsto D_B \in \mathrm{Der}_{\mathbb{C}[[\nu]]}(C^\infty(M)[[\nu]], \star)$.


Furthermore, one can show that one even obtains all $\mathbb{C}[[\nu]]$-linear derivations of $\star$ by varying $B$ in the derivations $D_B$ constructed above.


Proposition 1 (Müller-Bahns and Neumaier 2004, proposition 2.2). The mapping

$$Z^1_{\mathrm{dR}}(M)[[\nu]] \ni B \mapsto D_B \in \mathrm{Der}_{\mathbb{C}[[\nu]]}(C^\infty(M)[[\nu]], \star)$$

defined in Lemma 1 is a bijection. Moreover, $D_{df}$ is a quasi-inner derivation for all $f \in C^\infty(M)[[\nu]]$, that is, $D_{df} = (1/\nu)\,\mathrm{ad}_\star(f)$, and the induced mapping $[B] \mapsto [D_B]$ from $H^1_{\mathrm{dR}}(M)[[\nu]]$ to $\mathrm{Der}_{\mathbb{C}[[\nu]]}(C^\infty(M)[[\nu]], \star)/\mathrm{Der}^{\mathrm{qi}}_{\mathbb{C}[[\nu]]}(C^\infty(M)[[\nu]], \star)$, the space of $\mathbb{C}[[\nu]]$-linear derivations of $\star$ modulo the quasi-inner derivations, also is bijective.
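
The claim that exact 1-forms $B = df$ give quasi-inner derivations can be checked directly. The following sketch assumes the Fedosov derivation $\mathfrak{D} = -\delta + \nabla - \frac{1}{\nu}\mathrm{ad}(r)$, the Fedosov-Taylor series $\tau$, and the definition $D_B g = -\frac{1}{\nu}\sigma(\mathrm{ad}(h_B)\tau(g))$ with $\mathfrak{D}h_B = 1 \otimes B$, $\sigma(h_B) = 0$. It uses that a function $f$, viewed as an element of $\mathcal{W}$ of symmetric degree 0, is central with respect to $\circ$.

```latex
\begin{align*}
\mathfrak{D} f &= -\delta f + \nabla f - \tfrac{1}{\nu}\,\mathrm{ad}(r) f = 1\otimes df,\\
\mathfrak{D}\big(f-\tau(f)\big) &= 1\otimes df, \qquad \sigma\big(f-\tau(f)\big) = 0
 \;\Longrightarrow\; h_{df} = f-\tau(f),\\
D_{df}\, g &= -\tfrac{1}{\nu}\,\sigma\big(\mathrm{ad}(f-\tau(f))\,\tau(g)\big)
 = \tfrac{1}{\nu}\,\sigma\big(\tau(f)\circ\tau(g)-\tau(g)\circ\tau(f)\big)
 = \tfrac{1}{\nu}\,(f\star g - g\star f).
\end{align*}
```

The second line relies on the uniqueness statement of Lemma 1(i), and the last equality is the definition of the Fedosov star product as the pullback of $\circ$ via $\tau$.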


Actually, it is well known that for an arbitrary star product $\star$ on a symplectic manifold the space of $\mathbb{C}[[\nu]]$-linear derivations is in bijection with $Z^1_{\mathrm{dR}}(M)[[\nu]]$ and that the quotient space of these derivations modulo the quasi-inner derivations is in bijection with $H^1_{\mathrm{dR}}(M)[[\nu]]$ (cf. Bertelson et al. (1997), theorem 4.2), but the remarkable thing about Fedosov star products is that these bijections can be explicitly expressed in terms of $\mathfrak{D}$ resp. $\mathfrak{D}^{-1}$ in a very lucid way.

Now we turn to the consideration of $\mathbb{C}[[\nu]]$-linear automorphisms of $\star$. For such automorphisms that start with id, which are also called self-equivalences, it is known (cf. Gutt and Rawnsley (1999), proposition 3.3) for arbitrary star products $\star$ on $(M, \omega)$ that they are of the form

$$A = \exp(\nu D) \qquad [28]$$

with a $\mathbb{C}[[\nu]]$-linear derivation $D$ of $\star$. Therefore, the above result about the description of all the derivations of $\star$ directly yields a complete description of all self-equivalences of $\star$.

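The description of $\mathbb{C}[[\nu]]$-linear automorphisms that are not self-equivalences of $\star$ is slightly more involved, and we first need some results about the concrete structure of the equivalence transformations between two Fedosov star products $\star$ and $\star'$. To compare two Fedosov star products obtained from different torsion-free symplectic connections $\nabla$ and $\nabla'$ and different but cohomologous formal series of 2-forms $\Omega$ and $\Omega'$, one has to compare the corresponding Fedosov derivations $\mathfrak{D}$ and $\mathfrak{D}'$. First recall some well-known facts about torsion-free symplectic connections on $(M, \omega)$. Given two such connections $\nabla$ and $\nabla'$, it is obvious that $S^{\nabla-\nabla'}(X, Y) := \nabla_X Y - \nabla'_X Y$, where $X, Y \in \Gamma^\infty(TM)$, defines a symmetric tensor field $S^{\nabla-\nabla'} \in \Gamma^\infty(\bigvee^2 T^*M \otimes TM)$ on $M$. Defining $\sigma^{\nabla-\nabla'}(X, Y, Z) := \omega(S^{\nabla-\nabla'}(X, Y), Z)$, it is easy to see that $\sigma^{\nabla-\nabla'} \in \Gamma^\infty(\bigvee^3 T^*M)$ is a totally symmetric tensor field. Conversely, given an arbitrary element $\sigma \in \Gamma^\infty(\bigvee^3 T^*M)$ and a torsion-free symplectic connection $\nabla$, and defining $S^\sigma \in \Gamma^\infty(\bigvee^2 T^*M \otimes TM)$ by $\sigma(X, Y, Z) = \omega(S^\sigma(X, Y), Z)$, then $\nabla^\sigma$ defined by $\nabla^\sigma_X Y := \nabla_X Y - S^\sigma(X, Y)$ again is a torsion-free symplectic connection, and all such connections can be obtained this way by varying $\sigma$. Using these relations, one can compare the corresponding mappings $\nabla$ and $\nabla'$ on $\mathcal{W}\otimes\Lambda$. With the notations from above we have

$$\nabla - \nabla' = -(dx^j \otimes dx^i)\, i_s\big(S^{\nabla-\nabla'}(\partial_i, \partial_j)\big) = \frac{1}{\nu}\,\mathrm{ad}\big(T^{\nabla-\nabla'}\big) \qquad [29]$$

where $T^{\nabla-\nabla'} \in \Gamma^\infty(\bigvee^2 T^*M \otimes T^*M) \subseteq \mathcal{W}\otimes\Lambda^1$ is defined by $T^{\nabla-\nabla'}(Z, Y; X) := \sigma^{\nabla-\nabla'}(X, Y, Z) = \omega(S^{\nabla-\nabla'}(X, Y), Z)$. Moreover, $T^{\nabla-\nabla'}$ satisfies the equations

$$\delta T^{\nabla-\nabla'} = 0 \qquad [30a]$$

and

$$\nabla T^{\nabla-\nabla'} = R' - R + \frac{1}{\nu}\, T^{\nabla-\nabla'} \circ T^{\nabla-\nabla'}, \qquad \nabla' T^{\nabla-\nabla'} = R' - R - \frac{1}{\nu}\, T^{\nabla-\nabla'} \circ T^{\nabla-\nabla'} \qquad [30b]$$

where $R = (1/4)\,\omega_{it} R^t_{jkl}\, dx^i \vee dx^j \otimes dx^k \wedge dx^l$ and $R' = (1/4)\,\omega_{it} R'^t_{jkl}\, dx^i \vee dx^j \otimes dx^k \wedge dx^l$ denote the corresponding elements of $\mathcal{W}\otimes\Lambda^2$ that are built from the curvature tensors of $\nabla$ and $\nabla'$.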

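Now we are in the position to compare two Fedosov derivations $\mathfrak{D}$ and $\mathfrak{D}'$ resp. the induced star products $\star$ and $\star'$ obtained from $(\nabla, \Omega)$ and $(\nabla', \Omega')$. The idea for the construction of an equivalence transformation from $\star$ to $\star'$ is to look for an automorphism $A_h$ of $(\mathcal{W}\otimes\Lambda, \circ)$ of the form

$$A_h = \exp\Big(\frac{1}{\nu}\,\mathrm{ad}(h)\Big) \quad\text{such that}\quad \mathfrak{D}' = A_h\,\mathfrak{D}\,(A_h)^{-1} \qquad [31]$$

where $h$ is an element of $\mathcal{W}_3$, guaranteeing that $A_h$ is well defined, and without loss of generality is assumed to satisfy $\sigma(h) = 0$. In case one can find such an element $h$, it is clear that $A_h$ yields a bijection between $\ker(\mathfrak{D}) \cap \mathcal{W}$ and $\ker(\mathfrak{D}') \cap \mathcal{W}$, and hence one would obtain an equivalence $S_h$ from $\star$ to $\star'$ by defining

$$S_h f := \sigma(A_h\tau(f)) = \sigma\Big(\exp\Big(\frac{1}{\nu}\,\mathrm{ad}(h)\Big)\tau(f)\Big) \qquad [32a]$$

with inverse

$$(S_h)^{-1} f = \sigma\big((A_h)^{-1}\tau'(f)\big) = \sigma\Big(\exp\Big(-\frac{1}{\nu}\,\mathrm{ad}(h)\Big)\tau'(f)\Big) \qquad [32b]$$

A direct computation yields

$$A_h\,\mathfrak{D}\,(A_h)^{-1} = \mathfrak{D} - \frac{1}{\nu}\,\mathrm{ad}\Big(\frac{\exp((1/\nu)\mathrm{ad}(h)) - \mathrm{id}}{(1/\nu)\mathrm{ad}(h)}\,(\mathfrak{D}h)\Big)$$

which is equal to $\mathfrak{D}'$ iff $h$ has been chosen such that

$$r' - r + T^{\nabla-\nabla'} - \frac{\exp((1/\nu)\mathrm{ad}(h)) - \mathrm{id}}{(1/\nu)\mathrm{ad}(h)}\,(\mathfrak{D}h) \in \mathcal{W}\otimes\Lambda^1 \qquad [33]$$

is a central element. Considering the total degree of the terms in this expression, this is the case iff there is a formal series of 1-forms $C \in \nu\Gamma^\infty(T^*M)[[\nu]]$ such that the expression in eqn [33] equals $1 \otimes C$. Applying $\mathfrak{D}$ to this equation and using the equations that $r$ and $r'$ satisfy together with the relations [30], it is cumbersome but not difficult to show that necessarily $\Omega$ and $\Omega'$ have to be cohomologous:

$$\Omega - \Omega' = dC \qquad [34]$$

with $C$ as above. Now, using [6] one can show that this condition is in fact sufficient, and moreover one can even determine the element $h$ in question recursively: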

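Theorem 5 (Fedosov 1994, theorem 4.3). Two Fedosov star products $\star$ and $\star'$ obtained from $(\nabla, \Omega)$ and $(\nabla', \Omega')$ are equivalent iff $\Omega$ and $\Omega'$ are cohomologous. In case $C \in \nu\Gamma^\infty(T^*M)[[\nu]]$ satisfies $\Omega - \Omega' = dC$ there is a uniquely determined element $h_C \in \mathcal{W}_3$ with $\sigma(h_C) = 0$ such that

$$r' - r + T^{\nabla-\nabla'} - \frac{\exp((1/\nu)\mathrm{ad}(h_C)) - \mathrm{id}}{(1/\nu)\mathrm{ad}(h_C)}\,(\mathfrak{D}h_C) = 1 \otimes C \qquad [35]$$

Moreover, $h_C$ can be determined recursively from

$$h_C = C \otimes 1 + \delta^{-1}\Big(\nabla h_C - \frac{1}{\nu}\,\mathrm{ad}(r)h_C - \frac{(1/\nu)\mathrm{ad}(h_C)}{\exp((1/\nu)\mathrm{ad}(h_C)) - \mathrm{id}}\,\big(r' - r + T^{\nabla-\nabla'}\big)\Big)$$

and with the so-constructed $h_C$ one has $\mathfrak{D}' = A_{h_C}\,\mathfrak{D}\,(A_{h_C})^{-1}$, and thus $S_{h_C}$ according to eqn [32] defines an equivalence transformation from $\star$ to $\star'$.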
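
The conjugation formula for $A_h\,\mathfrak{D}\,(A_h)^{-1}$ that underlies Theorem 5 follows from the usual expansion of a conjugation by an exponential. The following is a sketch, writing $X := \frac{1}{\nu}\mathrm{ad}(h)$ (an abbreviation introduced here) and using that $\mathfrak{D}$ is a $\circ$-superderivation and $h \in \mathcal{W}$ has antisymmetric degree 0, so that $[\mathfrak{D}, \mathrm{ad}(h)] = \mathrm{ad}(\mathfrak{D}h)$ and $[X, \mathrm{ad}(a)] = \mathrm{ad}(Xa)$.

```latex
\begin{align*}
[X, \mathfrak{D}] &= -\tfrac{1}{\nu}\,\mathrm{ad}(\mathfrak{D}h),\\
e^{X}\,\mathfrak{D}\,e^{-X}
 &= \sum_{n\ge 0}\frac{1}{n!}\,
    \underbrace{[X,[X,\dots[X}_{n},\mathfrak{D}]\dots]]
  = \mathfrak{D} - \tfrac{1}{\nu}\,\mathrm{ad}\Big(\sum_{m\ge 0}\frac{X^m}{(m+1)!}\,(\mathfrak{D}h)\Big)\\
 &= \mathfrak{D} - \tfrac{1}{\nu}\,\mathrm{ad}\Big(\frac{e^{X}-\mathrm{id}}{X}\,(\mathfrak{D}h)\Big).
\end{align*}
```

The quotient $(e^X - \mathrm{id})/X$ is to be read as the formal power series $\sum_{m \ge 0} X^m/(m+1)!$, which is well defined because $h \in \mathcal{W}_3$ makes $X$ raise the total degree.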


Evidently, in the above construction of the equivalence transformation $S_{h_C}$ there is some choice of the formal series of 1-forms $C$. Different possible choices $C$ and $\tilde{C}$ differ by a formal series of closed 1-forms, but choosing $\tilde{C}$ instead of $C$ amounts to another equivalence transformation $S_{h_{\tilde{C}}} = A' S_{h_C} A$ from $\star$ to $\star'$, where $A$ and $A'$ are certain self-equivalences of $\star$ and $\star'$, respectively. In case $\Omega$ and $\Omega'$ are real, we have seen that $\star$ as well as $\star'$ are Hermitian star products, and it is easy to verify that choosing a formal series $C$ of 1-forms as above that is moreover real yields an element $h_C$ satisfying $\overline{h_C} = h_C$. But then it is evident that the resulting equivalence transformation is also compatible with complex conjugation, that is, $\overline{S_{h_C} f} = S_{h_C}\bar{f}$ for all $f \in C^\infty(M)[[\nu]]$.

Now we are prepared to give a construction of all $\mathbb{C}[[\nu]]$-linear automorphisms of a Fedosov star product $\star$. It is easy to show that any $\mathbb{C}[[\nu]]$-linear automorphism of a star product $\star$ on a symplectic manifold is the combination of the action of a symplectomorphism $\psi: M \to M$ and an equivalence between $\star$ and the pullback $\star'$ via $\psi^{-1}$ of $\star$, which is defined by $f \star' g = (\psi^{-1})^*((\psi^* f) \star (\psi^* g))$ (cf. Gutt and Rawnsley (1999), proposition 9.4). Since the characteristic class of $\star'$ is given by $c(\star') = (\psi^{-1})^* c(\star)$, the necessary and sufficient condition for a symplectomorphism $\psi$ to define a possible zeroth-order term of an automorphism is that $(\psi^{-1})^* c(\star) = c(\star)$, since $\star'$ and $\star$ have to be equivalent.

Within Fedosov's framework, it can be shown that the pullback $\star'$ via a symplectomorphism $\psi^{-1}$ of $\star$ is identical to the Fedosov star product obtained from $(\nabla' = (\psi^{-1})^*\nabla, \Omega' = (\psi^{-1})^*\Omega)$, which just expresses the functoriality of Fedosov's construction. Together with Theorem 4 this particularly shows that $c(\star') = (\psi^{-1})^* c(\star)$, and therefore $\star'$ is equivalent to $\star$ iff $\Omega$ and $\Omega'$ differ by a formal series $dC_\psi$ of exact 2-forms, where $C_\psi \in \nu\Gamma^\infty(T^*M)[[\nu]]$ clearly depends on $\psi$. But in this situation one can apply the construction of equivalence transformations between Fedosov star products given in Theorem 5 with $C$ replaced by $C_\psi$ and $\nabla', \Omega'$ as above, yielding an equivalence $S_\psi := S_{h_{C_\psi}}$ from $\star$ to $\star'$. Finally, we therefore get that the combination

$$A_\psi := \psi^* S_\psi \qquad [36]$$

is a $\mathbb{C}[[\nu]]$-linear automorphism of $\star$, and it is obvious from the above that every such automorphism can be obtained by considering all symplectomorphisms $\psi$ of $(M, \omega)$ satisfying $[(\psi^{-1})^*\Omega] = [\Omega]$ and composing the resulting $A_\psi$ according to [36] with all self-equivalences $A$ of $\star$ according to [28].


Adaptions, Modifications, 
and Generalizations 


The geometrical construction of Fedosov has gone 
through many adaptions and modifications that are 
well suited to the particular geometry of the under- 
lying symplectic manifold. Moreover, there are 
generalizations that go beyond the case of symplec- 
tic manifolds and others that yield more general 
deformations than star products. We just give a few 
important examples that stress the power and 
beauty of Fedosov's construction. 

On a Kähler manifold, one can define the notion of star products with separation of variables (cf. Karabegov (1996)), which are also called star products of Wick type (cf. Bordemann and Waldmann (1997) and Neumaier (2003)). These are star products such that in local holomorphic coordinates the bidifferential operators $C_r$ are of the form

$$C_r(f, g) = \sum_{K,L} C_r^{K;\bar{L}}\,\frac{\partial^{|K|} f}{\partial z^K}\,\frac{\partial^{|L|} g}{\partial \bar{z}^L} \qquad [37]$$

with certain coefficient functions $C_r^{K;\bar{L}}$. These star products can be obtained by a modified Fedosov construction starting from the product $\circ_{\mathrm{Wick}}$ on $\mathcal{W}\otimes\Lambda$ given by

$$a \circ_{\mathrm{Wick}} b = \mu\Big(\exp\Big(\frac{2\nu}{i}\, g^{k\bar{l}}\, i_s(\partial_{z^k}) \otimes i_s(\partial_{\bar{z}^l})\Big)(a \otimes b)\Big) \qquad [38]$$

where $g^{k\bar{l}}$ denotes the components of the inverse of the Kähler metric in local holomorphic coordinates. In the case of a Kähler manifold, there is a distinguished torsion-free symplectic connection, namely the Kähler connection $\nabla$, which induces a superderivation of $\circ_{\mathrm{Wick}}$ in a way completely analogous to [8]. With these structures the Fedosov construction works for an arbitrary formal series $\Omega$ of closed 2-forms as before, but one can show that the resulting star product is of Wick type iff $\Omega$ is of type (1, 1), and one can even show that one obtains all star products of Wick type by varying $\Omega$ (cf. Neumaier (2003)).

In the case of an almost-Kähler manifold, one can consider a product $\circ'$ on $\mathcal{W}\otimes\Lambda$ similar to $\circ_{\mathrm{Wick}}$ which is adapted to the almost-complex structure (cf. Karabegov and Schlichenmaier (2001)). However, in this situation there is no torsion-free connection that yields a superderivation of this product but only a connection $\nabla'$ with torsion that defines such a superderivation. Nevertheless, one can consider a generalized Fedosov construction. To this end, one shows that $[\delta, \nabla'] = (1/\nu)\,\mathrm{ad}'(T')$ with some $T' \in \mathcal{W}\otimes\Lambda^2$ that satisfies $\delta T' = 0$ and encodes the torsion of $\nabla'$, and $\delta R' = \nabla' T'$, where again $\nabla'^2 = -(1/\nu)\,\mathrm{ad}'(R')$ and $R'$, which depends on the curvature of $\nabla'$, satisfies $\nabla' R' = 0$. But then it is easy to show that there is a unique element $r' \in \mathcal{W}_2\otimes\Lambda^1$ such that

$$\delta r' = \nabla' r' - \frac{1}{\nu}\, r' \circ' r' + T' + R' + 1 \otimes \Omega \quad\text{and}\quad \delta^{-1} r' = 0 \qquad [39]$$

with $\Omega$ as above, which can also be computed recursively. Clearly, $\mathfrak{D}' = -\delta + \nabla' - (1/\nu)\,\mathrm{ad}'(r')$ then is a suitable Fedosov derivation with square zero for $\circ'$, and one can proceed as described earlier to obtain a star product $\star'$ adapted to the almost-complex structure.

On a cotangent bundle $\pi: T^*Q \to Q$, where $T^*Q$
is equipped with the canonical symplectic form
$\omega_0 = -d\theta$, one can consider (cf. Bordemann et al.
(1998)) the following so-called standard ordered
product $\circ_{\rm std}$ on $W \otimes \Lambda$ given by

$$a \circ_{\rm std} b = \mu\left(\exp\left(-\nu\, i_s(\partial_{p_i}) \otimes i_s\left(\partial_{q^i} + p_k\, \pi^*\Gamma^k_{ij}\, \partial_{p_j}\right)\right)(a \otimes b)\right)$$ [40]

in local Darboux coordinates. Here $\Gamma^k_{ij}$ denotes the
Christoffel symbols of a torsion-free connection $\nabla^0$
on $Q$ in the chart of $Q$ corresponding to the bundle
chart $(q,p)$, and it is straightforward to see that $\circ_{\rm std}$
does not depend on the chosen local coordinates and
is associative. In the present situation, one can
define a torsion-free symplectic connection $\nabla^{T^*Q}$ on
$T^*Q$ solely in terms of $\nabla^0$, but then the corresponding
mapping $\nabla^{T^*Q}$ on $W \otimes \Lambda$ again fails to be a
superderivation of $\circ_{\rm std}$, whereas the combination
$\nabla^{T^*Q} + B$ with
$B = (\nu/3)\, p_j\, \pi^* R^j_{kli}\, (1 \otimes dq^i)\, i_s(\partial_{p_k})\, i_s(\partial_{p_l})$,
where $R^j_{kli}$ denotes the components of the curvature
tensor of $\nabla^0$, turns out to be a suitable superderivation
to start the Fedosov construction with
$\circ_{\rm std}$. In fact, the square of $\nabla^{T^*Q} + B$ turns out to
equal the square of $\nabla^{T^*Q}$, and all the other
preconditions of Fedosov's construction are easily
verified just replacing $\nabla$ by $\nabla^{T^*Q} + B$. The particular
property of the resulting star product $*_{\rm std}$ for $\Omega = 0$
on $T^*Q$ is that it is a standard ordered star product,
that is, for all $f \in C^\infty(T^*Q)[[\nu]]$ and all
$\chi \in C^\infty(Q)[[\nu]]$ one has

$$\pi^*\chi *_{\rm std} f = \pi^*\chi\, f$$ [41]

and hence $*_{\rm std}$ in a certain sense is adapted to the
vertical polarization.
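
As a hedged illustration of property [41], consider the simplest case, which is an assumption made here and not spelled out above: $Q = \mathbb{R}^n$ with the flat connection, so that the Christoffel symbols and $B$ vanish, and $\Omega = 0$. Under these assumptions the fibrewise formula [40] suggests that the star product is given by the same exponential acting on functions, and [41] can then be read off term by term:

```latex
% Sketch under the stated assumptions (flat Q = R^n, Omega = 0).
f *_{\rm std} g
  = \sum_{r=0}^{\infty} \frac{(-\nu)^r}{r!}\,
    \frac{\partial^r f}{\partial p_{i_1} \cdots \partial p_{i_r}}\,
    \frac{\partial^r g}{\partial q^{i_1} \cdots \partial q^{i_r}},
\qquad
\frac{\partial (\pi^*\chi)}{\partial p_i} = 0
\;\Longrightarrow\;
\pi^*\chi *_{\rm std} f = \pi^*\chi\, f .
```

Only the $r = 0$ term survives when the left factor is the pull-back of a function on $Q$, because such a function does not depend on the momenta.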


The methods mentioned so far can even be merged
into a more general situation, where one considers a
(complex) polarization on (M,w) and looks for star 
products that are adapted to this polarization which 
are then called polarized deformation quantizations 
(cf. Donin (2003)). Here again a generalization of 
Fedosov's construction yields the existence and the 
classification of such particular star products. 

Another recent generalization of Fedosov's con- 
struction that goes beyond the framework of smooth 
symplectic manifolds is that of the construction of 
star products on symplectic orbispaces (cf. Pflaum 
(2003)), which are stratified symplectic spaces. The 
main idea there is to consider Fedosov's construction 
in local orbicharts and to show that the changes of 
orbicharts induce isomorphisms between the locally 
defined deformation quantizations, implying that the 
locally defined products match together to define a 
global deformation quantization on the symplectic 
orbispace. To achieve this property, one has to adjust 
the local Fedosov constructions appropriately, that is, 
one has to use locally defined torsion-free symplectic 
connections and formal series of closed 2-forms that 
are related by the changes of the orbicharts. 

Considering a vector bundle $E \to M$, the sections
$\Gamma^\infty(E)$ are naturally a $C^\infty(M)$-right module and a
$\Gamma^\infty({\rm End}(E))$-left module, and it is a natural question
whether this bimodule structure can be deformed
such that $\Gamma^\infty(E)[[\nu]]$ becomes a $(C^\infty(M)[[\nu]], *)$-right
module and a $(\Gamma^\infty({\rm End}(E))[[\nu]], *')$-left module, where
$*'$ is a deformation of the usual composition of
elements of $\Gamma^\infty({\rm End}(E))$. In order to construct such
deformations, one can also adapt Fedosov's construction
(cf. Waldmann (2002)) considering
$W \otimes \Lambda \otimes \mathcal{E} = (\mathsf{X}_{s=0}^{\infty} \Gamma^\infty(\bigvee^s T^*M \otimes \Lambda T^*M \otimes E))[[\nu]]$ and
$W \otimes \Lambda \otimes \mathcal{E}nd(\mathcal{E}) = (\mathsf{X}_{s=0}^{\infty} \Gamma^\infty(\bigvee^s T^*M \otimes \Lambda T^*M \otimes {\rm End}(E)))[[\nu]]$
and extending the product $\circ$ to these spaces in
a natural way making $W \otimes \Lambda \otimes \mathcal{E}$ a
$(W \otimes \Lambda \otimes \mathcal{E}nd(\mathcal{E}), \circ)$-$(W \otimes \Lambda, \circ)$-bimodule. Furthermore, one
has to consider a connection $\nabla^E$ that naturally induces
a connection on ${\rm End}(E)$, and both have to be added
to $\nabla$ to define the corresponding substitute of $\nabla$ on
the respective space. Then the Fedosov construction
with $W \otimes \Lambda \otimes \mathcal{E}nd(\mathcal{E})$ can be considered yielding a
Fedosov derivation $D^{{\rm End}(E)}$ with square zero, hence
a Fedosov-Taylor series $\tau^{{\rm End}(E)}$ and an associative
deformation $F *' G = \sigma(\tau^{{\rm End}(E)}(F) \circ \tau^{{\rm End}(E)}(G))$ of the
usual composition of sections in the endomorphism
bundle. Moreover, there is a map $D^E$ on $W \otimes \Lambda \otimes \mathcal{E}$
that is a superderivation with respect to the
bimodule multiplication $\circ$ along $D^{{\rm End}(E)}$ and $D$,
respectively. This map also has square zero and
the intersection of its kernel with the elements of
antisymmetric degree 0 is in bijection to $\Gamma^\infty(E)[[\nu]]$ via
a natural generalization $\tau^E$ of the Fedosov-Taylor
series. Defining $F \bullet s := \sigma(\tau^{{\rm End}(E)}(F) \circ \tau^E(s))$ and
$s \bullet f := \sigma(\tau^E(s) \circ \tau(f))$, $\Gamma^\infty(E)[[\nu]]$ can be given the structure
of a $(\Gamma^\infty({\rm End}(E))[[\nu]], *')$-$(C^\infty(M)[[\nu]], *)$-bimodule
which is indeed a deformation of the classical
bimodule structure of $\Gamma^\infty(E)$. It is rather evident that
the same procedure also works for other products on
$W \otimes \Lambda$ and the above generalizations, in particular for
the product $\circ_{\rm Wick}$ on a Kähler manifold, where one can
obtain $(\Gamma^\infty({\rm End}(E))[[\nu]], *'_{\rm Wick})$-$(C^\infty(M)[[\nu]], *_{\rm Wick})$-bimodules
that are adapted to the complex structure in
case the curvature endomorphism of the connection
$\nabla^E$ is of type (1, 1). For example, this holds true for
(anti-)holomorphic vector bundles endowed with a
Hermitian fiber metric $h$ and the corresponding
connection that is compatible with $h$ and the
(anti-)holomorphic structure.

Finally, the proof of existence of deformation
quantizations on arbitrary Poisson manifolds
$(M, \Pi)$ given by Cattaneo et al. (2002), which
includes a concrete construction starting from
Kontsevich's star product on the flat space $\mathbb{R}^d$
equipped with a Poisson tensor, is similar in spirit
to Fedosov's construction. There one constructs
two bundles $J^\infty$ and $\mathcal{J}^\infty$ of associative algebras,
where, as a bundle, $\mathcal{J}^\infty$ is isomorphic to $J^\infty[[\nu]]$
and $J^\infty$ is the bundle of infinite jets of smooth functions
on $M$, which is equipped with the canonical flat
connection $D_0$. The Poisson tensor gives rise to
the structure of a Poisson algebra on each fiber of $J^\infty$,
and the canonical map $C^\infty(M) \to J^\infty$ yields a Poisson
algebra isomorphism between $C^\infty(M)$ and the
Poisson algebra of $D_0$-flat sections in $J^\infty$. The second
step in the construction consists in a deformation of
this correspondence. Using the Kontsevich formula for
$\mathbb{R}^d$, each fiber of $\mathcal{J}^\infty$ can be equipped with an
associative product which is a deformation of the
above product on the fibers of $J^\infty$ in the direction of
the Poisson bracket induced by $\Pi$. Then, analogously to
Fedosov's construction, one constructs a compatible
connection $D = D_0 + \nu D_1 + \nu^2 D_2 + \cdots$ which is a
deformation of $D_0$. Here compatibility just means
that $D$ is a derivation with respect to the above
product on sections in $\mathcal{J}^\infty$, implying that the $D$-flat
sections form a subalgebra. Moreover, one can
achieve that this connection is flat, and in this case
$C^\infty(M)[[\nu]]$ turns out to be in bijection to the $D$-flat
sections in $\mathcal{J}^\infty$. For the proof of existence of $D$ and for
its recursive determination using an adaptation of
Fedosov's method, again special cases of Kontsevich's
formality theorem prove to be the crucial tools. Pulling
back the above fiberwise product to $C^\infty(M)[[\nu]]$ via
this isomorphism, one then obtains a star product on
$(M, \Pi)$. Since this isomorphism can be determined
recursively, the star product can in principle be
computed explicitly.


See also: Deformation Quantization; Deformation
Quantization and Representation Theory; Deformation
Theory; Deformations of the Poisson Bracket on a
Symplectic Manifold; String Field Theory.


Quantum Statistics


Quantum particles are described by a complex, square-integrable
wave function $\Psi(x_1, \ldots, x_N)$ with $|\Psi|^2$
representing the probability density of finding $N$
particles at positions $x_1, x_2, \ldots, x_N$, which will be
assumed to be in a $d$-dimensional square box $V$ with
side $L$ and periodic boundary conditions. If the $N$
particles are identical, $|\Psi|^2$ must be totally symmetric in
the exchange of any pair of coordinates. Regarding the
symmetry properties of $\Psi$ itself, it is an experimental
fact (which finds its theoretical explanation in the
context of relativistic quantum field theory) that only
two possibilities can arise: either $\Psi$ is symmetric or it
is antisymmetric, which means that
$\Psi(x_1, \ldots, x_N) = (-1)^P\, \Psi(x_{P_1}, \ldots, x_{P_N})$,
where $P_1, \ldots, P_N$ is a permutation of $1, \ldots, N$,
and $(-1)^P$ is the parity of the
permutation. Particles described by a symmetric wave
function are called bosons, while particles with an
antisymmetric wave function are called fermions, after
Bose and Fermi, who introduced these concepts. The
fermionic wave function therefore vanishes if two
coordinates are equal, a property called the Pauli
exclusion principle. Particles have an intrinsic quantized
angular momentum called spin, and particles with
half-integer spin are fermions, while particles with
integer spin are bosons. Examples of fermions are
electrons, protons, or neutrons, with spin $\sigma = \pm\hbar/2$,
where $\hbar$ is the reduced Planck constant; examples of bosons are
phonons or mesons with integer spin.

The time evolution of a wave function is driven
(through the Schrödinger equation) by the Hamiltonian
operator, and the choice of such an operator is
determined by the physical system we want to
describe. One of the most important physical realizations
of a fermionic system is given by the conduction
electrons in solids with a crystalline structure (like
metals). According to the classical theory of Drude, a
crystal can be described as a lattice of atoms in which
the valence electrons are lost by the atoms (which
become ions) and move freely in the metal; they are
responsible for the conduction properties of the
crystal. However, if one assumes that the electrons
are classical particles (in the sense that they obey
Newtonian mechanics), one obtains wrong predictions
about the properties of crystals. One has to take into
account that the conduction electrons are quantum
particles, and this provides us with a natural example
of a fermionic system; the Hamiltonian can be taken as

$$H_N = \sum_{i=1}^{N} \left( -\frac{\hbar^2 \partial_{x_i}^2}{2m} + u\, c(x_i) \right) + \lambda \sum_{i<j} v(x_i - x_j)$$ [1]


The first term represents the nonrelativistic kinetic
energy of the electrons ($m$ is the mass), $u\,c(x)$ is a
periodic potential due to the ions in the lattice
($c(x) = c(x + R)$ with $R = (n_1 a_1, \ldots, n_d a_d)$, $n_i \in \mathbb{Z}$)
and $\lambda v(x - y)$ is a two-body interaction potential,
which is modeled by a short-range potential to take
into account, phenomenologically, the electrostatic
screening. Finally, $\lambda$ and $u$ are couplings which
measure the “strength” of the corresponding interaction.
Much more complicated and “realistic”
Hamiltonians could be considered; for instance,
one can add an interaction with a stochastic field
to take into account impurities in the lattice, or with
a boson field to take into account the dynamics of
the ions, and so on. Note also that one can study not
only three-dimensional Fermi systems ($d = 3$), but
also $d = 2$ or $d = 1$ systems; they can describe the
conduction electrons of crystals that are anisotropic
and should be considered as two-dimensional or
one-dimensional systems. We focus on the nonrelativistic
fermionic systems with Hamiltonian [1], which is a
problem of great importance from both the conceptual
and the applications point of view.


Second-Quantization Formalism 


The Hilbert space of states of a system of $N > 1$
fermions is the space $\mathcal{H}_N$ of all the complex
square-integrable antisymmetric functions $\Psi(x_1, \ldots, x_N)$. Let
$\{\varphi_k(x)\}_k$ be a basis for $\mathcal{H}_1$ (the one-particle Hilbert
space of all the complex square-integrable functions
$\Psi(x_1)$), where $k$ is an index called quantum number.
Usually, the set of $\varphi_k(x)$ is chosen as the eigenfunctions
of the single-particle Hamiltonian

$$-\frac{\hbar^2 \partial_x^2}{2m} + u\, c(x)$$

for instance, if $u = 0$ then

$$\varphi_k(x) = \frac{e^{ikx}}{L^{d/2}}$$

with $\hbar k$ representing the momentum; due to periodic
boundary conditions, $k$ has the form $k = (2\pi/L)n$,
$n = (n_1, \ldots, n_d)$ with $n_i$ integer and
$-[L/2] \le n_i \le [(L-1)/2]$. If we call $|k_1, \ldots, k_N\rangle$ the normalized
antisymmetrization of $\varphi_{k_1}(x_1)\varphi_{k_2}(x_2) \cdots \varphi_{k_N}(x_N)$
(Slater determinant), then the set of all possible
$|k_1, \ldots, k_N\rangle$ is a basis for $\mathcal{H}_N$; $|k_1, \ldots, k_N\rangle$ describes a
state in which the $N$ fermions have quantum numbers
$k_1, \ldots, k_N$. One can introduce (Negele and Orland
1988, Berezin 1966) the creation or annihilation
operators $a^+_k$, $a^-_k$: they are anticommuting operators,

$$\{a^+_k, a^-_{k'}\} \equiv a^+_k a^-_{k'} + a^-_{k'} a^+_k = \delta_{k,k'}, \qquad \{a^+_k, a^+_{k'}\} = \{a^-_k, a^-_{k'}\} = 0$$ [2]

such that $a^+_k |k_1, \ldots, k_N\rangle = |k, k_1, \ldots, k_N\rangle$ if $k \ne k_i$,
$i = 1, \ldots, N$, and 0 otherwise; $a^-_k$ is the adjoint of
$a^+_k$. The action of $a^+_k$ is to create a particle with
quantum number $k$ if it is not present in the state,
and to yield zero otherwise (according to the Pauli
principle). The state $|0\rangle$ such that $a^-_k |0\rangle = 0$ for all $k$
is called the vacuum state and it represents a
state with no particles. The Fock space is defined as
the direct sum of the Hilbert spaces with any
number of particles, and all the elements of the
Fock space can be generated by linearly superposing
products of creation operators acting on
the vacuum state. We can extend such definitions
by adding a label to such operators to take into
account the spin of the particle; for example, $a^\pm_{k,\sigma}$ are
creation or annihilation operators of a particle with
spin $\sigma$ and quantum number $k$. In terms of
$a^+_{x,\sigma} = \sum_k \varphi_k(x)\, a^+_{k,\sigma}$ and of its adjoint $a^-_{x,\sigma}$, the Hamiltonian
can be written as

$$H = \sum_\sigma \int dx\; a^+_{x,\sigma} \left( -\frac{\hbar^2 \partial_x^2}{2m} + u\, c(x) \right) a^-_{x,\sigma} + \lambda \sum_{\sigma,\sigma'} \int dx \int dy\; v(x - y)\, a^+_{x,\sigma} a^+_{y,\sigma'} a^-_{y,\sigma'} a^-_{x,\sigma}$$ [3]


According to the postulates of quantum statistical
mechanics, the grand canonical partition function is
given by $Z = {\rm tr}\, e^{-\beta(H - \mu N)}$, where $\beta = (\kappa T)^{-1}$, $\kappa$ is
the Boltzmann constant, $T$ is the temperature, $\mu$ is the
chemical potential, $N = \sum_\sigma \int dx\, a^+_{x,\sigma} a^-_{x,\sigma}$ and tr is the
trace operation over the Fock space. The thermodynamical
average of an observable $O$ is given by
$\langle O \rangle = Z^{-1}\, {\rm tr}[e^{-\beta(H - \mu N)} O]$. Given a fermionic system, one is
often interested in its Schwinger functions defined as
follows: if $\mathbf{x} = (x, t)$ and $t_1 > t_2 > \cdots > t_s$, $s$ even, then

$$S(\mathbf{x}_1, \ldots, \mathbf{x}_s) = \frac{{\rm tr}\left[ e^{-(\beta - t_1)(H - \mu N)}\, a^{\varepsilon_1}_{x_1}\, e^{-(t_1 - t_2)(H - \mu N)}\, a^{\varepsilon_2}_{x_2} \cdots a^{\varepsilon_s}_{x_s}\, e^{-t_s (H - \mu N)} \right]}{{\rm tr}\, e^{-\beta(H - \mu N)}}$$ [4]

with $\varepsilon_i = \pm$, $-\beta/2 < t_i < \beta/2$; periodic and
antiperiodic boundary conditions are, respectively,
imposed over $x_i$ and $t_i$. From the knowledge of the
Schwinger functions, one can compute all the
thermodynamical properties of a system at equilibrium
or close to equilibrium.
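
The creation and annihilation operators of this section can be illustrated with a small sketch. It assumes a finite set of quantum numbers labelled 0, 1, 2 and represents a state as a dictionary from sorted tuples of occupied labels to amplitudes; the function names are hypothetical and chosen only for this example. The sign convention follows the rule that $a^+_k$ prepends $k$ to the list of occupied quantum numbers, which is then brought to increasing order.

```python
from itertools import combinations

def create(k, state):
    """Apply a^+_k to a state {sorted tuple of occupied modes: amplitude}."""
    out = {}
    for occ, amp in state.items():
        if k in occ:
            continue  # Pauli principle: the mode is already occupied
        # anticommute a^+_k past the creation operators with smaller label
        sign = (-1) ** sum(1 for m in occ if m < k)
        new = tuple(sorted(occ + (k,)))
        out[new] = out.get(new, 0) + sign * amp
    return out

def annihilate(k, state):
    """Apply a^-_k (the adjoint of a^+_k) to a state."""
    out = {}
    for occ, amp in state.items():
        if k not in occ:
            continue  # nothing to remove
        sign = (-1) ** sum(1 for m in occ if m < k)
        new = tuple(m for m in occ if m != k)
        out[new] = out.get(new, 0) + sign * amp
    return out

def add(x, y):
    """Sum of two states, dropping vanishing amplitudes."""
    out = dict(x)
    for occ, amp in y.items():
        out[occ] = out.get(occ, 0) + amp
    return {occ: amp for occ, amp in out.items() if amp != 0}

def check_anticommutation(n_modes=3):
    """Check {a^-_j, a^+_k} = delta_jk and {a^+_j, a^+_k} = 0 on all basis states."""
    modes = range(n_modes)
    for size in range(n_modes + 1):
        for occ in combinations(modes, size):
            state = {occ: 1}
            for j in modes:
                for k in modes:
                    mixed = add(annihilate(j, create(k, state)),
                                create(k, annihilate(j, state)))
                    if mixed != (state if j == k else {}):
                        return False
                    if add(create(j, create(k, state)),
                           create(k, create(j, state))):
                        return False
    return True
```

Calling `check_anticommutation()` loops over every basis state of the three-mode Fock space, so the relations are verified exhaustively rather than on a sample.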


The Free Fermi Gas 


Computation of the physical observables corresponding
to the complete Hamiltonian [3] is a
very difficult task. The natural starting point
consists in taking into account only the kinetic
term by putting $\lambda = u = 0$ in [3], obtaining the free
Fermi gas model. The resulting model is not trivial
at all; its properties are radically different from
those of a gas of classical particles,
and it is sufficient to understand many properties of
matter (see, e.g., Mahan (1990)). If

$$\varphi_k(x) = \frac{e^{ikx}}{L^{d/2}}$$

then $|k_1, \sigma_1, \ldots, k_N, \sigma_N\rangle$ are eigenfunctions of
$H$ with eigenvalue $\sum_{k,\sigma} \varepsilon(k)\, n_{k,\sigma}$, where
$\varepsilon(k) = \hbar^2 |k|^2 / 2m$ and $n_{k,\sigma} = 0, 1$, the occupation number,
is the eigenvalue of $a^+_{k,\sigma} a^-_{k,\sigma}$; $n_{k,\sigma} = 1$ if in the state there
is a fermion with momentum $k$ and spin $\sigma$, and it is
zero otherwise. The eigenfunction $|\Omega\rangle$ of $H$ with
lowest energy is called the ground state, and it
determines the low-temperature properties of the
system. In order to find the ground state $|\Omega\rangle$, one has
to minimize $\sum_{k,\sigma} \varepsilon(k)\, n_{k,\sigma}$ with the constraint that $n_{k,\sigma}$
can take only the values 0 or 1 and $\sum_{k,\sigma} n_{k,\sigma} = N$; if
there are many solutions to this problem, one says that
the ground state is degenerate. An approximate
solution is the following: if $d = 3$, one can consider a
state such that $n_{k,\sigma} = 1$ if $k$ is in a sphere of radius $k_F$
and zero otherwise; since the number of momenta
$k = (2\pi/L)n$ in the sphere is approximately given by

$$\frac{4\pi k_F^3}{3}\, \frac{L^3}{8\pi^3}$$

and each momentum can be occupied by two fermions
with opposite spin, we can choose $k_F = (3\pi^2 \rho)^{1/3}$, with $\rho = N L^{-3}$.
The state $\prod_{|k| \le k_F} a^+_{k,1/2}\, a^+_{k,-1/2} |0\rangle$ is not the true ground
state when $N, L$ are finite, but it is a very good
approximation of it and converges to it (in a suitable
sense) in the limit $N, L \to \infty$, $\rho$ fixed. The boundary of
the sphere with radius $k_F$ in the space of momenta is
called the Fermi surface, and it is a key notion in the
theory of Fermi systems; if $d = 2$, it is replaced by a
circle and in $d = 1$ by two points.
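
The counting argument behind $k_F = (3\pi^2\rho)^{1/3}$ can be checked numerically. The sketch below is only an illustration: it fills every momentum $k = (2\pi/L)n$ inside a sphere with two fermions (one per spin), computes the resulting density, and compares the formula with the radius actually used. The function name and the default radius of 10 lattice units are arbitrary choices made here.

```python
import math

def fermi_momentum_check(radius=10, box_side=1.0):
    """Return (k_F used to fill the sphere, k_F predicted by (3 pi^2 rho)^(1/3))."""
    count = 0
    for n1 in range(-radius, radius + 1):
        for n2 in range(-radius, radius + 1):
            for n3 in range(-radius, radius + 1):
                if n1 * n1 + n2 * n2 + n3 * n3 <= radius * radius:
                    count += 1  # one allowed momentum k = (2 pi / L) n
    n_particles = 2 * count                 # two spin states per momentum
    density = n_particles / box_side ** 3   # rho = N / L^3
    k_f_exact = 2.0 * math.pi * radius / box_side
    k_f_formula = (3.0 * math.pi ** 2 * density) ** (1.0 / 3.0)
    return k_f_exact, k_f_formula
```

The two values differ only because the number of lattice points in a sphere deviates slightly from its volume; the relative difference shrinks as the radius grows, which mirrors the limit $N, L \to \infty$ at fixed $\rho$.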

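Coming to the thermodynamical properties, the
partition function is given by

$$Z = \prod_{k,\sigma}\; \sum_{n_{k,\sigma} = 0,1} e^{-\beta(\varepsilon(k) - \mu)\, n_{k,\sigma}} = \prod_k \left( 1 + e^{-\beta(\varepsilon(k) - \mu)} \right)^2$$

and the specific heat by

$$C_v = -\frac{\partial}{\partial T} \frac{\partial}{\partial \beta} \log Z$$

One finds, by expressing $\mu$ in terms of $\beta$ through the
relation $N = -\partial\Omega/\partial\mu$, with $\Omega = -\beta^{-1} \log Z$, that if $d = 3$,
in the $L \to \infty$ limit, the specific heat per unit volume is

$$C_v = \frac{\pi^2}{2}\, \rho\kappa\, \frac{\kappa T}{\varepsilon_F} \left( 1 + O\!\left( \frac{\kappa T}{\varepsilon_F} \right) \right)$$

where $\varepsilon_F = \hbar^2 k_F^2 / 2m$. Early models for metals
described the electrons as classical particles; however
in such a case, a well-known result of classical
statistical mechanics states that they should contribute
to the specific heat by $\frac{3}{2}\rho\kappa$, while experimentally their
contribution is much smaller. The solution of this
puzzle was provided by the above formula for $C_v$; the
classical value is in fact depressed by a factor of order
$\kappa T / \varepsilon_F$, which at room temperatures is $O(10^{-2})$, in
agreement with experimental data. The average number
of electrons with momentum $\hbar k$ is given, in the
infinite-volume limit, by

$$\langle a^+_{k,\sigma} a^-_{k,\sigma} \rangle = \left( 1 + e^{\beta(\varepsilon(k) - \mu)} \right)^{-1}$$

At zero temperature, it reduces to $\theta(|k| \le k_F)$, that
is, it has a discontinuity at the Fermi surface, while
at high temperatures it is very close to the Maxwell
distribution $\sim e^{-\beta(\varepsilon(k) - \mu)}$.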
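
The two limits of the occupation number quoted above can be seen in a short sketch of the Fermi-Dirac function $(1 + e^{\beta(\varepsilon - \mu)})^{-1}$. The function name is hypothetical, units with $\kappa = 1$ are assumed, and the two branches only serve to avoid overflow of the exponential.

```python
import math

def fermi_dirac(eps, mu, beta):
    """Average occupation 1 / (1 + exp(beta * (eps - mu))), evaluated stably."""
    x = beta * (eps - mu)
    if x >= 0:
        e = math.exp(-x)           # small number, no overflow
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))
```

For large $\beta$ the function approaches a step at $\varepsilon = \mu$, while for $\beta(\varepsilon - \mu)$ large and positive it is close to the Boltzmann weight $e^{-\beta(\varepsilon - \mu)}$.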

Finally, in the free Fermi gas model, all Schwinger
functions can be computed. One finds that if, for
instance, $\varepsilon_i = +$ for $i = 1, 2, \ldots, s/2$ and $\varepsilon_i = -$
otherwise, the Schwinger function with $s \ge 4$
can be expressed as a sum of products of the $s = 2$
Schwinger function (also called the propagator)

$$S(\mathbf{x}_1, \ldots, \mathbf{x}_s) = \sum_\pi (-1)^\pi \prod_{i=1}^{s/2} S_0(\mathbf{x}_i - \mathbf{x}_{\pi(i + s/2)})$$ [5]

where $\pi$ is a permutation of the indices $s/2 + 1, \ldots, s$,
$(-1)^\pi$ is the parity of
this permutation, and $\sum_\pi$ is the sum over all the
possible permutations; such a formula is called the
Wick rule. By an explicit computation, $S_0(\mathbf{x} - \mathbf{y})$ is
given by

$$S_0(\mathbf{x} - \mathbf{y}) = \frac{1}{\beta} \sum_{k_0 = 2\pi(n_0 + 1/2)\beta^{-1}}\; \frac{1}{L^d} \sum_{k = (2\pi/L)n} \frac{e^{i\mathbf{k}(\mathbf{x} - \mathbf{y})}}{-ik_0 + \hbar^2 |k|^2 / 2m - \mu} \equiv \frac{1}{\beta} \sum_{k_0}\; \frac{1}{L^d} \sum_{k} e^{i\mathbf{k}(\mathbf{x} - \mathbf{y})}\, S_0(\mathbf{k})$$ [6]

where $\mathbf{k} = (k_0, k)$. In the limit $L, \beta \to \infty$, for large
distances $S(\mathbf{x}, \mathbf{y})$ decays as a power law in $|\mathbf{x} - \mathbf{y}|$
times an oscillating function of period $k_F^{-1}$. Note that
$S_0(\mathbf{k})$ in the limit $\beta, L \to \infty$ diverges for $k_0 \to 0$ and
$\varepsilon(k) = \mu$, that is, at the Fermi surface ($\mu \to \varepsilon_F$ in the
limit $\beta \to \infty$); when $\beta$ is finite, $S_0(\mathbf{k})$ is finite even for
$L \to \infty$, that is, the finite temperature acts as an
infrared cutoff.
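
The signed sum over permutations in the Wick rule [5] has the structure of a determinant of the matrix of propagators. The sketch below checks this identity on a small matrix of made-up numbers standing in for the values $S_0(\mathbf{x}_i - \mathbf{x}_j)$; the matrix entries and function names are illustrative assumptions, not data from the text.

```python
from fractions import Fraction
from itertools import permutations

def parity(perm):
    """Return +1 or -1 according to the number of inversions of perm."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def wick_sum(g):
    """Sum over permutations pi of (-1)^pi * prod_i g[i][pi(i)]."""
    n = len(g)
    total = 0
    for perm in permutations(range(n)):
        term = parity(perm)
        for i in range(n):
            term *= g[i][perm[i]]
        total += term
    return total

def determinant(matrix):
    """Exact determinant by Gaussian elimination with rational arithmetic."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det
```

The permutation sum has $n!$ terms while the elimination takes $O(n^3)$ operations, which is one reason the determinant form is useful in practice.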


Fermions in an External Potential 


The next step consists in adding an external periodic
potential to the free Fermi gas model, taking into
account the field generated by the ions of the lattice.
We consider then [3] with $\lambda = 0$ and $u \ne 0$. As in the
previous case, the eigenfunctions of the N-particle
Hamiltonian can be computed and are expressed in
terms of the single-particle eigenfunctions of
$-\hbar^2 \partial_x^2 / 2m + u\,c(x)$; they are called Bloch waves
and have the form

$$\varphi_k(x) = \frac{1}{L^{d/2}}\, e^{ikx}\, u_k(x), \qquad u_k(x) = u_k(x + R)$$

$k$, called the crystalline momentum, is conserved
modulo $G$, the vectors of the reciprocal lattice,
defined by the condition $GR = 2\pi n$, with $n$ an integer,
for every lattice vector $R$.

The eigenvalue $\varepsilon(k)$ of $-\hbar^2 \partial_x^2 / 2m + u\,c(x)$ associated
with a Bloch wave $\varphi_k(x)$ has some peculiar properties;
in the $L \to \infty$ limit, one finds that $\varepsilon(k)$ is not a
continuous function (unlike the $u = 0$ case) but it has
gaps, that is, first-order discontinuities. For $d = 1$,
by a convergent power-series expansion in $u$, one
finds that $\varepsilon(k)$ is a continuous monotonically increasing
function except at the points $k = n\pi/a$, $n$ an integer;
at these points $\varepsilon(k)$ is discontinuous and
$\varepsilon((n\pi/a)^+) - \varepsilon((n\pi/a)^-) = \Delta_n$, with $\Delta_n$ of first order in $u$
up to $O(u^2)$ corrections; the gaps divide $\varepsilon(k)$
into disconnected pieces called energy bands. Something
similar happens in $d = 2, 3$, in which gaps open
for $k$ such that $G^2 + 2kG = 0$.
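
The opening of a gap at $k = \pi/a$ can be seen in a small numerical sketch. The choices below are assumptions made for illustration only: $d = 1$, units with $\hbar^2/2m = 1$ and $a = 1$, and the specific potential $c(x) = \cos(2\pi x/a)$. In the plane-wave basis $e^{i(k + 2\pi n)x}$ the single-particle Hamiltonian is then a symmetric tridiagonal matrix, whose lowest eigenvalues are found here by Sturm-sequence bisection.

```python
import math

def count_below(diag, off, x):
    """Number of eigenvalues below x of a symmetric tridiagonal matrix."""
    count = 0
    d = 1.0
    for i, a in enumerate(diag):
        b2 = off[i - 1] ** 2 if i > 0 else 0.0
        d = (a - x) - b2 / d
        if d == 0.0:
            d = 1e-300  # avoid a division by zero at the next step
        if d < 0.0:
            count += 1
    return count

def eigenvalue(diag, off, index, lo, hi, iterations=100):
    """Bisection for the eigenvalue number `index` (0 is the lowest)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if count_below(diag, off, mid) > index:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def band_gap(u, k=math.pi, n_max=10):
    """Gap between the two lowest bands at crystal momentum k."""
    # kinetic energies (k + 2 pi n)^2 on the diagonal, u/2 couples n and n + 1
    diag = [(k + 2.0 * math.pi * n) ** 2 for n in range(-n_max, n_max + 1)]
    off = [u / 2.0] * (2 * n_max)
    lo = min(diag) - abs(u)
    hi = max(diag) + abs(u)
    e0 = eigenvalue(diag, off, 0, lo, hi)
    e1 = eigenvalue(diag, off, 1, lo, hi)
    return e1 - e0
```

With this potential the gap at the zone boundary comes out close to $u$ for small $u$, and it closes when $u = 0$, in line with the statement that the gap is of first order in the coupling.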

Again, the eigenfunctions of $H$ are given by
$|k_1, \sigma_1, \ldots, k_N, \sigma_N\rangle$ with eigenvalue $\sum_{k,\sigma} \varepsilon(k)\, n_{k,\sigma}$,
and the Fermi surface is still defined by the set of $k$
such that $\varepsilon(k) = \varepsilon_F$, with $\varepsilon_F$ determined by the
condition $\sum_{k,\sigma:\, \varepsilon(k) \le \varepsilon_F} 1 = N$. However, in this case
the Fermi surface is no longer a sphere in $d = 3$,
but it is in general a polyhedron of a very complex
shape. The Schwinger functions are expressed by the
Wick rule [5] in terms of the two-point Schwinger
functions; they are given by [6] with $e^{ik(x - y)}$ replaced
by $\varphi_k(x)\overline{\varphi_k(y)}$ and $\hbar^2 |k|^2 / 2m$ replaced by $\varepsilon(k)$. The
asymptotic properties of the two-point Schwinger
function are quite different with respect to the $u = 0$
case. This is easy to see if $d = 1$; in the limit $L, \beta \to \infty$,
$S(\mathbf{k})$ is singular if $\mu$ does not belong to the interval
$[\varepsilon((n\pi/a)^-), \varepsilon((n\pi/a)^+)]$, whereas it is finite if $\mu$
belongs to such an interval; in the first case, $S(\mathbf{x}, \mathbf{y})$
decays for large distances as $O(|\mathbf{x} - \mathbf{y}|^{-1})$, whereas in
the second case it decays exponentially in $|\mathbf{x} - \mathbf{y}|$. This means that,
depending on the number of particles (which essentially
fixes $\mu$), the Schwinger function has a totally
different asymptotic behavior. This fact has important
consequences in many physical properties; for
instance, the conductivity (which can be computed
from the $s = 4$ Schwinger function) vanishes if $\mu$
belongs to the interval $[\varepsilon((n\pi/a)^-), \varepsilon((n\pi/a)^+)]$. Similar
properties hold for $d = 2, 3$; hence, from the
knowledge of the number of particles and the periodic
potential generated by the ions, one can predict if the
system is an insulator or a metal.

Note also that the conductivity is infinite in the
infinite-volume and zero-temperature limit, when $\mu$
does not correspond to a gap; in other words, the
electric current in a perfect crystal lattice is not
subjected to any dissipation of energy. A finite
resistivity is found only if one takes into account
deviations from perfect periodicity. To simulate
impurities in the lattice, one can add to the
Hamiltonian, according to Anderson, an interaction term
of the form $a\,\phi_x \psi^+_x \psi^-_x$, where $\phi_x$ is a Gaussian
stochastic field. A detailed mathematical investigation
has been devoted to the properties of
eigenfunctions of $-\hbar^2 \partial_x^2 / 2m + a\,\phi_x$, where $\phi_x$ is a
Gaussian field (see, e.g., Pastur and Figotin (1991));
it is found that if $a$ is large enough in $d = 2, 3$ and
for any $a$ in $d = 1$, the single-particle eigenfunctions
are exponentially localized, that is, they decay
exponentially at large distances; this implies a finite
conductivity. One can also add to the Hamiltonian a
term $\beta\,\varphi_x \psi^+_x \psi^-_x$, with $\varphi_x$ a quasiperiodic function
(here $\beta$ denotes a coupling, not the inverse temperature), in
order to describe crystals in which the lattice
develops a periodic distortion, with incommensurate
period with respect to the lattice periodicity. For
$d = 1$ and $\beta$ large, one again finds localized
eigenfunctions, whereas for small $\beta$ there are
extended states (see, e.g., Pastur and Figotin
(1991)); such results are obtained with the
Kolmogorov-Arnol'd-Moser (KAM) techniques.
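
Exponential localization in $d = 1$ can be illustrated with a lattice analogue of the operator above. This is a swap made here for simplicity, not the continuum model of the text: a tight-binding chain $(H\psi)_n = \psi_{n+1} + \psi_{n-1} + v_n \psi_n$ with independent Gaussian $v_n$ of standard deviation `sigma`. The sketch estimates the Lyapunov exponent of the transfer matrices, whose inverse plays the role of a localization length; the function name, the energy value, and the seed are arbitrary choices.

```python
import math
import random

def lyapunov_exponent(energy, sigma, n_sites=20000, seed=0):
    """Growth rate of solutions of psi[n+1] = (E - v[n]) psi[n] - psi[n-1]."""
    rng = random.Random(seed)
    psi_prev, psi = 0.0, 1.0
    log_norm = 0.0
    for _ in range(n_sites):
        v = rng.gauss(0.0, sigma) if sigma > 0 else 0.0
        psi_prev, psi = psi, (energy - v) * psi - psi_prev
        norm = math.hypot(psi, psi_prev)
        psi /= norm          # renormalize to avoid overflow,
        psi_prev /= norm     # accumulating the growth in log_norm
        log_norm += math.log(norm)
    return log_norm / n_sites
```

Without disorder and for an energy inside the band the exponent vanishes (extended states), while any nonzero `sigma` gives a strictly positive value in this one-dimensional model, in line with localization for any coupling in $d = 1$.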


Interacting Systems 


The analysis of noninteracting Fermi systems has
been very successful in understanding qualitatively
many features of crystals, but there are many
properties (e.g., superconductivity or magnetism)
which cannot be really explained without taking
into account the interaction between fermions;
however, the analysis becomes more involved.
When there is no interaction, the properties of
the many-body system can be understood in terms of
the single-body properties; the eigenfunctions of the
Hamiltonian are, in fact, obtained in terms of the
single-particle eigenfunctions. This is not true when
$\lambda \ne 0$, when a description of the system in terms of
independent particles is impossible. In order to
compute the interacting Schwinger functions, it is
convenient to write them in terms of fermionic
functional integrals (Berezin 1966). One introduces
a set of anticommuting Grassmann variables $\psi^+_{\mathbf{k}}$, $\psi^-_{\mathbf{k}}$,
$\mathbf{k} = (k_0, k)$; the Grassmann integration is defined by
$\int d\psi^\sigma_{\mathbf{k}}\, \psi^\sigma_{\mathbf{k}} = 1$ and $\int d\psi^\sigma_{\mathbf{k}} = 0$, $\sigma = \pm$, and the integral
of any analytic function of the Grassmann variables
can be obtained by expanding it in Taylor series
(which is a finite sum if suitable cutoffs are imposed
and $L, \beta$ are finite) and using the above rules; finally,

$$\psi^+_{\mathbf{x}} = \frac{1}{L^d \beta} \sum_{\mathbf{k}} e^{i\mathbf{k}\mathbf{x}}\, \psi^+_{\mathbf{k}}$$

and $\psi^-_{\mathbf{x}}$ is defined in an analogous way. The
Schwinger function can be written as a Grassmann
integral as follows:


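$$S(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N) = \left. \frac{\partial^N}{\partial\phi^{\varepsilon_1}_{\mathbf{x}_1} \cdots \partial\phi^{\varepsilon_N}_{\mathbf{x}_N}} \log \int P(d\psi)\, e^{-V + \int d\mathbf{x}\, [\phi^+_{\mathbf{x}} \psi^-_{\mathbf{x}} + \psi^+_{\mathbf{x}} \phi^-_{\mathbf{x}}]} \right|_{\phi = 0}$$ [7]

where $P(d\psi)$ is the fermionic integration
$[\prod_{\mathbf{k}} d\psi^+_{\mathbf{k}}\, d\psi^-_{\mathbf{k}}] \exp[-(L^d\beta)^{-1} \sum_{\mathbf{k}} \psi^+_{\mathbf{k}} (-ik_0 + \varepsilon(k) - \mu)\, \psi^-_{\mathbf{k}}]$, while
$\mathbf{y} = (y, s)$, and

$$V = \lambda \sum_{\sigma,\sigma'} \int d\mathbf{x}\, d\mathbf{y}\; v(x - y)\, \delta(t - s)\, \psi^+_{\mathbf{x},\sigma} \psi^+_{\mathbf{y},\sigma'} \psi^-_{\mathbf{y},\sigma'} \psi^-_{\mathbf{x},\sigma}$$ [8]

The Grassmann integral of a monomial of Grassmann
variables can be obtained by the Wick rule [5] with
propagator the Fourier transform of $(-ik_0 + \varepsilon(k) - \mu)^{-1}$.
As stated earlier, the propagator is finite at
nonzero temperature, whereas if $\beta = \infty$, then it is
singular when $\mathbf{k} = (k_0, k)$ is such that $k_0 = 0$ and
$\varepsilon(k) = \mu$.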
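
The Grassmann integration rules can be made concrete with a minimal sketch of a finite Grassmann algebra. Everything below is an illustration under stated assumptions: the generators are numbered so that $2i$ stands for $\psi^-_i$ and $2i + 1$ for $\psi^+_i$, an element is a dictionary from sorted tuples of generators to coefficients, and the integral is normalized so that $\int d\psi^+_i\, d\psi^-_i\; \psi^-_i \psi^+_i = 1$, which follows from applying the two rules above one variable at a time. With these conventions the Gaussian integral of $\exp(-\sum_{ij} \psi^+_i A_{ij} \psi^-_j)$ equals $\det A$.

```python
def mono_mul(a, b):
    """Product of two monomials (sorted tuples); returns (sign, monomial)."""
    if set(a) & set(b):
        return 0, ()  # a repeated generator squares to zero
    seq = a + b
    inversions = sum(1 for i in range(len(seq))
                     for j in range(i + 1, len(seq)) if seq[i] > seq[j])
    return (-1) ** inversions, tuple(sorted(seq))

def mul(x, y):
    """Product of two Grassmann algebra elements."""
    out = {}
    for ma, ca in x.items():
        for mb, cb in y.items():
            sign, m = mono_mul(ma, mb)
            if sign:
                out[m] = out.get(m, 0.0) + sign * ca * cb
    return out

def add(x, y):
    out = dict(x)
    for m, c in y.items():
        out[m] = out.get(m, 0.0) + c
    return out

def grassmann_exp(x, n_generators):
    """Taylor series of exp(x); it terminates because the algebra is finite."""
    result = {(): 1.0}
    term = {(): 1.0}
    for order in range(1, n_generators + 1):
        term = {m: c / order for m, c in mul(term, x).items()}
        result = add(result, term)
    return result

def gaussian_integral(a):
    """Integral of exp(-sum_ij psi+_i a[i][j] psi-_j) over all generators."""
    n = len(a)
    action = {}
    for i in range(n):
        for j in range(n):
            # -psi+_i psi-_j = psi-_j psi+_i, with psi-_j -> 2j, psi+_i -> 2i+1
            sign, m = mono_mul((2 * j,), (2 * i + 1,))
            action[m] = action.get(m, 0.0) + sign * a[i][j]
    top = tuple(range(2 * n))  # the product of all psi-_i psi+_i, in order
    return grassmann_exp(action, 2 * n).get(top, 0.0)
```

The integral simply reads off the coefficient of the top monomial, which is how the Taylor expansion mentioned in the text reduces integration to algebra.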

One can write [7] as a series by Taylor-expanding 
the exponential and using the Wick rule; each order 
of the expansion can be represented as a sum of 
Feynman diagrams, very similar to the ones appear- 
ing in quantum field theory. We have then an 
algorithm to compute [7]; nevertheless, to extract 
information from such a series is quite difficult. One 
cannot really compute an infinite (in the $L \to \infty$
limit) number of coefficients, so one is tempted, for
small $\lambda$, to compute only the first few of them,
neglecting the others. However, it appears that this
approximation is generally not justified, and it leads
to wrong results; the reason is that the Schwinger
functions for $\lambda = 0$ and $\lambda \ne 0$ are not analytically
close, or, in more physical terms, even if $\lambda$ is small,
the physical behavior of the free and interacting 
theories can be quite different, especially at low 
temperatures. A number of very interesting concepts 
(e.g., spontaneous symmetry breaking or the mass 
generation phenomenon), or techniques (e.g., the 
renormalization group method, or the parquet or 
random phase approximation) have been introduced 
in the last 50 years to analyze [7], and indeed many 
results have been obtained which explain several 
physical properties of the matter, such as super- 
conductivity or the Kondo effect (see, e.g., Anderson 
1985, Abrikosov et al. 1965, Mahan 1990, Negele 
and Orland 1988, Pines 1961). Unfortunately, most 
of such results are not really mathematically 
consistent, and in many cases quantitative computa- 
tions are impossible (in computations one generally 
neglects terms which, according to a heuristic 
physical intuition, are irrelevant, but no control of 
the error introduced by this approximation is 
attempted). In recent times, attempts towards a 
mathematical understanding of the functional inte- 
gral [7] have started (see, e.g., Benfatto and 
Gallavotti (1995), and references therein); the 
methods rely on the mathematical implementation 
of Wilson’s renormalization group methods via 


multiscale analysis (Gallavotti 1985). The necessity
of a firmer mathematical basis was felt mainly
under the pressure of the recent discovery of
high-$T_c$ superconductors, whose behavior is still not
understood in terms of the microscopic model [7];
this has forced reconsideration of the validity of the
approximations usually made in the analysis of this
model.

The behavior of [7] depends crucially on the
temperature. At high temperatures, we can simply
expand the exponential in [7] in a power series of $\lambda$,
and find that each Feynman graph contributing to
the $n$th perturbative order is bounded by $C_\beta^n |\lambda|^n$,
with $C_\beta \le C\beta^\alpha$ for some constants $C, \alpha$; this follows
immediately by using the Wick rule and by
remembering that the propagator in momentum space is
bounded by $O(\beta)$, since $|k_0| \ge \pi\beta^{-1}$. As the number of
Feynman graphs contributing to order $n$ is $O(n!)$, a bound on each
Feynman graph is not sufficient to prove the
convergence of the series. To prove convergence,
one has to take into account cancellations, due to
the anticommutativity of fermionic variables. Such
cancellations are proved via Gram's inequality for
determinants, and a bound $C_\beta^n |\lambda|^n$ can be obtained for
the order $n$ (without factorials); hence, convergence
follows for temperatures greater than $O(|\lambda|^a)$ for
some constant $a > 0$. One finds that
$S(\mathbf{k}) = S_0(\mathbf{k})(1 + A_\beta(\mathbf{k}))$ with $|A_\beta(\mathbf{k})| \le C|\lambda|$, that is,
the interaction has essentially no influence on the
physical properties of the system at high
temperatures.
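
The role of Gram's inequality can be illustrated numerically. The inequality states that a determinant whose entries are scalar products, $M_{ij} = \langle f_i, g_j \rangle$, satisfies $|\det M| \le \prod_i \|f_i\|\, \|g_i\|$, so that a sum of $n!$ signed terms is bounded by a quantity growing only like $C^n$. In the sketch below the vectors are random placeholders, an assumption made here in place of the actual factorization of the propagator.

```python
import math
import random

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

def gram_check(n=6, dim=8, seed=1):
    """Return (|det M|, Gram bound) for M[i][j] = <f_i, g_j>."""
    rng = random.Random(seed)
    f = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n)]
    g = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n)]
    m = [[sum(x * y for x, y in zip(fi, gj)) for gj in g] for fi in f]
    bound = 1.0
    for fi, gi in zip(f, g):
        bound *= math.sqrt(sum(x * x for x in fi)) * math.sqrt(sum(y * y for y in gi))
    return abs(determinant(m)), bound
```

For $n = 6$ the permutation expansion of the determinant already has 720 terms, yet the bound involves only a product of $2n$ norms.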


Landau Fermi Liquids 


We consider next an intermediate region of temperatures,
that is, $e^{-a/|\lambda|} \le T \le |\lambda|^\alpha$ for some constants
$a, \alpha$. In this region, the naive expansion in
power series of $\lambda$ fails and other techniques, such as
the renormalization group, are necessary. Such a
method allows us to perform a suitable resummation
of the naive power series in $\lambda$, and one gets, for $\lambda$
small enough, $T \ge e^{-a/|\lambda|}$ and $\varepsilon(k) = |k|^2/2m$,

$$S(\mathbf{k}) = \frac{1}{Z(\lambda)}\, \frac{1 + A_\beta(\mathbf{k})}{-ik_0 + v_F(\lambda)\,(|k| - k_F(\lambda))}$$ [9]

where $Z(\lambda) = 1 + z(\lambda)$, $v_F(\lambda) = \hbar k_F/m + v(\lambda)$, and
$k_F(\lambda) = k_F + O(\lambda)$, with $z(\lambda) = O(\lambda^2)$,
$v(\lambda) = O(\lambda)$, and $Z(\lambda)$, $v_F(\lambda)$, $k_F(\lambda)$ essentially
temperature independent; moreover, $|A_\beta(\mathbf{k})|$ is $O(\lambda)$.
The above formula has been proved rigorously for
$d = 2$ (see Rivasseau (1994), and references therein);
for $d = 3$, it has been proved at the level of formal
perturbation theory (Benfatto and Gallavotti 1995).
The case $\varepsilon(k) = |k|^2/2m$ is quite special, as the shape
of the interacting Fermi surface is fixed by
rotation invariance; it is necessarily circular
($d = 2$) or spherical ($d = 3$), whereas in general the
interaction can also modify its shape. For $d = 2$, if the
interacting Fermi surface is symmetric, smooth and
convex, a formula like [9] still holds (with a function
$k_F(\lambda, k)$ replacing $k_F(\lambda)$) up to exponentially
small temperatures (see references in Gentile and
Mastropietro (2001)).

It is apparent from [9] that one cannot derive such
a formula from a power-series expansion in $\lambda$; by
expanding [9] as a series in $\lambda$, one immediately finds
that the $n$th term is $O(|\lambda|^n \beta^n)$, which means that the
naive perturbative expansion cannot be convergent
up to exponentially small temperatures. It can be
derived only by selecting and resumming some
special class of terms in the original expansion. A
peculiar property of [9] is that the wave function
renormalization $Z(\lambda)$ is essentially independent of
the temperature. Such temperature independence is a
consequence of cancellations in the perturbative
series essentially due to the curvature of the Fermi
surface. For $d = 1$, a formula similar to [9] is also
valid; however, such cancellations are not present
and one finds $Z(\lambda) = 1 + O(\lambda^2 \log\beta)$. Comparing
$S(\mathbf{k})$ given by [9] with the Fourier transform $S_0(\mathbf{k})$ of
[6], we note that the Schwinger function of the
interacting system is still very similar to the
Schwinger function of a free Fermi gas, with
physical parameters (e.g., the Fermi momentum,
the wave function renormalization, or the Fermi
velocity) which are changed by the interaction. This
property is quite remarkable: the eigenstates cannot
be constructed when $\lambda \ne 0$ starting from the
single-particle states but, nevertheless, the physical properties
of the interacting system (which can be deduced
from the Schwinger functions) are qualitatively very
similar to the ones of the free Fermi gas, although
with different parameters; this explains why the free
Fermi gas model works so well to explain the
properties of crystals, although one neglects the
interactions between fermions which are, of course,
quite relevant. A fermionic system with such a
property is called a Landau Fermi liquid (see, e.g.,
Abrikosov et al. 1965, Mahan 1990, Pines 1961),
after Landau, who postulated in the 1950s that
interacting systems may evolve continuously from
the free system in many cases.

It was generally accepted that metals in this range
of temperatures were all Landau Fermi liquids
(except one-dimensional systems). However, the
experimental discovery of the high-T_c superconductors
(see, e.g., Anderson (1997)) has changed this
belief, as such metals in their normal state, that is,
above T_c, are not Landau Fermi liquids; their
wave function renormalization behaves like 1 +
O(λ² log β) instead of 1 + O(λ²) as in a Landau
Fermi liquid. This behavior has been called
marginal-Fermi-liquid behavior and many attempts
have been devoted to predicting such behavior from [7].
In order to see deviations from Fermi liquid
behavior, one could consider Fermi surfaces with
flat or almost flat sides or corners (which are quite
possible; e.g., in a square lattice with one conduction
electron per atom, such as in the “half-filled
Hubbard model”).

Let us finally consider the last regime, that is,
temperatures lower than O(e^{-a/|λ|}). Except for very
exceptional cases (e.g., asymmetric Fermi surfaces,
i.e., such that ε(k) ≠ ε(−k) except for a finite
number of points, in which Fermi liquid behavior
is found down to T = 0 (Feldman et al. 2002)), a
strong deviation from Fermi liquid behavior is
observed; the interacting Schwinger function is not
similar to the free one and the physical properties in
this regime are totally new.


One-Dimensional Systems up to T = 0


The only case in which the Schwinger functions of
the Hamiltonian [3] can really be computed down to
T = 0 occurs for d = 1; in such a case, an expression
like [9] is no longer valid and the system is not a
Fermi liquid. On the contrary, when u = 0 and for
small repulsive λ > 0, one can prove, for spinning
fermions (see Benfatto and Gallavotti (1995), Gentile
and Mastropietro (2001) and references therein) that

S(k) = \frac{[k_0^2 + v_F^2(\lambda)(|k| - k_F(\lambda))^2]^{\eta(\lambda)}}{-ik_0 + v_F(\lambda)(|k| - k_F(\lambda))}\,(1 + \lambda A(k))    [10]

where k_F(λ) = k_F + O(λ) and η(λ) = aλ² + O(λ³) is a
critical index. This means that the interaction
changes qualitatively the nature of the singularity
at the Fermi surface; S(k) is still diverging at the
Fermi surface but with an exponent which is no
longer 1 but is 1 − 2η(λ), with η(λ) a nonuniversal
(i.e., λ-dependent) critical index. As a consequence,
the physical properties are different with respect to
the free Fermi gas; for instance, the occupation
number n_k is not discontinuous at k = ±k_F(λ) when
T = 0. Nonuniversal critical indices appear in all
the other response functions. Fermionic systems
behaving in this way are called Luttinger liquids,
as they behave like the exactly solvable Luttinger
model describing relativistic spinless fermions
with linear dispersion relation. The solvability of
this model, due to Mattis and Lieb (1966), relies
on the possibility of mapping its Hamiltonian into a
system of free bosons. Such a mapping is not
possible for the Hamiltonian [3], which is not
solvable; however, one can use renormalization
group methods and suitable Ward identities to
show that its behavior is similar to the Luttinger
model (in a sense, one makes perturbation theory
not around the free Fermi gas, but around the
Luttinger model).

If we take into account the interaction with an
external periodic potential with period a, that is,
consider u ≠ 0, we find that if k_F ≠ nπ/a, then the
Schwinger function behaves essentially like [8]. On
the contrary, in the filled-band case, k_F = nπ/a, one
finds that there is still an energy gap which becomes
O(|u|^{1+η_1}) with η_1 = O(λ); this means that the
renormalization of the gap is described by a critical
index; moreover, S(x) decays exponentially for large
|x|. A similar behavior is also observed in the
presence of a quasi-periodic potential. In the
attractive case, λ < 0, u = 0, the behavior is much
less understood; it is believed that the interaction
produces a gap Δ in the spectrum which is
nonanalytic in λ, that S(x) shows an exponential
decay rather than a power-law decay, and that the
interaction converts the system from a metal to an
insulator.

Finally, it is remarkable that a large variety of
models, like Heisenberg spin chains or two-dimensional
classical statistical mechanics models, such as the
eight-vertex or the Ashkin-Teller model, can be
mapped into interacting d = 1 fermionic systems, and
consequently their critical behavior can be understood
by using fermionic techniques (see Gentile and
Mastropietro (2001), and references therein).


Superconductors 


The theory down to T = 0 for d = 2, 3 systems with
dispersion relation |k|²/2m is based only on
approximate computations, predicting the phenomenon
of superconductivity. According to the theory
of Bardeen, Cooper, and Schrieffer (BCS theory),
the interaction between fermions leads to the
formation of a gap in the energy spectrum, below
the critical temperature. There are many ways to
derive the BCS theory. One is based on the fact that
one verifies, by perturbative computations, that the
effective interaction is stronger when the four
momenta of the fermions are such that k_1 ≃ −k_2
and k_3 ≃ −k_4. This suggests, heuristically, replacing
V in [7] with


V_{BCS} = -\frac{\lambda}{L^d} \sum_{k,k'} \psi^+_{k,\sigma}\,\psi^+_{-k,-\sigma}\,\psi^-_{-k',-\sigma}\,\psi^-_{k',\sigma}


which is an interaction between pairs of electrons
with opposite spin and momenta, which are called
Cooper pairs. Replacing V with V_BCS has the great
advantage that it makes the Schwinger functions
exactly computable and explains the mechanism
of superconductivity in many metals (but not in
the recently discovered high-T_c superconductors).
On the other hand, proving that [7] with V or V_BCS
has a similar behavior is still an important open
problem. The two-point Schwinger function in the
model with V_BCS can be written, after the so-called
Hubbard-Stratonovich transformation, as


S(k_0, k) = \frac{\displaystyle\int du\; e^{-\beta L^d V(u)}\, \frac{-ik_0 - \varepsilon(k) + \mu}{k_0^2 + (\varepsilon(k) - \mu)^2 + u^2}}{\displaystyle\int du\; e^{-\beta L^d V(u)}}    [11]


where V(u) is a function with a global minimum at
u = 0 for repulsive interactions, λ < 0, whereas for
λ > 0 and sufficiently small temperatures (for
T ≤ T_c, with T_c = O(e^{-a/λ})), it has the form of a
double well with two minima at u = ±Δ with
Δ = O(e^{-a/λ}); for T greater than T_c, there is only
a global minimum at u = 0. By the saddle-point
theorem, we find, for T < T_c and λ > 0,


\lim_{L\to\infty} S(k_0, k) = \frac{-ik_0 - \varepsilon(k) + \mu}{k_0^2 + (\varepsilon(k) - \mu)^2 + \Delta^2}    [12]

The physical properties predicted by [12] are
completely different with respect to the free case:
the occupation number is continuous, there is an
energy gap in the spectrum, the specific heat is
O(e^{-Δ/T}) and the phenomenon of superconductivity
appears. The fact that the interaction generates a
gap is called mass generation; a similar mechanism
appears in particle theory.


Conclusions 


Many other physical phenomena, observed
experimentally, can be essentially understood by studying
fermionic systems, but a clear mathematical
comprehension is still lacking. We mention: the Kondo
effect, that is, the resistance minimum observed in
some metals due to magnetic impurities; the Mott
transition, in which a strong interaction produces
an insulating state in a system which should be
a conductor; antiferromagnetism; the fractional quantum
Hall effect, and many others. We can say that the
situation in this area of study is reminiscent of
classical mechanics at the end of the nineteenth
century; there is agreement on the models to
consider, which are believed to be able to account
for the marvelous properties of matter
found experimentally, but extracting information
from them requires deeper and more complex analytical
and mathematical investigations.


See also: Falicov-Kimball Model; Fractional Quantum 
Hall Effect; Quantum Statistical Mechanics: Overview; 
Renormalization: Statistical Mechanics and Condensed 
Matter. 
Introduction


In nonrelativistic quantum mechanics, the state of
a d-dimensional particle is represented by a
unit vector ψ in the complex separable Hilbert
space L²(R^d), the so-called “wave function,” while
its time evolution is described by the Schrödinger
equation:

i\hbar \frac{\partial}{\partial t}\psi = -\frac{\hbar^2}{2m}\Delta\psi + V\psi, \qquad \psi(0, x) = \psi_0(x)    [1]

where ħ is the reduced Planck constant, m > 0 is the
mass of the particle, and F = −∇V is an external
force.

In 1942 R P Feynman, following a suggestion by
Dirac, proposed an alternative (Lagrangian)
formulation of quantum mechanics, and a heuristic but
very suggestive representation for the solution of eqn
[1]. According to Feynman, the wave function of the
system at time t evaluated at the point x ∈ R^d is
given as an “integral over histories,” or as an
integral over all possible paths γ in the configuration
space of the system with finite energy passing at the
point x at time t:

\psi(t, x) = \text{``}\Big(\int_{\{\gamma|\gamma(t)=x\}} e^{\frac{i}{\hbar}S^0_t(\gamma)}\,D\gamma\Big)^{-1} \int_{\{\gamma|\gamma(t)=x\}} e^{\frac{i}{\hbar}S_t(\gamma)}\,\psi_0(\gamma(0))\,D\gamma\,\text{''}    [2]

S_t(γ) is the classical action of the system evaluated
along the path γ:

S_t(\gamma) = S^0_t(\gamma) - \int_0^t V(\gamma(s))\,ds    [3]

S^0_t(\gamma) = \frac{m}{2}\int_0^t |\dot\gamma(s)|^2\,ds    [4]

Dγ is a heuristic Lebesgue “flat” measure on the
space of paths and (∫ e^{(i/ħ)S^0_t(γ)} Dγ)^{-1} is a
normalization constant.

Some time later, Feynman himself extended 
formula [2] to more general quantum systems, 
including the case of quantum fields. 

The Feynman path-integral formulation of quantum
mechanics is particularly suggestive, as it
provides a spacetime visualization of quantum
dynamics, reintroducing in quantum mechanics the
concept of trajectory (which was banned in the
“orthodox interpretation” of the theory) and creating
a connection between the classical description of
the physical world and the quantum one. Indeed, it
provides a quantization method, allowing one, at least
heuristically, to associate a quantum evolution to
each classical Lagrangian. Moreover, the application
of the stationary-phase method for oscillatory
integrals allows the study of the semiclassical limit
of the Schrödinger equation, that is, the study of the
detailed behavior of the solution when the Planck
constant is regarded as a parameter converging to 0.
Indeed, when ħ is small, the integrand in [2] is
strongly oscillating and the main contributions to
the integral should come from those paths γ that
make stationary the phase function S_t(γ). These, by
Hamilton's least action principle, are exactly the
classical orbits of the system.

Feynman path integrals allow also a heuristic 
calculus in path space, leading to variational 
calculations of quantities of physical and mathe- 
matical interest. An interesting application can be 
found in topological field theories, as, for 
instance, Chern-Simons models. In this case, 
heuristic calculations based on the Feynman 
path-integral formulation of the theory, where 
the integration is performed on a space of 
geometrical objects, lead to the computation of 
topological invariants. 

Even if from a physical point of view, formula [2] 
is a source of important results, from a mathema- 
tical point of view, it lacks rigor: indeed, neither the 
"infinite-dimensional Lebesgue measure," nor the 
normalization constant in front of the integral is 
well defined. In this article, we shall describe the 
main approaches to the rigorous mathematical 
realization of Feynman path integrals, as well as 
their most important applications. 


Possible Mathematical Definitions 
of Feynman's Measure 


In the rigorous mathematical definition of Feynman's
complex measure

\text{``}\Big(\int_{\{\gamma|\gamma(t)=x\}} e^{\frac{i}{\hbar}S^0_t(\gamma)}\,D\gamma\Big)^{-1} e^{\frac{i}{\hbar}S_t(\gamma)}\,D\gamma\,\text{''}    [5]

one has to face mainly two problems. First of all, the
integral is defined on a space of paths, that is, on an
infinite-dimensional space. The implementation of
an integration theory is nontrivial: for instance, it is
well known that a Lebesgue-type measure cannot be
defined on infinite-dimensional Hilbert spaces.
Indeed, the assumption of the existence of a
σ-additive measure μ which is invariant under
rotations and translations and assigns a positive
finite measure to all bounded open sets leads to a
contradiction. In fact, by taking an orthonormal
system {e_i}_{i∈N} in an infinite-dimensional Hilbert
space H and by considering the open balls
B_i = {x ∈ H, ||x − e_i|| < 1/2}, one has that they are
pairwise disjoint and their union is contained in the
open ball B(0, 2) = {x ∈ H, ||x|| < 2}. By the Euclidean
invariance of the Lebesgue-type measure μ, one can
deduce that μ(B_i) = a, 0 < a < ∞, for all i ∈ N. By the
σ-additivity, one has

\mu(B(0, 2)) \ge \mu(\cup_i B_i) = \sum_i \mu(B_i) = \infty

but, on the other hand, μ(B(0, 2)) should be finite as
B(0, 2) is bounded. As a consequence, we can also
deduce that the term Dγ in [2] does not make sense.

The second problem is the fact that the exponent
in the density e^{(i/ħ)S_t(γ)} is imaginary, so that the
exponential oscillates. Even in finite dimensions,
integrals of the form ∫_{R^N} e^{iΦ(x)} f(x) dx, where
Φ, f: R^N → R are continuous functions and f is not
summable, have to be suitably defined, in order to
exploit the cancellations in the integral due to the
oscillatory behavior of the exponential.

The study of the rigorous foundation of Feynman
path integrals began in the 1960s, when Cameron
proved that Feynman's heuristic complex measure
[5] cannot be realized as a complex bounded
variation σ-additive measure, even on very nice
subsets of the space (R^d)^{[0,t]} of paths, contrary to
the case of complex measures on R^n of the form
e^{(i/2)|x|²} dx. In other words, it is not possible to
implement an integration theory in the traditional
(Lebesgue) sense. As a consequence, mathematicians
tried to realize [5] as a linear continuous
functional on a sufficiently rich Banach algebra of
functions, inspired by the fact that a bounded
measure can be regarded as a continuous functional
on the space of bounded continuous functions.
In order to mirror the features of the heuristic
Feynman measure, such a functional should have
some properties:


1. it should behave in a simple way under “translations
and rotations in path space,” as Dγ denotes
a “flat” measure;

2. it should satisfy a Fubini-type theorem, concerning
iterated integrations in path space (allowing
the construction, in physical applications, of a
one-parameter group of unitary operators);

3. it should be approximable by finite-dimensional
oscillatory integrals, allowing a sequential approach
in the spirit of Feynman's original work; and

4. it should be sufficiently flexible to allow a rigorous
mathematical implementation of an
infinite-dimensional version of the stationary-phase
method and the corresponding study of the
semiclassical limit of quantum mechanics.


Nowadays, several implementations of this program 
can be found in the literature of physics and mathe- 
matics, for instance, by means of analytic continuation 
of Wiener integrals, or as an infinite-dimensional 
distribution in the framework of Hida calculus, or via 
“complex Poisson measures," or via nonstandard 
analysis, or as an infinite-dimensional oscillatory inte- 
gral. The last of these methods is particularly interesting 
as it allows the systematic implementation of an infinite- 
dimensional version of the stationary-phase method, 
which can be applied to the study of the semiclassical 
limit of the solution of the Schrödinger equation [1].


Analytic Continuation 


In one of the first approaches in the definition of 
Feynman path integrals, formula [2] was realized as 
the analytic continuation in a suitable complex 
parameter of a (nonoscillatory) Gaussian integral 
on the space of paths. 

In 1949, inspired by Feynman's work, M Kac
observed that by considering the heat equation

-\hbar \frac{\partial}{\partial t}u = -\frac{\hbar^2}{2m}\Delta u + V(x)u, \qquad u(0, x) = \psi_0(x)    [6]

instead of the Schrödinger equation [1] and by
replacing the oscillatory term e^{(i/ħ)S_t(γ)} in Feynman's
complex measure with the fast decreasing one
e^{-(1/ħ)S_t(γ)}, it is possible to give a well-defined
mathematical meaning to Feynman's heuristic
formula [2] in terms of a well-defined integral on the
space of continuous paths W_{t,x} = {w ∈ C(0, t; R^d):
w(0) = x} with respect to the Wiener Gaussian
measure P_{t,x}:

u(t, x) = \int_{W_{t,x}} e^{-\frac{1}{\hbar}\int_0^t V(\sqrt{\hbar/m}\,w(\tau))\,d\tau}\, \psi_0(\sqrt{\hbar/m}\,w(t))\, dP_{t,x}(w)    [7]

The path-integral representation [7] for the solution of
the heat equation [6] is called the Feynman-Kac formula.
The underlying idea of the analytic continuation
approach comes from the fact that by introducing in
[6] a suitable parameter λ, proportional, for
instance, to the time t, as in the case λ = λ_1,

-\lambda_1\hbar \frac{\partial}{\partial t}u = -\frac{\hbar^2}{2m}\Delta u + V(x)u

u(t, x) = \int_{W_{t,x}} e^{-\frac{1}{\lambda_1\hbar}\int_0^t V(\sqrt{\hbar/(m\lambda_1)}\,w(\tau))\,d\tau}\, \psi_0(\sqrt{\hbar/(m\lambda_1)}\,w(t))\, dP_{t,x}(w)

or to the Planck constant, as in the case λ = λ_2,

\lambda_2 \frac{\partial}{\partial t}u = \frac{\lambda_2^2}{2m}\Delta u + V(x)u

u(t, x) = \int_{W_{t,x}} e^{\frac{1}{\lambda_2}\int_0^t V(\sqrt{\lambda_2/m}\,w(\tau))\,d\tau}\, \psi_0(\sqrt{\lambda_2/m}\,w(t))\, dP_{t,x}(w)

or to the mass, as in the case λ = λ_3,

\frac{\partial}{\partial t}u = \frac{1}{2\lambda_3}\Delta u - iV(x)u

u(t, x) = \int_{W_{t,x}} e^{-i\int_0^t V(\sqrt{1/\lambda_3}\,w(\tau))\,d\tau}\, \psi_0(\sqrt{1/\lambda_3}\,w(t))\, dP_{t,x}(w)

and by allowing λ to assume complex values, one
gets, at least heuristically, the Schrödinger equation and its
solution by substituting, respectively, λ_1 = −i, λ_2 = iħ,
or λ_3 = −im. These procedures can be made
completely rigorous under suitable conditions on the
potential V and initial datum ψ_0.
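
As an illustration of the Feynman-Kac representation [7], the formula can be checked numerically in a case where the solution is known in closed form. The following Python sketch is not part of the original treatment. It assumes units with ħ = m = 1 and Brownian paths started at x, so that u solves ∂u/∂t = (1/2)u'' − Vu, and it uses the test potential V(y) = y²/2 with ψ_0 ≡ 1, for which u(t, x) = (cosh t)^{-1/2} exp(−x² tanh(t)/2). The function names are hypothetical.

```python
import math
import random


def feynman_kac_estimate(x, t, potential, psi0, n_paths=4000, n_steps=50, seed=0):
    """Monte Carlo estimate of E[exp(-int_0^t V(x + W_s) ds) * psi0(x + W_t)].

    Assumption of this sketch: hbar = m = 1 and the Wiener path starts at x.
    """
    rng = random.Random(seed)
    dt = t / n_steps
    sqrt_dt = math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        w = x
        v_prev = potential(w)
        integral = 0.0
        for _ in range(n_steps):
            # Brownian increment over one time step
            w += sqrt_dt * rng.gauss(0.0, 1.0)
            v_next = potential(w)
            # trapezoidal rule for the time integral of the potential
            integral += 0.5 * (v_prev + v_next) * dt
            v_prev = v_next
        total += math.exp(-integral) * psi0(w)
    return total / n_paths


def harmonic_exact(x, t):
    """Closed-form solution for V(y) = y^2/2 and psi0 = 1 (hbar = m = 1)."""
    return math.exp(-0.5 * x * x * math.tanh(t)) / math.sqrt(math.cosh(t))


if __name__ == "__main__":
    estimate = feynman_kac_estimate(0.5, 1.0, lambda y: 0.5 * y * y, lambda y: 1.0)
    print("Monte Carlo:", estimate, "closed form:", harmonic_exact(0.5, 1.0))
```

The Monte Carlo value should agree with the closed form up to statistical and time-discretization errors, both of which shrink as n_paths and n_steps grow. No such sampling scheme is available for the oscillatory weight of [2], which is why the analytic continuation in λ is needed.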


The Approach via Fourier Transform 


This approach has its roots in a couple of papers
by K Ito in the 1960s and was extensively
developed by S Albeverio and R Høegh-Krohn in
the 1970s. The main idea is the definition of
oscillatory integrals with quadratic phase function
on a real separable Hilbert space (H, (·,·)), the
Fresnel integrals,

\widetilde{\int_{\mathcal H}} e^{\frac{i}{2\hbar}\|x\|^2} f(x)\,dx    [8]

as the distributional pairing between e^{(i/2ħ)||x||²} and
a complex-valued function f belonging to the
space F(H) of functions that are Fourier transforms
of complex bounded variation measures on
H, that is,

f(x) = \int_{\mathcal H} e^{i(x,y)}\,d\mu_f(y) \equiv \hat\mu_f(x)

F(H) is a Banach algebra, where the product is the
pointwise one and the identity is the function
f(x) = 1 ∀x ∈ H. The norm of an element f is the
total variation of the corresponding measure μ_f, that
is, ||μ_f|| = sup Σ_i |μ_f(E_i)|, where the supremum is
taken over all sequences {E_i} of pairwise-disjoint
Borel subsets of H, such that ∪_i E_i = H.

Given a function f ∈ F(H), f = μ̂_f, its Fresnel
integral is defined by the Parseval formula:

\widetilde{\int_{\mathcal H}} e^{\frac{i}{2\hbar}\|x\|^2} f(x)\,dx := \int_{\mathcal H} e^{-\frac{i\hbar}{2}\|x\|^2}\,d\mu_f(x)    [9]

where the right-hand side is a well-defined absolutely
convergent integral with respect to a σ-additive
measure on H.
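
The Parseval formula [9] can be tested in the simplest setting H = R. The Python sketch below is an illustration under stated assumptions, not the construction of the text. It takes f to be the Fourier transform of a finite combination of point masses, f(x) = Σ_j c_j e^{i k_j x}, so that the right-hand side of [9] is the finite sum Σ_j c_j e^{−iħ k_j²/2}. It approximates the left-hand side by a Gaussian-regularized integral, normalized by (2πiħ)^{-1/2} so that the Fresnel integral of f ≡ 1 equals 1. The function names, the regularization parameter eps and the quadrature step are hypothetical choices.

```python
import cmath
import math


def regularized_fresnel(f, hbar, eps=0.05, step=0.01):
    """Approximate (2*pi*i*hbar)^(-1/2) * int exp(i x^2 / (2 hbar)) f(x) dx on R.

    A factor exp(-(eps*x)^2 / 2) makes the integral absolutely convergent;
    the Fresnel integral is recovered as eps -> 0 (trapezoidal rule).
    """
    half_width = 8.0 / eps  # the regularizer is about exp(-32) at the cut-off
    n = int(round(2.0 * half_width / step))
    total = 0j
    for j in range(n + 1):
        x = -half_width + j * step
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * cmath.exp(1j * x * x / (2.0 * hbar) - 0.5 * (eps * x) ** 2) * f(x)
    return total * step / cmath.sqrt(2j * math.pi * hbar)


def parseval_side(atoms, hbar):
    """Right-hand side of [9] for the measure sum_j c_j * delta_{k_j}.

    atoms is a list of pairs (k_j, c_j).
    """
    return sum(c * cmath.exp(-0.5j * hbar * k * k) for k, c in atoms)


if __name__ == "__main__":
    k = 1.3
    lhs = regularized_fresnel(lambda x: cmath.exp(1j * k * x), hbar=1.0)
    rhs = parseval_side([(k, 1.0)], hbar=1.0)
    print(lhs, rhs)
```

For small eps the two sides should differ only by terms of order eps², which reflects the fact that the oscillatory left-hand side is determined by an absolutely convergent integral on the Fourier side.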

It is important to recall that this approach
provides the implementation of a method of
stationary phase for the expansion of the integral
in powers of the small parameter ħ occurring in the
integrand. We postpone the discussion of these
results, as well as the application to the solution
of the Schrödinger equation, to the next section
where a generalization of the present approach is
described.


Infinite-Dimensional Oscillatory Integrals 


The main idea of this approach is the extension of
the definition of oscillatory integrals with quadratic
phase function [8] to infinite-dimensional Hilbert
spaces by means of a twofold limiting procedure.
The study of integrals of the form

I(\hbar) := \int_{\mathbb R^N} e^{\frac{i}{\hbar}\Phi(x)} f(x)\,dx    [10]

where Φ: R^N → R is the phase function and
f: R^N → C a complex-valued continuous function,
is a classical topic, largely developed in connection
with various problems in mathematics (such as the
theory of pseudodifferential operators) and physics
(such as optics). Particular effort has been devoted
to the study of the detailed behavior of the above
integral in the limit of “strong oscillations,” that is,
when ħ → 0, by means of the method of stationary
phase.

Thanks to the cancellations due to the oscillatory
term e^{(i/ħ)Φ(x)}, the integral can still be defined, even if
the function f is not summable, as the limit of a
sequence of regularized, hence absolutely convergent,
integrals. Following a proposal by Hörmander,
the oscillatory integral of a function f: R^N → C is
well defined if, for each test function φ ∈ S(R^N) such
that φ(0) = 1, the limit

\lim_{\epsilon\to 0} \int_{\mathbb R^N} e^{\frac{i}{\hbar}\Phi(x)}\, \phi(\epsilon x)\, f(x)\,dx

exists and is independent of φ.
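
The independence of the limit from the regularizing test function can be observed numerically. The following Python sketch is an illustration only. It assumes N = 1, the quadratic phase Φ(x) = x²/2 and the non-summable function f ≡ 1, for which the oscillatory integral equals (2πiħ)^{1/2}, and it compares two different regularizers with φ(0) = 1, a Gaussian and a hyperbolic secant. The names and numerical parameters are hypothetical.

```python
import cmath
import math


def quadratic_phase(x):
    return 0.5 * x * x


def one(x):
    return 1.0


def gaussian(y):
    return math.exp(-0.5 * y * y)


def sech(y):
    return 1.0 / math.cosh(y)


def regularized_integral(phase, f, regularizer, hbar, eps, step=0.01):
    """Trapezoidal approximation of int exp(i*phase(x)/hbar) * regularizer(eps*x) * f(x) dx.

    The integration range is cut at |x| = 40/eps, where both regularizers
    used here are negligible.
    """
    half_width = 40.0 / eps
    n = int(round(2.0 * half_width / step))
    total = 0j
    for j in range(n + 1):
        x = -half_width + j * step
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * cmath.exp(1j * phase(x) / hbar) * regularizer(eps * x) * f(x)
    return total * step


if __name__ == "__main__":
    target = cmath.sqrt(2j * math.pi)  # value of the oscillatory integral for hbar = 1
    for name, reg in (("gaussian", gaussian), ("sech", sech)):
        value = regularized_integral(quadratic_phase, one, reg, hbar=1.0, eps=0.05)
        print(name, value, "target", target)
```

Both regularized values should approach the same number as eps decreases, with a discrepancy of order eps² for these two choices; this is the finite-dimensional mechanism that the Hilbert space definition discussed next is designed to preserve.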

This definition was generalized in the 1980s
by D Elworthy and A Truman to the case where the
underlying space R^N is replaced by a real separable
infinite-dimensional Hilbert space (H, (·,·)), under
the assumption that the phase function is quadratic,
that is, Φ(x) = ||x||²/2. The “infinite-dimensional
oscillatory integral”

\widetilde{\int_{\mathcal H}} e^{\frac{i}{2\hbar}\|x\|^2} f(x)\,dx

is defined as the limit of a sequence of
finite-dimensional approximations. More precisely, a
function f: H → C is “integrable” if, for each
increasing sequence {P_n}_{n∈N} of finite-dimensional
projection operators in H converging strongly to the
identity operator as n → ∞, the limit

\lim_{n\to\infty} \Big(\int_{P_n\mathcal H} e^{\frac{i}{2\hbar}\|P_n x\|^2}\,d(P_n x)\Big)^{-1} \int_{P_n\mathcal H} e^{\frac{i}{2\hbar}\|P_n x\|^2} f(P_n x)\,d(P_n x)    [11]

exists and is independent of the sequence {P_n}_{n∈N}.
In this case, the limit is denoted by

\widetilde{\int_{\mathcal H}} e^{\frac{i}{2\hbar}\|x\|^2} f(x)\,dx

The description of the largest class of integrable
functions is still an open problem, even in finite
dimension, but it is possible to find some interesting
subsets of it. In particular, any function belonging
to F(H), the Banach algebra considered in the
approach by Fourier transform, is integrable. Indeed,
by assuming that the function f in [11] is of the type

f(x) = e^{-\frac{i}{2\hbar}(x, Lx)}\, g(x)

where L: H → H is a linear self-adjoint trace-class
operator on H such that (I − L) is invertible and
g ∈ F(H), that is, g(x) = ∫_H e^{i(x,y)} dμ_g(y), then it is
possible to prove that f is integrable in the sense of
definition [11] and the corresponding
infinite-dimensional oscillatory integral can be explicitly
computed in terms of a well-defined integral with
respect to a bounded variation measure μ_g by means
of the following Parseval-type equality:

\widetilde{\int_{\mathcal H}} e^{\frac{i}{2\hbar}\|x\|^2}\, e^{-\frac{i}{2\hbar}(x, Lx)}\, g(x)\,dx = (\det(I - L))^{-1/2} \int_{\mathcal H} e^{-\frac{i\hbar}{2}(x, (I-L)^{-1}x)}\,d\mu_g(x)    [12]

det(I − L) being the Fredholm determinant of the
operator I − L, that is, the product of its eigenvalues,
counted with their multiplicity. If L = 0, then we
obtain eqn [9], so that we can look at the
infinite-dimensional oscillatory integrals approach as a
generalization of the Fourier transform approach, since it
allows, at least in principle, the integration of a class of
functions larger than F(H). In fact, recently this feature
has been used by S Albeverio and S Mazzucchi in the
proof of a Parseval-type equality similar to [12] for
infinite-dimensional oscillatory integrals with
polynomially growing phase functions.

Feynman's heuristic formula [2] for the representation
of the solution of the Schrödinger equation [1] can
be realized as an infinite-dimensional oscillatory integral
on the Hilbert space H_t of absolutely continuous paths
γ: [0, t] → R^d with fixed endpoint γ(t) = 0 and finite
kinetic energy ∫_0^t |γ̇(τ)|² dτ < ∞, with the inner
product (γ_1, γ_2) = ∫_0^t γ̇_1(τ)·γ̇_2(τ) dτ. One has to take
an initial datum ψ_0 ∈ F(R^d), that is, the Fourier
transform of a complex bounded variation measure μ_0 on R^d:
ψ_0(x) = ∫_{R^d} e^{ikx} dμ_0(k). Moreover, one has to
assume that the potential V in [1] is the sum of a
harmonic oscillator part plus a bounded perturbation
V_1 that is the Fourier transform of a complex bounded
variation measure μ_v on R^d:

V(x) = \tfrac{1}{2}\,x\cdot\Omega^2 x + V_1(x), \qquad V_1(x) = \int_{\mathbb R^d} e^{ikx}\,d\mu_v(k)

(Ω² being a symmetric positive d × d matrix).
In this case, it is possible to prove that the linear
operator L on H_t defined by

(\gamma_1, L\gamma_2) = \int_0^t \gamma_1(\tau)\cdot\Omega^2\gamma_2(\tau)\,d\tau

is self-adjoint and trace class, and (I − L) is invertible.
Moreover, by considering the function v: H_t → C

v(\gamma) = \int_0^t V_1(\gamma(\tau) + x)\,d\tau + 2x\,\Omega^2\!\int_0^t \gamma(\tau)\,d\tau, \qquad \gamma \in \mathcal H_t


it is possible to prove that the function f: H_t → C
given by

f(\gamma) = e^{-\frac{i}{\hbar}v(\gamma)}\, \psi_0(\gamma(0) + x)

is the Fourier transform of a complex bounded
variation measure μ_f on H_t, and the
infinite-dimensional Fresnel integral of the function
g(γ) = e^{-(i/2ħ)(γ, Lγ)} f(γ), that is,

\widetilde{\int_{\mathcal H_t}} e^{\frac{i}{2\hbar}\int_0^t |\dot\gamma(\tau)|^2 d\tau - \frac{i}{\hbar}\int_0^t V(\gamma(\tau) + x)\,d\tau}\, \psi_0(\gamma(0) + x)\,d\gamma
  = \widetilde{\int_{\mathcal H_t}} e^{\frac{i}{2\hbar}(\gamma, (I-L)\gamma)}\, e^{-\frac{i}{\hbar}v(\gamma)}\, \psi_0(\gamma(0) + x)\,d\gamma    [13]

is well defined and it is equal to

\det(I - L)^{-1/2} \int_{\mathcal H_t} e^{-\frac{i\hbar}{2}(\gamma, (I-L)^{-1}\gamma)}\,d\mu_f(\gamma)

Moreover, it is a representation of the solution of
equation [1] evaluated at x ∈ R^d at time t. Recently,
solutions of the Schrödinger equation with quartic
anharmonic potential via infinite-dimensional
oscillatory integrals have been provided by S Albeverio
and S Mazzucchi using a combination of the Parseval
formula and a new analytic method (the inclusion of
such potentials had been a stumbling block for many
years).
In this framework, it is possible to implement an
infinite-dimensional version of the stationary-phase
method and study the asymptotic behavior of the
oscillatory integrals in the limit ħ → 0.

The method of stationary phase was originally
proposed by Stokes, who noted that when ħ → 0 the
oscillatory integral [10] is O(ħ^n) for any n ∈ N,
provided that there are no critical points of the
phase function Φ in the support of the function f. As
a consequence, one can deduce that the leading
contribution to the integral [10] should come from a
neighborhood of those points c ∈ R^N such that
∇Φ(c) = 0. More precisely, by assuming that the set
C of critical points is finite, that is, C = {c_1, ..., c_k},
and that every critical point is nondegenerate, that
is, det D²Φ(c_i) ≠ 0 ∀c_i ∈ C, then one has

I(\hbar) = \sum_{c_i \in C} e^{\frac{i}{\hbar}\Phi(c_i)}\, I^*_i(\hbar)    [14]

where I*_i: R → C are C^∞ functions of ħ, such that

I^*_i(0) = f(c_i)\,(2\pi i\hbar)^{N/2}\,(\det D^2\Phi(c_i))^{-1/2}

If some critical point is degenerate, the situation is
more complicated: one has to take into account the
type of degeneracy and apply the theory of
unfoldings of singularities.
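
The leading behavior in the nondegenerate case can be checked numerically in one dimension. The Python sketch below is an illustration under stated assumptions and not part of the original treatment. It takes N = 1, the phase Φ(x) = x²/2 + x⁴/4, whose only critical point is c = 0 with Φ''(0) = 1, and the summable amplitude f(x) = e^{−x²}, and it compares the integral [10], evaluated by quadrature, with the leading contribution e^{(i/ħ)Φ(c)} f(c) (2πiħ)^{1/2} (Φ''(c))^{-1/2}. The function names and quadrature parameters are hypothetical.

```python
import cmath
import math


def quartic_phase(x):
    """Phase with a single nondegenerate critical point at x = 0."""
    return 0.5 * x * x + 0.25 * x ** 4


def gaussian_amplitude(x):
    return math.exp(-x * x)


def oscillatory_integral(phase, amplitude, hbar, half_width=4.0, step=2e-4):
    """Trapezoidal approximation of int exp(i*phase(x)/hbar) * amplitude(x) dx.

    The range is cut at |x| = half_width, where the amplitude used here is
    about exp(-16).
    """
    n = int(round(2.0 * half_width / step))
    total = 0j
    for j in range(n + 1):
        x = -half_width + j * step
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * cmath.exp(1j * phase(x) / hbar) * amplitude(x)
    return total * step


def leading_term(phase_at_c, amplitude_at_c, second_derivative_at_c, hbar):
    """Stationary-phase leading contribution of one nondegenerate critical point (N = 1)."""
    return (cmath.exp(1j * phase_at_c / hbar) * amplitude_at_c
            * cmath.sqrt(2j * math.pi * hbar / second_derivative_at_c))


if __name__ == "__main__":
    for hbar in (0.04, 0.02):
        exact = oscillatory_integral(quartic_phase, gaussian_amplitude, hbar)
        lead = leading_term(quartic_phase(0.0), gaussian_amplitude(0.0), 1.0, hbar)
        print(hbar, abs(exact - lead) / abs(lead))
```

The relative discrepancy should decrease roughly in proportion to ħ, consistent with the leading term being the first term of an expansion in powers of ħ.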

These results can be generalized to
infinite-dimensional oscillatory integrals of the form

I(\hbar) = \widetilde{\int_{\mathcal H}} e^{\frac{i}{2\hbar}(x, (I-L)x)}\, e^{-\frac{i}{\hbar}v(x)}\, g(x)\,dx    [15]

with v(x) = ∫_H e^{i(x,y)} dμ(y), g(x) = ∫_H e^{i(x,y)} dν(y), μ, ν
being complex bounded variation measures on H
satisfying suitable assumptions, and L: H → H a
self-adjoint and trace-class linear operator, such that
(I − L) is invertible. Under suitable growth conditions on
the moments of the measures μ, ν and by assuming
that the phase function Φ(x) = (1/2)(x, (I − L)x) − v(x)
has a finite number of nondegenerate critical points
c_1, ..., c_s, it is possible to prove that the integral I(ħ)
in [15] is equal to

I(\hbar) = \sum_{k=1}^{s} e^{\frac{i}{\hbar}\Phi(c_k)}\, I^*_k(\hbar) + I^*_0(\hbar)

for some C^∞ functions I*_k satisfying:

I^*_k(0) = [\det(I - L - D^2 v(c_k))]^{-1/2}\, g(c_k), \qquad k = 1, \ldots, s

\frac{d^j}{d\hbar^j} I^*_0(0) = 0, \qquad j = 0, 1, 2, \ldots


Moreover, under some additional smallness
assumptions on v, it has been proved that the phase function
Φ has a unique stationary point c and, as ħ → 0,

I(\hbar) = e^{\frac{i}{\hbar}\Phi(c)}\, I^*(\hbar)

for some C^∞ function I*. Each term of the
asymptotic expansion in powers of ħ of the function
I* can be explicitly computed, and it is possible to
prove that such an asymptotic expansion is
Borel-summable and determines I* uniquely.

The application of these results to the
infinite-dimensional oscillatory integral representation [13]
for the solution of the Schrödinger equation allows
the study of its semiclassical limit. One has to
consider a potential V that is the Fourier transform
of a complex bounded variation measure μ on R^d
such that ∫_{R^d} e^{ε|β|} d|μ|(β) < ∞ for some ε > 0, and a
particular form for the initial wave function
ψ_0(x) = e^{(i/ħ)φ(x)} χ(x), where φ is real and
φ, χ ∈ C_0^∞(R^d) are independent of ħ. This initial
datum corresponds to an initial particle distribution
ρ_0(x) = |χ(x)|² and to a limiting value of the
probability current J_{ħ=0} = ∇φ(x)ρ_0(x)/m, giving an
initial particle flux associated to the velocity field
∇φ(x)/m. One also has to assume that the Lagrange
manifold L_φ ≡ (y, −∇φ) intersects transversally the
subset Λ_V of the phase space made of all points (y, p)
such that p is the momentum at y of a classical
particle that starts at time zero from x, moves under
the action of V, and ends at y at time t. In this case,
the Feynman path integral [13] has an asymptotic
expansion in powers of ħ for ħ → 0, whose leading
term is the sum of the values of the function

\left|\det\left(\frac{\partial \gamma^{(j)}_k}{\partial y^{(j)}_l}(y^{(j)}, t)\right)\right|^{-1/2} e^{-\frac{i}{2}\pi m^{(j)}}\, e^{\frac{i}{\hbar}S^{(j)}}\, e^{\frac{i}{\hbar}\phi(y^{(j)})}\, \chi(y^{(j)})

taken at the points y^{(j)} such that a classical particle
starting at y^{(j)} at time zero with momentum ∇φ(y^{(j)})
is at x at time t. S^{(j)} is the classical action along this
classical path γ^{(j)} and m^{(j)} is the Maslov index of the
path γ^{(j)}, that is, m^{(j)} is the number of zeros of

\det\left(\frac{\partial \gamma^{(j)}_k}{\partial y^{(j)}_l}(y^{(j)}, \tau)\right)

as τ varies on the interval (0, t).


White-Noise Calculus 


The leading idea of the present approach, which was
originally proposed by C DeWitt-Morette and
P Krée and presently realized in the framework of
white-noise calculus by T Hida, L Streit, and many
other authors, is the realization of the Feynman
integrand e^{(i/ħ)S_t(γ)} as an infinite-dimensional
distribution. This idea is similar to the one of the
approach via Fourier transform, where the expression
(2πi)^{-d/2} ∫_{R^d} e^{(i/2)(x,x)} f(x) dx is realized as a
distributional pairing between e^{(i/2)(x,x)}/(2πi)^{d/2} and
the function f ∈ F(R^d) by means of the Parseval-type
equality [9] and generalized to infinite-dimensional
spaces. In white-noise calculus, the pairing is
realized in a different measure space. Indeed, by
manipulating the integrand in

(2\pi i)^{-d/2} \int_{\mathbb R^d} e^{\frac{i}{2}(x,x)} f(x)\,dx

one has

(2\pi i)^{-d/2} \int_{\mathbb R^d} e^{\frac{i}{2}(x,x)} f(x)\,dx
  = \int_{\mathbb R^d} \frac{e^{\frac{i}{2}(x,x) + \frac{1}{2}(x,x)}}{i^{d/2}}\, f(x)\, \frac{e^{-\frac{1}{2}(x,x)}}{(2\pi)^{d/2}}\,dx    [16]

where the latter line can be interpreted as the
distributional pairing of

\frac{e^{\frac{i}{2}(x,x) + \frac{1}{2}(x,x)}}{i^{d/2}}

and f not with respect to Lebesgue measure but
rather with respect to the standard Gaussian
measure

\frac{e^{-\frac{1}{2}(x,x)}}{(2\pi)^{d/2}}\,dx

on R^d. The RHS of [16] can be generalized to the
case in which R^d is replaced by a path space, thanks
to the fact that on infinite-dimensional spaces, even
if Lebesgue measure is meaningless, Gaussian
measures are well defined and can be used as
reference measures. The detailed realization of this
idea as well as its application to the mathematical
realization of the Feynman integrand are rather
technical and we do not give details
here. We recall that this approach has been
successfully applied to the rigorous realization of the Feynman
path-integral formulation of Chern-Simons models.


Other Possible Approaches 


Another possible mathematical definition of Feyn- 
man path integrals is based on Poisson measures. It 
was originally proposed by A M Chebotarev and 
V P Maslov and further developed by several 
authors such as S  Albeverio, Ph Blanchard, 
Ph Combe, R Hoegh-Krohn, M Sirugue, and 
V Kolokol'tsov. It can be applied to “phase-space 
integrals," to the Dirac equation and in particular 
algebraic settings, as well as to the Schrödinger 


equation, with potentials of the same type “Fourier 
transform of bounded measure” discussed in the 
subsection “Infinite-dimensional oscillatory integrals.” 

Another possible definition of Feynman path
integrals is based on a “time-slicing” approximation
and a limiting procedure, rather close to Feynman’s
original work, based on the Trotter product formula.
The “sequential approach” was proposed originally
by A Truman and further extensively developed by
D Fujiwara and N Kumano-go. The paths $\gamma$ in
formula [2] are approximated by piecewise linear
paths and the Feynman path integral is correspondingly
approximated by a finite-dimensional integral.
In particular, D Fujiwara and N Kumano-go proved
that the integrals defined in this way have some
important properties, such as invariance under
translations and orthogonal transformations. It is
also possible to interchange the order of integration
with Riemann-Stieltjes integrals and study the
semiclassical approximation.

Finally, it is worthwhile to recall a very interesting
and intuitive approach to Feynman integration
which is based on nonstandard analysis. It was
introduced by S Albeverio, J E Fenstad, R Høegh-Krohn,
and T Lindstrøm in the 1980s, but it has not
been systematically developed yet.


Abbreviations 

$D\gamma$  Heuristic Lebesgue-type measure on the space of paths

$P_{t,x}$  Wiener Gaussian measure on $W_{t,x}$

$S_t$  Action functional

$S^0_t$  Action functional for the free particle

V  Potential

$W_{t,x}$  Space of continuous paths with fixed initial point,
$W_{t,x} = \{w \in C(0,t;\mathbb{R}^d) : w(0) = x\}$

$\hbar$  Reduced Planck constant

$\Phi$  Phase function

$\gamma$  Path, $\gamma : [0,t] \to \mathbb{R}^d$

$\hat{\mu}$  Fourier transform of the measure $\mu$

$\psi$  Wave function, solution of the Schrödinger equation

$\mathcal{H}$  Hilbert space

See Fresnel integral on the Hilbert space H 

fs Infinite-dimensional oscillatory integral on the 
H Hilbert space H 

$\langle\ ,\ \rangle$  Inner product

$\|\ \|$  Norm


See also: Chern-Simons Models: Rigorous Results; 
Euclidean Field Theory; Functional Integration in 
Quantum Physics; Path Integrals in Noncommutative 
Geometry; Quillen Determinant; Singularity and 
Bifurcation Theory; Stationary Phase Approximation. 
Introduction


Algebras and their representations are ubiquitous in 
mathematics. It turns out that representations of 
finite-dimensional algebras are intimately related to 
quivers, which are simply oriented graphs. Quivers 


arise naturally in many areas of mathematics, 
including representation theory, algebraic and dif- 
ferential geometry, Kac-Moody algebras, and quan- 
tum groups. In this article, we give a brief overview 
of some of these topics. We start by giving the basic 
definitions of associative algebras and their repre- 
sentations. We then introduce quivers and their 
representation theory, mentioning the connection to 
the representation theory of associative algebras. We 
also discuss in some detail the relationship between 
quivers and the theory of Lie algebras. 
Associative Algebras


An “algebra” is a vector space A over a field k
equipped with a multiplication which is distributive
and such that

$$a(xy) = (ax)y = x(ay), \quad \forall\, a \in k,\ x, y \in A$$

When we wish to make the field explicit, we call A a
k-algebra. An algebra is “associative” if $(xy)z = x(yz)$
for all $x, y, z \in A$. A has a “unit,” or “multiplicative
identity,” if it contains an element $1_A$ such that
$1_A x = x 1_A = x$ for all $x \in A$. From now on, we will
assume all algebras are associative with unit. A is said
to be “commutative” if $xy = yx$ for all $x, y \in A$ and
finite dimensional if the underlying vector space of A
is finite dimensional.

A vector subspace I of A is called a “left (resp.
right) ideal” if $xy \in I$ for all $x \in A$, $y \in I$ (resp.
$x \in I$, $y \in A$). If I is both a right and a left ideal, it
is called a two-sided ideal of A. If I is a two-sided
ideal of A, then the factor space A/I is again an
algebra.

An algebra homomorphism is a linear map
$f : A_1 \to A_2$ between two algebras such that

$$f(1_{A_1}) = 1_{A_2}, \qquad f(xy) = f(x)f(y), \quad \forall\, x, y \in A_1$$

A representation of an algebra A is an algebra
homomorphism $\rho : A \to \mathrm{End}_k(V)$ for a k-vector
space V. Here $\mathrm{End}_k(V)$ is the space of endomorphisms
of the vector space V with multiplication
given by composition. Given a representation of
an algebra A on a vector space V, we may view V
as an A-module with the action of A on V given
by

$$a \cdot v = \rho(a)v, \quad a \in A,\ v \in V$$

A morphism $\psi : V \to W$ of two A-modules (or
equivalently, representations of A) is a linear map
commuting with the action of A. That is, it is a
linear map satisfying

$$a \cdot \psi(v) = \psi(a \cdot v), \quad \forall\, a \in A,\ v \in V$$


Let G be a commutative monoid (a set with an
associative, commutative operation and a unit element). A
G-graded k-algebra is a k-algebra which can be
expressed as a direct sum $A = \bigoplus_{g \in G} A_g$ such that
$aA_g \subseteq A_g$ for all $a \in k$ and $A_{g_1}A_{g_2} \subseteq A_{g_1+g_2}$ for all
$g_1, g_2 \in G$. A morphism $\psi : A \to B$ of G-graded
algebras is a k-algebra morphism respecting the
grading, that is, satisfying $\psi(A_g) \subseteq B_g$ for all
$g \in G$.


Quivers and Path Algebras 


A “quiver” is simply an oriented graph. More
precisely, a quiver is a pair $Q = (Q_0, Q_1)$ where $Q_0$
is a finite set of vertices and $Q_1$ is a finite set of
arrows (oriented edges) between them. For $\rho \in Q_1$,
we let $h(\rho)$ denote the “head” of $\rho$ and $t(\rho)$ denote the
“tail” of $\rho$. A path in Q is a sequence $x = \rho_1\rho_2\cdots\rho_m$
of arrows such that $h(\rho_{i+1}) = t(\rho_i)$ for $1 \le i \le m-1$.
We let $t(x) = t(\rho_m)$ and $h(x) = h(\rho_1)$ denote the initial
and final vertices of the path x. For each vertex
$i \in Q_0$, we let $e_i$ denote the trivial path which starts
and ends at the vertex i.

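Fix a field k. The path algebra kQ associated to a
quiver Q is the k-algebra whose underlying vector
space has basis the set of paths in Q, and with the
product of paths given by concatenation. Thus, if
$x = \rho_1\cdots\rho_m$ and $y = \sigma_1\cdots\sigma_n$ are two paths, then
$xy = \rho_1\cdots\rho_m\sigma_1\cdots\sigma_n$ if $h(y) = t(x)$ and $xy = 0$ otherwise.
We also have

$$e_i e_j = \begin{cases} e_i & \text{if } i = j \\ 0 & \text{if } i \ne j \end{cases}$$

$$e_i x = \begin{cases} x & \text{if } h(x) = i \\ 0 & \text{if } h(x) \ne i \end{cases}$$

$$x e_i = \begin{cases} x & \text{if } t(x) = i \\ 0 & \text{if } t(x) \ne i \end{cases}$$

for any path x in Q. This multiplication is associative. Note
that, writing A = kQ, the spaces $e_iA$ and $Ae_i$ have bases given
by the set of paths ending and starting at i, respectively. The path
algebra has a unit given by $\sum_{i \in Q_0} e_i$.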


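Example 1 Let Q be the following quiver:

    1 --ρ--> 2 --σ--> 3 <--λ-- 4

Then kQ has a basis given by the set of paths
$\{e_1, e_2, e_3, e_4, \rho, \sigma, \lambda, \sigma\rho\}$. Some sample products are
$\rho\sigma = 0$, $\lambda\lambda = 0$, $\lambda\sigma = 0$, $e_3\sigma = \sigma e_2 = \sigma$, $e_2\sigma = 0$.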
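The multiplication rule can be tried out on a small case. The following is a minimal Python sketch, assuming the quiver of Example 1 has arrows ρ: 1 → 2, σ: 2 → 3 and λ: 4 → 3 (the orientation consistent with the Euler matrix given in Example 6). The path encoding and the names `ARROWS`, `trivial`, `arrow`, `multiply` and `all_paths` are illustrative choices, not notation from the text.

```python
# Arrows of the assumed quiver 1 -> 2 -> 3 <- 4, stored as name: (tail, head).
ARROWS = {"rho": (1, 2), "sigma": (2, 3), "lam": (4, 3)}


def trivial(i):
    """Trivial path e_i: no arrows, starting and ending at vertex i."""
    return ((), i, i)


def arrow(name):
    """Path consisting of a single arrow."""
    tail, head = ARROWS[name]
    return ((name,), tail, head)


def multiply(x, y):
    """Product xy of two paths (traverse y first, then x); None stands for 0."""
    if x is None or y is None:
        return None
    x_arrows, x_tail, x_head = x
    y_arrows, y_tail, y_head = y
    if y_head != x_tail:  # h(y) != t(x): the paths do not compose
        return None
    return (x_arrows + y_arrows, y_tail, x_head)


def all_paths():
    """List every path of an acyclic quiver by extending paths on the left."""
    vertices = sorted({v for edge in ARROWS.values() for v in edge})
    paths = [trivial(i) for i in vertices]
    frontier = [arrow(a) for a in ARROWS]
    while frontier:
        paths.extend(frontier)
        frontier = [p for a in ARROWS for x in frontier
                    if (p := multiply(arrow(a), x)) is not None]
    return paths
```

Under these assumptions `all_paths()` returns eight paths, matching the basis listed in Example 1, and `multiply(arrow("sigma"), arrow("rho"))` is the only nonzero product of two arrows.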


Example 2 Let Q be the following quiver (the
so-called “Jordan quiver”).

[Figure: a single vertex 1 with one loop ρ]

Then $kQ \cong k[t]$, the algebra of polynomials in one
variable.


Note that the path algebra kQ is finite dimensional
if and only if Q has no oriented cycles (nontrivial paths
with the same head and tail vertex).


Example 3 Let Q be the following quiver:

    1 ---> 2 ---> 3 ---> ... ---> n-2 ---> n-1 ---> n

Then for every $1 \le i \le j \le n$, there is a unique path
from i to j. Let $f : kQ \to M_n(k)$ be the linear map from
the path algebra to the $n \times n$ matrices with entries in
the field k that sends the unique path from i to j to the
matrix $E_{ji}$ with (j, i) entry 1 and all other entries zero.
Then one can show that f is an isomorphism onto the
algebra of lower triangular matrices.
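The claim that f respects multiplication can be checked by brute force for small n. The sketch below is only an illustration: it encodes the unique path from i to j as the pair `(i, j)`, and the helper names (`matrix_unit`, `matmul`, `f`, `path_product`, `is_multiplicative`) are made up for this purpose.

```python
def matrix_unit(n, row, col):
    """n x n matrix with a 1 in position (row, col), 1-indexed, and 0 elsewhere."""
    return [[1 if (r, c) == (row, col) else 0 for c in range(1, n + 1)]
            for r in range(1, n + 1)]


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[r][m] * b[m][c] for m in range(n)) for c in range(n)]
            for r in range(n)]


def f(n, path):
    """Image E_{ji} of the unique path from i to j, encoded as the pair (i, j)."""
    i, j = path
    return matrix_unit(n, j, i)


def path_product(x, y):
    """Product xy (first y, then x) of paths encoded as (start, end); None is 0."""
    if y[1] != x[0]:
        return None
    return (y[0], x[1])


def is_multiplicative(n):
    """Check f(x) f(y) = f(xy) for every pair of paths, trivial paths included."""
    zero = [[0] * n for _ in range(n)]
    paths = [(i, j) for i in range(1, n + 1) for j in range(i, n + 1)]
    for x in paths:
        for y in paths:
            xy = path_product(x, y)
            expected = zero if xy is None else f(n, xy)
            if matmul(f(n, x), f(n, y)) != expected:
                return False
    return True
```

Every matrix `f(n, (i, j))` has its single nonzero entry on or below the diagonal because i ≤ j, which is why the image consists of lower triangular matrices.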


Representations of Quivers 


Fix a field k. A representation of a quiver Q is an
assignment of a vector space to each vertex and to
each arrow a linear map between the vector spaces
assigned to its tail and head. More precisely, a
representation V of Q is a collection

$$\{V_i \mid i \in Q_0\}$$

of finite-dimensional k-vector spaces together with a
collection

$$\{V_\rho : V_{t(\rho)} \to V_{h(\rho)} \mid \rho \in Q_1\}$$

of k-linear maps. Note that a representation V of a
quiver Q is equivalent to a representation of the
path algebra kQ. The dimension of V is the map
$d_V : Q_0 \to \mathbb{Z}_{\ge 0}$ given by $d_V(i) = \dim V_i$ for $i \in Q_0$.

If V and W are two representations of a quiver Q,
then a morphism $\psi : V \to W$ is a collection of k-linear
maps

$$\{\psi_i : V_i \to W_i \mid i \in Q_0\}$$

such that

$$W_\rho\,\psi_{t(\rho)} = \psi_{h(\rho)}\,V_\rho, \quad \forall\, \rho \in Q_1$$


Proposition 1 Let A be a finite-dimensional
k-algebra. Then the category of representations of
A is equivalent to the category of representations of
the algebra kQ/I for some quiver Q and some two-sided
ideal I of kQ.


It is for this reason that the study of finite- 
dimensional associative algebras is intimately related 
to the study of quivers. 

We define the direct sum $V \oplus W$ of two representations
V and W of a quiver Q by

$$(V \oplus W)_i = V_i \oplus W_i, \quad i \in Q_0$$

and $(V \oplus W)_\rho : V_{t(\rho)} \oplus W_{t(\rho)} \to V_{h(\rho)} \oplus W_{h(\rho)}$ by

$$(V \oplus W)_\rho((v, w)) = (V_\rho(v), W_\rho(w))$$

for $v \in V_{t(\rho)}$, $w \in W_{t(\rho)}$, $\rho \in Q_1$. A representation V is
“trivial” if $V_i = 0$ for all $i \in Q_0$ and “simple” if its
only subrepresentations are the zero representation
and V itself. We say that V is “decomposable” if it is
isomorphic to $W \oplus U$ for some nontrivial representations
W and U. Otherwise, we call V “indecomposable.”
Every representation of a quiver has a
decomposition into indecomposable representations
that is unique up to isomorphism and permutation
of the components. Thus, to classify all representations
of a quiver, it suffices to classify the indecomposable
representations.

Example 4 Let Q be the following quiver:

    1 --ρ--> 2

Then Q has three indecomposable representations
U, V, and W given by:

    U_1 = k,   U_2 = 0,   U_ρ = 0
    V_1 = 0,   V_2 = k,   V_ρ = 0
    W_1 = k,   W_2 = k,   W_ρ = 1

Then any representation Z of Q is isomorphic to

$$Z \cong U^{d_1 - r} \oplus V^{d_2 - r} \oplus W^{r}$$

where $d_1 = \dim Z_1$, $d_2 = \dim Z_2$, $r = \operatorname{rank} Z_\rho$.
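The decomposition in Example 4 is determined by three numbers, so it is easy to compute. The following Python sketch assumes the field is the rationals and that the map $Z_\rho$ is given as a nonempty matrix with $d_2$ rows and $d_1$ columns. The function names `rank` and `multiplicities` are illustrative only.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix (list of rows) over the rationals, by Gaussian elimination."""
    rows = [[Fraction(entry) for entry in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    found = 0  # number of pivots found so far
    for col in range(n_cols):
        pivot = next((r for r in range(found, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(found + 1, len(rows)):
            factor = rows[r][col] / rows[found][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[found])]
        found += 1
    return found


def multiplicities(z_rho):
    """Multiplicities (of U, of V, of W) in Z, read off from the matrix of Z_rho."""
    d2 = len(z_rho)      # dim Z_2 = number of rows
    d1 = len(z_rho[0])   # dim Z_1 = number of columns
    r = rank(z_rho)
    return (d1 - r, d2 - r, r)
```

For instance, a map from a three-dimensional space to a two-dimensional space with rank 1 gives two copies of U, one copy of V, and one copy of W.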


Example 5 Let Q be the Jordan quiver. Then
representations V of Q are classified up to isomorphism
by the Jordan normal form of $V_\rho$, where $\rho$
is the single arrow of the quiver. Indecomposable
representations correspond to single Jordan blocks.
These are parametrized by a discrete parameter n
(the size of the block) and a continuous parameter $\lambda$
(the eigenvalue of the block).


A quiver is said to be of “finite type” if it has only
finitely many indecomposable representations (up to
isomorphism). If a quiver has infinitely many
isomorphism classes of indecomposable representations
but they can be split into
families, each parametrized by a single continuous
parameter, then we say the quiver is of “tame” (or
“affine”) type. If a quiver is of neither finite nor
tame type, it is of “wild type.” It turns out that there
is a rather remarkable relationship between the
classification of quivers and their representations
and the theory of Kac-Moody algebras.

The “Euler form” or “Ringel form” of a quiver Q
is defined to be the asymmetric bilinear form on $\mathbb{Z}^{Q_0}$
given by

$$\langle \alpha, \beta \rangle = \sum_{i \in Q_0} \alpha(i)\beta(i) - \sum_{\rho \in Q_1} \alpha(t(\rho))\,\beta(h(\rho))$$

In the standard coordinate basis of $\mathbb{Z}^{Q_0}$, the Euler
form is represented by the matrix $E = (a_{ij})$ where

$$a_{ij} = \delta_{ij} - \#\{\rho \in Q_1 \mid t(\rho) = i,\ h(\rho) = j\}$$

Here $\delta_{ij}$ is the Kronecker delta symbol. We define
the “Cartan form” of the quiver Q to be the
symmetric bilinear form given by

$$(\alpha, \beta) = \langle \alpha, \beta \rangle + \langle \beta, \alpha \rangle$$

Note that the Cartan form is independent of the
orientation of the arrows in Q. In the standard
coordinate basis of $\mathbb{Z}^{Q_0}$, the Cartan form is represented
by the Cartan matrix $C = (c_{ij})$ where $c_{ij} = a_{ij} + a_{ji}$.


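Example 6 For the quiver in Example 1, the Euler
matrix is

$$E = \begin{pmatrix} 1 & -1 & 0 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & -1 & 1 \end{pmatrix}$$

and the Cartan matrix is

$$C = \begin{pmatrix} 2 & -1 & 0 & 0 \\ -1 & 2 & -1 & 0 \\ 0 & -1 & 2 & -1 \\ 0 & 0 & -1 & 2 \end{pmatrix}$$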
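Both matrices follow mechanically from the list of arrows. The sketch below recomputes them in Python, assuming the quiver of Example 1 has arrows 1 → 2, 2 → 3 and 4 → 3. The names `euler_matrix` and `cartan_matrix` are illustrative.

```python
def euler_matrix(n_vertices, arrows):
    """Matrix (a_ij) with a_ij = delta_ij - #{arrows from i to j}; vertices are 1..n."""
    matrix = [[1 if i == j else 0 for j in range(n_vertices)]
              for i in range(n_vertices)]
    for tail, head in arrows:
        matrix[tail - 1][head - 1] -= 1
    return matrix


def cartan_matrix(euler):
    """Matrix (c_ij) with c_ij = a_ij + a_ji."""
    n = len(euler)
    return [[euler[i][j] + euler[j][i] for j in range(n)] for i in range(n)]


# Assumed arrows (tail, head) of the quiver in Example 1.
ARROWS = [(1, 2), (2, 3), (4, 3)]
E = euler_matrix(4, ARROWS)
C = cartan_matrix(E)
```

Reversing any arrow changes `E` but leaves `C` unchanged, which illustrates the remark that the Cartan form does not depend on the orientation.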


The “Tits form” q of a quiver Q is defined by

$$q(\alpha) = \langle \alpha, \alpha \rangle = \tfrac{1}{2}(\alpha, \alpha)$$

It is known that the number of continuous parameters
describing representations of dimension $\alpha$ for
$\alpha \ne 0$ is greater than or equal to $1 - q(\alpha)$.

Let g be the Kac-Moody algebra associated to
the Cartan matrix of a quiver Q. By forgetting the
orientation of the arrows of Q, we obtain the
underlying (undirected) graph. This is the Dynkin
graph of g. Associated to g is a root system and a set
of simple roots $\{\alpha_i \mid i \in Q_0\}$ indexed by the vertices of
the Dynkin graph.


Theorem 1 (Gabriel’s theorem).


(i) A quiver is of finite type if and only if the
underlying graph is a union of Dynkin graphs
of type A, D, or E.

(ii) A quiver is of tame type if and only if the
underlying graph is a union of Dynkin graphs
of type A, D, or E and extended Dynkin graphs
of type $\hat{A}$, $\hat{D}$, or $\hat{E}$ (with at least one extended
Dynkin graph).

(iii) The isomorphism classes of indecomposable
representations of a quiver Q of finite type are
in one-to-one correspondence with the positive
roots of the root system associated to the
underlying graph of Q. The correspondence is
given by

$$V \mapsto \sum_{i \in Q_0} d_V(i)\,\alpha_i$$


The Dynkin graphs of type A, D, and E are as follows.

[Figure: Dynkin graphs of type $A_n$, $D_n$, $E_6$, $E_7$, and $E_8$]

Here the subscript indicates the number of vertices in the
graph.

The extended Dynkin graphs of type $\hat{A}$, $\hat{D}$, and $\hat{E}$
are as follows.

[Figure: extended Dynkin graphs of type $\hat{A}_n$, $\hat{D}_n$, $\hat{E}_6$, $\hat{E}_7$, and $\hat{E}_8$]

Here we have used an open dot to denote the vertex
that was added to the corresponding Dynkin graph
of type A, D, or E.
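For a quiver of finite type, part (iii) of Gabriel's theorem can be explored numerically. The sketch below relies on a standard fact that is not stated in the text above and is used here as an assumption: for a Dynkin quiver, the positive roots are exactly the nonzero vectors α with non-negative integer entries and q(α) = 1, where q is the Tits form. The quiver is again assumed to be the one of Example 1 (type $A_4$), and the search window `max_entry` and all function names are illustrative.

```python
from itertools import product


def tits_form(alpha, arrows):
    """q(alpha) = sum_i alpha_i^2 - sum over arrows of alpha_tail * alpha_head."""
    return (sum(a * a for a in alpha)
            - sum(alpha[t - 1] * alpha[h - 1] for t, h in arrows))


def positive_roots(n, arrows, max_entry=2):
    """Nonzero vectors with entries in 0..max_entry on which the Tits form is 1."""
    return [alpha for alpha in product(range(max_entry + 1), repeat=n)
            if any(alpha) and tits_form(alpha, arrows) == 1]


def interval_vectors(n):
    """Vectors equal to 1 on an interval j..l of vertices and 0 elsewhere."""
    return [tuple(1 if j <= i <= l else 0 for i in range(1, n + 1))
            for j in range(1, n + 1) for l in range(j, n + 1)]


# Assumed arrows (tail, head) of the quiver in Example 1, of type A_4.
ARROWS = [(1, 2), (2, 3), (4, 3)]
ROOTS = positive_roots(4, ARROWS)
```

Under these assumptions the search finds ten vectors, each supported on an interval of consecutive vertices, so this quiver would have ten indecomposable representations up to isomorphism.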


Theorem 2 (Kac’s theorem). Let Q be an arbitrary
quiver. The dimension vectors of indecomposable
representations of Q correspond to positive roots
of the root system associated to the underlying graph
of Q (and are thus independent of the orientation of
the arrows of Q). The correspondence is given by

$$d_V \mapsto \sum_{i \in Q_0} d_V(i)\,\alpha_i$$

Note that in Kac's theorem, it is not asserted that the
isomorphism classes are in one-to-one correspondence
with the roots as in the finite case considered in
Gabriel's theorem. It turns out that in the general case,
dimension vectors for which there is exactly one
isomorphism class correspond to real roots while
imaginary roots correspond to dimension vectors for
which there are families of representations.


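Example 7 Let Q be the quiver of type $A_n$,
oriented as follows.

[Figure: the linear quiver on the vertices 1, 2, ..., n with arrows $\rho_1, \dots, \rho_{n-1}$, where $\rho_i$ joins the vertices i and i+1]

It is known that the set of positive roots of the
simple Lie algebra of type $A_n$ is

$$\left\{ \sum_{i=j}^{l} \alpha_i \;\middle|\; 1 \le j \le l \le n \right\} \cup \{0\}$$

The zero root corresponds to the trivial representation.
The root $\sum_{i=j}^{l} \alpha_i$ for some $1 \le j \le l \le n$ corresponds
to the unique (up to isomorphism) representation V
with

$$V_i = \begin{cases} k & \text{if } j \le i \le l \\ 0 & \text{otherwise} \end{cases}$$

and

$$V_{\rho_i} = \begin{cases} 1 & \text{if } j \le i \le l-1 \\ 0 & \text{otherwise} \end{cases}$$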


Example 8 Let Q be the quiver of type $\hat{A}_n$, with all
arrows oriented in the same direction (for instance,
counter-clockwise). The positive root $\sum_{i=0}^{n} \alpha_i$
(where {0, 1, 2, ..., n} are the vertices of the quiver)
is imaginary. There is a one-parameter family of
isomorphism classes of indecomposable representations
where the maps assigned to each arrow are
nonzero. The parameter is the composition of the
maps around the loop.


If a quiver Q has no oriented cycles, then the only
simple kQ-modules are the modules $S^i$ for $i \in Q_0$
where

$$S^i_j = \begin{cases} k & \text{if } i = j \\ 0 & \text{if } i \ne j \end{cases}$$

and $S^i_\rho = 0$ for all $\rho \in Q_1$.
Ringel-Hall Algebras


Let k be the finite field $\mathbb{F}_q$ with q elements and let
Q be a quiver with no oriented cycles. Let P be
the set of all isomorphism classes of kQ-modules
which are finite as sets (since k is finite,
these are just the quiver representations we
considered above). Let A be a commutative
integral domain containing $\mathbb{Z}$ and elements $v, v^{-1}$
such that $v^2 = q$. The Ringel-Hall algebra
$H = H_{A,v}(kQ)$ is the free A-module with basis
$\{[V]\}$ indexed by the isomorphism classes of
representations of the quiver Q, with an A-bilinear
multiplication defined by

$$[V^1] \cdot [V^2] = v^{\langle \dim V^1, \dim V^2 \rangle} \sum_{V} g^{V}_{V^1, V^2}\,[V]$$

Here $\langle \dim V^1, \dim V^2 \rangle$ is the Euler form and $g^{V}_{V^1, V^2}$
is the number of submodules W of V such that
$V/W \cong V^1$ and $W \cong V^2$. H is an associative
$\mathbb{Z}_{\ge 0}^{Q_0}$-graded algebra, with identity element [0], the
isomorphism class of the trivial representation.
The grading $H = \bigoplus_\alpha H_\alpha$ is given by letting $H_\alpha$ be
the A-span of the set of isomorphism classes [V]
such that $\dim V = \alpha$.
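As a small illustration of this multiplication, worked out by hand rather than taken from the text, consider the quiver with two vertices and a single arrow 1 → 2 (the orientation is an assumption), with simple representations $S^1$, $S^2$ and third indecomposable W as in Example 4. The only representations of dimension vector (1, 1) are $S^1 \oplus S^2$ and W, so only these can appear in a product of $[S^1]$ and $[S^2]$.

```latex
% Euler form on the dimension vectors e_1 = dim S^1 and e_2 = dim S^2
% (one arrow from vertex 1 to vertex 2):
\langle e_1, e_2 \rangle = -1, \qquad \langle e_2, e_1 \rangle = 0 .

% W has exactly one proper nonzero subrepresentation; it is isomorphic to S^2
% and has quotient S^1. The direct sum S^1 \oplus S^2 has exactly one
% subrepresentation isomorphic to S^2 and exactly one isomorphic to S^1.
[S^1] \cdot [S^2] = v^{-1} \bigl( [S^1 \oplus S^2] + [W] \bigr), \qquad
[S^2] \cdot [S^1] = [S^1 \oplus S^2] .

% Consequently [W] lies in the subalgebra generated by the simples:
[W] = v\,[S^1] \cdot [S^2] - [S^2] \cdot [S^1] .
```

In this sketch the product is not commutative, and the class [W] is already a combination of products of the classes of the simple modules.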

Let $C = C_{A,v}(kQ)$ be the A-subalgebra of H
generated by the isomorphism classes $[S^i]$ of the
simple kQ-modules. C is called the “composition
algebra.” If the underlying graph of Q is of finite
type, then C = H.

Now let K be a set of finite fields k such that the
set $\{|k| : k \in K\}$ is infinite. Let A be an integral
domain containing $\mathbb{Q}$ and, for each $k \in K$, an
element $v_k$ such that $v_k^2 = |k|$. For each $k \in K$, we
have the corresponding composition algebra $C_k$,
generated by the elements $[{}^kS^i]$ (here we make the
field k explicit). Now let C be the subring of
$\prod_{k \in K} C_k$ generated by $\mathbb{Q}$ and the elements

$$t = (t_k)_{k \in K}, \quad t_k = v_k$$

$$t^{-1} = (t_k^{-1})_{k \in K}, \quad t_k^{-1} = (v_k)^{-1}$$

$$u^i = (u^i_k)_{k \in K}, \quad u^i_k = [{}^kS^i], \quad i \in Q_0$$

Now, t lies in the center of C and if p(t) = 0 for some
polynomial p, then p must be the zero polynomial
since the set of $v_k$ is infinite. Thus, we may think
of C as the A-algebra generated by the $u^i$, $i \in Q_0$,
with $A = \mathbb{Q}[t, t^{-1}]$ and t an indeterminate. Let
$C^* = \mathbb{Q}(t) \otimes_A C$. We call $C^*$ the “generic composition
algebra.”

Let g be the Kac-Moody algebra associated to the
Cartan matrix of the quiver Q and let U be the
quantum group associated by Drinfeld and Jimbo to g.
It has a triangular decomposition $U = U^- \otimes U^0 \otimes U^+$.
Specifically, $U^+$ is the $\mathbb{Q}(t)$-algebra with generators
$E_i$, $i \in Q_0$ and relations

$$\sum_{p=0}^{1-c_{ij}} (-1)^p \begin{bmatrix} 1-c_{ij} \\ p \end{bmatrix} E_i^{p} E_j E_i^{1-c_{ij}-p} = 0, \quad i \ne j$$

where $c_{ij}$ are the entries of the Cartan matrix and

$$\begin{bmatrix} m \\ p \end{bmatrix} = \frac{[m]!}{[p]!\,[m-p]!}, \qquad
[n] = \frac{t^n - t^{-n}}{t - t^{-1}}, \qquad
[n]! = [1][2]\cdots[n]$$


Theorem 3 There is a $\mathbb{Q}(t)$-algebra isomorphism
$C^* \to U^+$ sending $u^i \mapsto E_i$ for all $i \in Q_0$.


The proof of Theorem 3 is due to Ringel in the 
case that the underlying graph of O is of finite or 
affine type. The more general case presented here is 
due to Green. 

All of the Kac-Moody algebras considered so 
far have been simply-laced. That is, their Cartan 
matrices are symmetric. There is a way to deal 
with non-simply-laced Kac-Moody algebras using 
species. We will not treat this subject in this 
article. 


Quiver Varieties 


One can use varieties associated to quivers to yield a 
geometric realization of the upper half of the 
universal enveloping algebra of a Kac-Moody 
algebra g and its irreducible  highest-weight 
representations. 


Lusztig's Quiver Varieties 


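We first introduce the quiver varieties, first
defined by Lusztig, which yield a geometric
realization of the upper half $\mathcal{U}^+$ of the universal
enveloping algebra of a simply laced Kac-Moody
algebra g. Let $Q = (Q_0, Q_1)$ be the quiver whose
vertices $Q_0$ are the vertices of the Dynkin
diagram of g and whose set of arrows $Q_1$ consists
of all the edges of the Dynkin diagram with both
orientations. By definition, $\mathcal{U}^+$ is the $\mathbb{Q}$-algebra
defined by generators $e_i$, $i \in Q_0$, subject to the
Serre relations

$$\sum_{p=0}^{1-c_{ij}} (-1)^p \binom{1-c_{ij}}{p} e_i^{p} e_j e_i^{1-c_{ij}-p} = 0$$

for all $i \ne j$ in $Q_0$, where $c_{ij}$ are the entries of the Cartan
matrix associated to Q. For any $\mathbf{v} = \sum_{i \in Q_0} \mathbf{v}_i i$, $\mathbf{v}_i \in \mathbb{N}$,
let $\mathcal{U}^+_{\mathbf{v}}$ be the subspace of $\mathcal{U}^+$ spanned by the
monomials $e_{i_1} e_{i_2} \cdots e_{i_n}$ for various sequences
$i_1, i_2, \dots, i_n$ in which i appears $\mathbf{v}_i$ times for each
$i \in Q_0$. Thus, $\mathcal{U}^+ = \bigoplus_{\mathbf{v}} \mathcal{U}^+_{\mathbf{v}}$. Let $\mathcal{U}^+_{\mathbb{Z}}$ be the subring of
$\mathcal{U}^+$ generated by the elements $e_i^p/p!$ for $i \in Q_0$, $p \in \mathbb{N}$.
Then $\mathcal{U}^+_{\mathbb{Z}} = \bigoplus_{\mathbf{v}} \mathcal{U}^+_{\mathbb{Z},\mathbf{v}}$, where $\mathcal{U}^+_{\mathbb{Z},\mathbf{v}} = \mathcal{U}^+_{\mathbf{v}} \cap \mathcal{U}^+_{\mathbb{Z}}$.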

We define the involution $\bar{\ } : Q_1 \to Q_1$ to be the
function which takes $\rho \in Q_1$ to the element $\bar{\rho}$ of $Q_1$
consisting of the same edge with opposite orientation.
An orientation of our graph/quiver is a choice
of a subset $\Omega \subset Q_1$ such that $\Omega \cup \bar{\Omega} = Q_1$ and
$\Omega \cap \bar{\Omega} = \emptyset$.

Let $\mathcal{V}$ be the category of finite-dimensional
$Q_0$-graded vector spaces $V = \bigoplus_{i \in Q_0} V_i$ over $\mathbb{C}$
with morphisms being linear maps respecting
the grading. Then $V \in \mathcal{V}$ shall denote that V is an
object of $\mathcal{V}$. The dimension of $V \in \mathcal{V}$ is given by
$\mathbf{v} = \dim V = (\dim V_0, \dots, \dim V_n)$.

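Given $V \in \mathcal{V}$, let $E_V$ be the space of representations
of Q with underlying vector space V. That
is,

$$E_V = \bigoplus_{\rho \in Q_1} \mathrm{Hom}(V_{t(\rho)}, V_{h(\rho)})$$

For any subset $Q_1'$ of $Q_1$, let $E_{V,Q_1'}$ be the subspace
of $E_V$ consisting of all vectors $x = (x_\rho)$ such that
$x_\rho = 0$ whenever $\rho \notin Q_1'$. The algebraic group
$G_V = \prod_i \mathrm{Aut}(V_i)$ acts on $E_V$ and $E_{V,Q_1'}$ by

$$(g, x) = ((g_i), (x_\rho)) \mapsto gx = (x'_\rho) = (g_{h(\rho)}\, x_\rho\, g_{t(\rho)}^{-1})$$

Define the function $\epsilon : Q_1 \to \{-1, 1\}$ by $\epsilon(\rho) = 1$ for
all $\rho \in \Omega$ and $\epsilon(\rho) = -1$ for all $\rho \in \bar{\Omega}$. Let $\langle \cdot, \cdot \rangle$ be the
nondegenerate, $G_V$-invariant, symplectic form on
$E_V$ with values in $\mathbb{C}$ defined by

$$\langle x, y \rangle = \sum_{\rho \in Q_1} \epsilon(\rho)\, \mathrm{tr}(x_\rho\, y_{\bar{\rho}})$$

Note that $E_V$ can be considered as the cotangent
space of $E_{V,\Omega}$ under this form.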

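The moment map associated to the $G_V$-action on
the symplectic vector space $E_V$ is the map
$\psi : E_V \to \mathfrak{gl}_V = \prod_i \mathrm{End}\, V_i$, the Lie algebra of $GL_V$,
with i-component $\psi_i : E_V \to \mathrm{End}\, V_i$ given by

$$\psi_i(x) = \sum_{\rho \in Q_1,\ h(\rho) = i} \epsilon(\rho)\, x_\rho\, x_{\bar{\rho}}$$


Definition 1 An element $x \in E_V$ is said to be
nilpotent if there exists an $N \ge 1$ such that for any
sequence $\rho_1, \rho_2, \dots, \rho_N$ in $Q_1$ satisfying $t(\rho_1) = h(\rho_2)$,
$t(\rho_2) = h(\rho_3), \dots, t(\rho_{N-1}) = h(\rho_N)$, the composition
$x_{\rho_1} x_{\rho_2} \cdots x_{\rho_N} : V_{t(\rho_N)} \to V_{h(\rho_1)}$ is zero.


Definition 2 Let $\Lambda_V$ be the set of all nilpotent
elements $x \in E_V$ such that $\psi_i(x) = 0$ for all $i \in Q_0$.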


A subset of an algebraic variety is said to be
“constructible” if it is obtained from subvarieties
by a finite number of the usual set-theoretic
operations. A function $f : A \to \mathbb{Q}$ on an algebraic
variety A is said to be a constructible function if $f^{-1}(a)$
is a constructible set for all $a \in \mathbb{Q}$ and is empty for all
but finitely many a. Let $M(\Lambda_V)$ denote the $\mathbb{Q}$-vector
space of all constructible functions on $\Lambda_V$. Let $\tilde{M}(\Lambda_V)$
denote the $\mathbb{Q}$-subspace of $M(\Lambda_V)$ consisting of those
functions that are constant on any $G_V$-orbit in $\Lambda_V$.

Let $V, V', V'' \in \mathcal{V}$ be such that $\dim V = \dim V' + \dim V''$.
Now, suppose that S is an I-graded (that is, $Q_0$-graded) subspace
of V. For $x \in \Lambda_V$ we say that S is x-stable if $x(S) \subseteq S$.
Let $\Lambda_{V;V',V''}$ be the variety consisting of all pairs (x, S)
where $x \in \Lambda_V$ and S is an I-graded x-stable subspace of
V such that $\dim S = \dim V''$. Now, if we fix some
isomorphisms $V/S \cong V'$, $S \cong V''$, then x induces elements
$x' \in \Lambda_{V'}$ and $x'' \in \Lambda_{V''}$. We then have the maps

$$\Lambda_{V'} \times \Lambda_{V''} \xleftarrow{\ p_1\ } \Lambda_{V;V',V''} \xrightarrow{\ p_2\ } \Lambda_V$$

where $p_1(x, S) = (x', x'')$, $p_2(x, S) = x$.

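For a holomorphic map $\pi$ between complex
varieties A and B, let $\pi_!$ denote the map between
the spaces of constructible functions on A and B
given by

$$(\pi_! f)(y) = \sum_{a \in \mathbb{Q}} a\, \chi\bigl(\pi^{-1}(y) \cap f^{-1}(a)\bigr)$$

where $\chi$ denotes the Euler characteristic.
Let $\pi^*$ be the pullback map from functions on B to
functions on A acting as $\pi^* f(y) = f(\pi(y))$. We then
define a map

$$\tilde{M}(\Lambda_{V'}) \times \tilde{M}(\Lambda_{V''}) \to \tilde{M}(\Lambda_V) \qquad [1]$$

by $(f', f'') \mapsto f' * f''$ where

$$f' * f'' = (p_2)_!\, p_1^* (f' \times f'')$$

Here $f' \times f'' \in \tilde{M}(\Lambda_{V'} \times \Lambda_{V''})$ is defined by
$(f' \times f'')(x', x'') = f'(x') f''(x'')$. The map [1] is bilinear
and defines an associative $\mathbb{Q}$-algebra structure on
$\bigoplus_{\mathbf{v}} \tilde{M}(\Lambda_{V^{\mathbf{v}}})$ where $V^{\mathbf{v}}$ is the object of $\mathcal{V}$ defined by
$V^{\mathbf{v}}_i = \mathbb{C}^{\mathbf{v}_i}$.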

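There is a unique algebra homomorphism
$\kappa : \mathcal{U}^+ \to \bigoplus_{\mathbf{v}} \tilde{M}(\Lambda_{V^{\mathbf{v}}})$ such that $\kappa(e_i)$ is the function
on the point $\Lambda_{V^i}$ with value 1. Then $\kappa$ restricts to a
map $\kappa_{\mathbf{v}} : \mathcal{U}^+_{\mathbf{v}} \to \tilde{M}(\Lambda_{V^{\mathbf{v}}})$. It can be shown that
$\kappa_{pi}(e_i^p/p!)$ is the function 1 on the point $\Lambda_{V^{pi}}$ for
$i \in Q_0$, $p \in \mathbb{Z}_{\ge 0}$.

Let $\tilde{M}_{\mathbb{Z}}(\Lambda_V)$ be the set of all functions in $\tilde{M}(\Lambda_V)$ that
take on only integer values. One can show that if
$f' \in \tilde{M}_{\mathbb{Z}}(\Lambda_{V'})$ and $f'' \in \tilde{M}_{\mathbb{Z}}(\Lambda_{V''})$, then $f' * f'' \in \tilde{M}_{\mathbb{Z}}(\Lambda_V)$
in the setup of [1]. Thus $\kappa_{\mathbf{v}}(\mathcal{U}^+_{\mathbb{Z},\mathbf{v}}) \subseteq \tilde{M}_{\mathbb{Z}}(\Lambda_{V^{\mathbf{v}}})$.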

Let $\mathrm{Irr}\,\Lambda_V$ denote the set of irreducible components
of $\Lambda_V$. The following proposition was conjectured
by Lusztig and proved by him in the affine
(and finite) case. The general case was proved by
Kashiwara and Saito.


Proposition 2 For any $\mathbf{v} \in (\mathbb{Z}_{\ge 0})^{Q_0}$, we have
$\dim \mathcal{U}^+_{\mathbf{v}} = \#\,\mathrm{Irr}\,\Lambda_{V^{\mathbf{v}}}$.


We then have the following important result due 
to Lusztig. 


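Theorem 4 Let $\mathbf{v} \in (\mathbb{Z}_{\ge 0})^{Q_0}$. Then,


(i) For any $Z \in \mathrm{Irr}\,\Lambda_{V^{\mathbf{v}}}$, there exists a unique
$f_Z \in \kappa_{\mathbf{v}}(\mathcal{U}^+_{\mathbb{Z},\mathbf{v}})$ such that $f_Z$ is equal to 1 on an
open dense subset of Z and equal to zero on an
open dense subset of $Z' \in \mathrm{Irr}\,\Lambda_{V^{\mathbf{v}}}$ for all $Z' \ne Z$.

(ii) $\{f_Z \mid Z \in \mathrm{Irr}\,\Lambda_{V^{\mathbf{v}}}\}$ is a $\mathbb{Q}$-basis of $\kappa_{\mathbf{v}}(\mathcal{U}^+_{\mathbf{v}})$.

(iii) $\kappa_{\mathbf{v}} : \mathcal{U}^+_{\mathbf{v}} \to \kappa_{\mathbf{v}}(\mathcal{U}^+_{\mathbf{v}})$ is an isomorphism.

(iv) Define $[Z] \in \mathcal{U}^+_{\mathbf{v}}$ by $\kappa_{\mathbf{v}}([Z]) = f_Z$. Then
$B_{\mathbf{v}} = \{[Z] \mid Z \in \mathrm{Irr}\,\Lambda_{V^{\mathbf{v}}}\}$ is a $\mathbb{Q}$-basis of $\mathcal{U}^+_{\mathbf{v}}$.

(v) $\kappa_{\mathbf{v}}(\mathcal{U}^+_{\mathbb{Z},\mathbf{v}}) = \kappa_{\mathbf{v}}(\mathcal{U}^+_{\mathbf{v}}) \cap \tilde{M}_{\mathbb{Z}}(\Lambda_{V^{\mathbf{v}}})$.

(vi) $B_{\mathbf{v}}$ is a $\mathbb{Z}$-basis of $\mathcal{U}^+_{\mathbb{Z},\mathbf{v}}$.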


From this theorem, we see that $B = \bigsqcup_{\mathbf{v}} B_{\mathbf{v}}$ is a
$\mathbb{Q}$-basis of $\mathcal{U}^+$, which is called the “semicanonical
basis.” This basis has many remarkable properties.
One of these properties is as follows. Via the algebra
involution of the entire universal enveloping algebra
$\mathcal{U}$ of g given on the Chevalley generators by
$e_i \mapsto f_i$, $f_i \mapsto e_i$ and $h \mapsto -h$ for h in the Cartan
subalgebra of g, one obtains from the results of
this section a semicanonical basis of $\mathcal{U}^-$, the lower
half of the universal enveloping algebra of g. For any
irreducible highest-weight integrable representation
V of $\mathcal{U}$ (or, equivalently, g), let $v \in V$ be a nonzero
highest-weight vector. Then the set

$$\{bv \mid b \in B,\ bv \ne 0\}$$

is a $\mathbb{Q}$-basis of V, called the semicanonical basis of
V. Thus, the semicanonical basis of $\mathcal{U}^-$ is simultaneously
compatible with all irreducible highest-weight
integrable modules. There is also a way to
define the semicanonical basis of a representation
directly in a geometric way. This is the subject of the
next subsection.

One can also obtain a geometric realization of the
upper part $U^+$ of the quantum group in a similar
manner using perverse sheaves instead of constructible
functions. This construction yields the canonical
basis of the associated quantum group (a
q-deformation of the universal enveloping algebra)
which also has many remarkable properties and is
closely related to the theory of crystal bases.
Nakajima’s Quiver Varieties


We introduce here a description of the quiver varieties 
first presented by Nakajima. They yield a geometric 
realization of the irreducible highest-weight represen- 
tations of simply-laced Kac-Moody algebras. The 
construction was motivated by the work of Kronhei- 
mer and Nakajima on solutions to the anti-self-dual 
Yang-Mills equations on ALE gravitational instantons 
(see Instantons: Topological Aspects). 


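Definition 3 For $\mathbf{v}, \mathbf{w} \in \mathbb{Z}^I_{\ge 0}$, choose I-graded vector
spaces V and W of graded dimensions $\mathbf{v}$ and $\mathbf{w}$,
respectively. Then define

$$\Lambda \equiv \Lambda(\mathbf{v}, \mathbf{w}) = \Lambda_V \times \bigoplus_{i \in I} \mathrm{Hom}(V_i, W_i)$$

Definition 4 Let $\Lambda^{st} = \Lambda(\mathbf{v}, \mathbf{w})^{st}$ be the set of all
$(x, t) \in \Lambda(\mathbf{v}, \mathbf{w})$ satisfying the following condition: if
$S = (S_i)$ with $S_i \subseteq V_i$ is x-stable and $t_i(S_i) = 0$ for
$i \in I$, then $S_i = 0$ for $i \in I$.


The group $G_V$ acts on $\Lambda(\mathbf{v}, \mathbf{w})$ via

$$(g, (x, t)) \mapsto \bigl((g_{h(\rho)}\, x_\rho\, g_{t(\rho)}^{-1}),\ (t_i\, g_i^{-1})\bigr)$$

and the stabilizer of any point of $\Lambda(\mathbf{v}, \mathbf{w})^{st}$ in $G_V$ is
trivial. We then make the following definition.


Definition 5 Let $\mathcal{L} \equiv \mathcal{L}(\mathbf{v}, \mathbf{w}) = \Lambda(\mathbf{v}, \mathbf{w})^{st} / G_V$.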


We should note that while the above definition
and other constructions in this article are algebraic,
there are also more geometric ways of looking at
quiver varieties. In particular, the space

$$\mathbf{M}(\mathbf{v}, \mathbf{w}) = \Bigl( \bigoplus_{\rho \in Q_1} \mathrm{Hom}(V_{t(\rho)}, V_{h(\rho)}) \Bigr)
\oplus \Bigl( \bigoplus_{i \in I} \mathrm{Hom}(W_i, V_i) \oplus \mathrm{Hom}(V_i, W_i) \Bigr)$$

has a natural hyper-Kähler metric and one can
consider a hyper-Kähler quotient by the group
$\prod_i U(V_i)$. The variety $\mathcal{L}(\mathbf{v}, \mathbf{w})$ is a Lagrangian
subvariety of (and is homotopic to) this hyper-Kähler
quotient. In the case $g = \mathfrak{sl}_n$, the varieties
involved are closely related to flag varieties.

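Let $\mathbf{w}, \mathbf{v}, \mathbf{v}', \mathbf{v}'' \in \mathbb{Z}^I_{\ge 0}$ be such that $\mathbf{v} = \mathbf{v}' + \mathbf{v}''$.
Consider the maps

$$\Lambda(\mathbf{v}'', 0) \times \Lambda(\mathbf{v}', \mathbf{w}) \xleftarrow{\ p_1\ } \tilde{F}(\mathbf{v}, \mathbf{w}; \mathbf{v}'')
\xrightarrow{\ p_2\ } F(\mathbf{v}, \mathbf{w}; \mathbf{v}'') \xrightarrow{\ p_3\ } \Lambda(\mathbf{v}, \mathbf{w}) \qquad [2]$$

where the notation is as follows. A point of
$F(\mathbf{v}, \mathbf{w}; \mathbf{v}'')$ is a point $(x, t) \in \Lambda(\mathbf{v}, \mathbf{w})$ together with
an I-graded, x-stable subspace S of V such that
$\dim S = \mathbf{v}' = \mathbf{v} - \mathbf{v}''$. A point of $\tilde{F}(\mathbf{v}, \mathbf{w}; \mathbf{v}'')$ is a point
$(x, t, S)$ of $F(\mathbf{v}, \mathbf{w}; \mathbf{v}'')$ together with a collection of
isomorphisms $R'_i : V'_i \cong S_i$ and $R''_i : V''_i \cong V_i/S_i$ for
each $i \in I$. Then we define $p_2(x, t, S, R', R'') = (x, t, S)$,
$p_3(x, t, S) = (x, t)$ and $p_1(x, t, S, R', R'') = (x'', x', t')$
where $x''$, $x'$, $t'$ are determined by

$$R'_{h(\rho)}\, x'_\rho = x_\rho\, R'_{t(\rho)} : V'_{t(\rho)} \to S_{h(\rho)}$$

$$t'_i = t_i\, R'_i : V'_i \to W_i$$

$$R''_{h(\rho)}\, x''_\rho = x_\rho\, R''_{t(\rho)} : V''_{t(\rho)} \to V_{h(\rho)} / S_{h(\rho)}$$


It follows that $x'$ and $x''$ are nilpotent.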


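Lemma 1 One has

$$(p_3 \circ p_2)^{-1}(\Lambda(\mathbf{v}, \mathbf{w})^{st}) \subseteq p_1^{-1}(\Lambda(\mathbf{v}'', 0) \times \Lambda(\mathbf{v}', \mathbf{w})^{st})$$


Thus, we can restrict [2] to $\Lambda^{st}$, forget the $\Lambda(\mathbf{v}'', 0)$-factor
and consider the quotient by $G_V$ and $G_{V'}$.
This yields the diagram

$$\mathcal{L}(\mathbf{v}', \mathbf{w}) \xleftarrow{\ \pi_1\ } \mathcal{F}(\mathbf{v}, \mathbf{w}; \mathbf{v} - \mathbf{v}') \xrightarrow{\ \pi_2\ } \mathcal{L}(\mathbf{v}, \mathbf{w}) \qquad [3]$$

where

$$\mathcal{F}(\mathbf{v}, \mathbf{w}; \mathbf{v} - \mathbf{v}') \stackrel{\text{def}}{=} \{(x, t, S) \in F(\mathbf{v}, \mathbf{w}; \mathbf{v} - \mathbf{v}') \mid (x, t) \in \Lambda(\mathbf{v}, \mathbf{w})^{st}\} / G_V$$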

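Let $M(\mathcal{L}(\mathbf{v}, \mathbf{w}))$ be the vector space of all
constructible functions on $\mathcal{L}(\mathbf{v}, \mathbf{w})$. Then define
maps

$$h_i : M(\mathcal{L}(\mathbf{v}, \mathbf{w})) \to M(\mathcal{L}(\mathbf{v}, \mathbf{w}))$$
$$e_i : M(\mathcal{L}(\mathbf{v}, \mathbf{w})) \to M(\mathcal{L}(\mathbf{v} - \mathbf{e}^i, \mathbf{w}))$$
$$f_i : M(\mathcal{L}(\mathbf{v} - \mathbf{e}^i, \mathbf{w})) \to M(\mathcal{L}(\mathbf{v}, \mathbf{w}))$$

by

$$h_i f = u_i f, \qquad e_i f = (\pi_1)_! (\pi_2^* f), \qquad f_i g = (\pi_2)_! (\pi_1^* g)$$

Here

$$\mathbf{u} = {}^t(u_0, \dots, u_n) = \mathbf{w} - C\mathbf{v}$$

where C is the Cartan matrix of g and we are using
diagram [3] with $\mathbf{v}' = \mathbf{v} - \mathbf{e}^i$ where $\mathbf{e}^i$ is the vector
whose components are given by $\mathbf{e}^i_j = \delta_{ij}$.

Now let $\varphi$ be the constant function on $\mathcal{L}(0, \mathbf{w})$
with value 1. Let $L(\mathbf{w})$ be the vector space of
functions generated by acting on $\varphi$ with all possible
combinations of the operators $f_i$. Then let
$L(\mathbf{v}, \mathbf{w}) = M(\mathcal{L}(\mathbf{v}, \mathbf{w})) \cap L(\mathbf{w})$.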


Proposition 3 The operators $e_i$, $f_i$, $h_i$ on $L(\mathbf{w})$ provide
it with the structure of the irreducible highest-weight
integrable representation of g with highest weight
$\sum_{i \in Q_0} \mathbf{w}_i \omega_i$. Each summand of the decomposition
$L(\mathbf{w}) = \bigoplus_{\mathbf{v}} L(\mathbf{v}, \mathbf{w})$ is a weight space with weight
$\sum_{i \in Q_0} (\mathbf{w}_i \omega_i - \mathbf{v}_i \alpha_i)$. Here the $\omega_i$ and $\alpha_i$ are the
fundamental weights and simple roots of g, respectively.


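Let $Z \in \mathrm{Irr}\,\mathcal{L}(\mathbf{v}, \mathbf{w})$ and define a linear map
$T_Z : L(\mathbf{v}, \mathbf{w}) \to \mathbb{C}$ that associates to a constructible
function $f \in L(\mathbf{v}, \mathbf{w})$ the (constant) value of f on a
suitable open dense subset of Z. The fact that
$L(\mathbf{v}, \mathbf{w})$ is finite dimensional allows us to take such
an open set on which any $f \in L(\mathbf{v}, \mathbf{w})$ is constant. So
we have a linear map

$$\Phi : L(\mathbf{v}, \mathbf{w}) \to \mathbb{C}^{\mathrm{Irr}\,\mathcal{L}(\mathbf{v}, \mathbf{w})}$$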


Then we have the following proposition. 


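Proposition 4 The map $\Phi$ is an isomorphism; for
any $Z \in \mathrm{Irr}\,\mathcal{L}(\mathbf{v}, \mathbf{w})$, there is a unique function
$g_Z \in L(\mathbf{v}, \mathbf{w})$ such that for some open dense subset
O of Z we have $g_Z|_O = 1$ and for some closed
$G_V$-invariant subset $K \subset \mathcal{L}(\mathbf{v}, \mathbf{w})$ of dimension <
$\dim \mathcal{L}(\mathbf{v}, \mathbf{w})$ we have $g_Z = 0$ outside $Z \cup K$. The
functions $g_Z$ for $Z \in \mathrm{Irr}\,\mathcal{L}(\mathbf{v}, \mathbf{w})$ form a basis of
$L(\mathbf{v}, \mathbf{w})$.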


Additional Topics 


To conclude, we have given here a brief overview 
of some topics related to  finite-dimensional 
algebras and quivers. There is much more to be 
found in the literature. For basics on associative 
algebras and their representations, the reader 
may consult introductory texts on abstract alge- 
bra such as Lang (2002). For further results (and 
their proofs) on Ringel-Hall algebras see the 
papers of Ringel (1990a, b, 1993, 1995, 1996) 
and of Green (1995) and the references cited 
therein. The reader interested in species, which 
extend many of these results to non-simply-laced 
Lie algebras, should consult Dlab and Ringel 
(1976). 

The book by Lusztig (1993) covers the quiver 
varieties of Lusztig and canonical bases. Canonical 
bases are closely related to crystal bases and crystal 
graphs (see Hong and Kang (2002) for an overview 
of these topics). In fact, the set of irreducible 
components of the quiver varieties of Lusztig and 
Nakajima can be endowed with the structure of a 
crystal graph in a purely geometric way (see 
Kashiwara and Saito (1997) and Saito (2002)). 
Many results on Nakajima’s quiver varieties can be 
found in the original papers (Nakajima 1994, 
1998). The overview article (Nakajima 1996) is 
also useful. 


Finite-Dimensional Algebras and Quivers 321 


Quiver varieties can also be used to give geometric
realizations of tensor products of representations
(see Malkin (2002, 2003), Nakajima (2001), and
Savage (2003)) and finite-dimensional representations
of quantum affine Lie algebras (see Nakajima
(2001)). These are just a select few of the many
applications of quiver varieties. Much more can be
found in the literature.
Introduction


It is a commonplace situation that symmetric laws 
of Nature give rise to physical states which are not 
symmetric. States related by symmetry operations 
are equivalent, but still nature selects one of them. 

As an example, consider a ferromagnetic system 
of interacting spins with no external magnetic field. 
The “up” and “down” states are equivalent, but one 
of the two is chosen: the interaction makes states 
with agreeing spin orientation (and therefore macro- 
scopic magnetization) energetically preferred, and 
fluctuations will decide which state is actually 
chosen by a given sample. 

Finite group symmetry is also commonplace in 
physics, in particular through crystallographic 
groups occurring in condensed matter physics — but 
also through the inversions (C, P, T and their combi- 
nations) occurring in high-energy physics and field 
theory. 

The breaking of finite group symmetry has thus 
been thoroughly studied, and general approaches 
exist to investigate it in mathematically precise 
terms with physical counterparts. In particular, 
a widely applicable approach is provided by the 
Landau theory of phase transitions — whose 
mathematical counterpart resides in the realm 
of equivariant singularity and bifurcation theory. 
In Landau theory, the state of a system is
described by a finite-dimensional variable (the
“order parameter”), and physical states correspond
to minima of a potential, invariant under a
group.

In this article we describe the basics of 
symmetry breaking analysis for systems described 
by a symmetric polynomial; in particular, we 
discuss generic symmetry breakings, that is, those
determined by the symmetry properties themselves
and independent of the details of the
polynomial describing a concrete system. We
also discuss how the plethora of invariant polynomials
can be to some extent reduced by means
of changes of coordinates, that is, how one can
restrict attention to certain types of polynomials
with no loss of generality. Finally, we give
some indications on extensions of this theory, that
is, on how one deals with symmetry breakings for
more general groups and/or more general physical
systems.


Basic Notions 
Finite Groups 


A finite group $(G, \circ)$ is a finite set G of elements
$\{g_0, \ldots, g_N\}$ equipped with a composition law $\circ$, and
such that the following conditions hold:


1. for all $g, h \in G$ the composition $g \circ h$ belongs to
G, that is, $g \circ h \in G$;

2. the composition is associative, that is,
$(g \circ h) \circ k = g \circ (h \circ k)$ for all $g, h, k \in G$;

3. there is an element in G, which we will denote
as e, which is the identity for the action of $\circ$ on
G, that is, $e \circ g = g = g \circ e$ for all $g \in G$; and

4. for each $g \in G$ there is an element $g^{-1}$ which is
the inverse of g, that is, $g^{-1} \circ g = e = g \circ g^{-1}$.


In the following, we omit the symbol $\circ$, that is, we
write $gh$ to mean $g \circ h$. Similarly, we usually write
simply G for the group, rather than $(G, \circ)$.

Given a subset $H \subseteq G$, this is a subgroup of $(G, \circ)$
if $(H, \circ)$ satisfies the group axioms (1)-(4) above.
Note that this implies that $e \in H$ whenever H is a
subgroup, and $\{e\}$ is a subgroup. Subgroups not
coinciding with the whole G and with $\{e\}$ are said to
be “proper.”


Given two elements g, h we say that $g h g^{-1}$ is the
conjugate of h by g. The conjugate of a subgroup
$H \subseteq G$ by $g \in G$ is the subgroup of elements
conjugated to elements of H, $gHg^{-1} = \{ghg^{-1},
h \in H\}$.
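
As an illustration (not part of the original article), the axioms and the conjugation of a subgroup can be checked mechanically on a small example. The sketch below assumes the symmetric group on three letters, with permutations stored as tuples; the helper names `compose`, `inverse`, `is_group`, and `conjugate_subgroup` are hypothetical.

```python
from itertools import permutations

def compose(g, h):
    """Return g o h for permutations stored as tuples: (g o h)(i) = g[h[i]]."""
    return tuple(g[i] for i in h)

def inverse(g):
    """Return the inverse permutation of g."""
    inv = [0] * len(g)
    for i, gi in enumerate(g):
        inv[gi] = i
    return tuple(inv)

def is_group(elements):
    """Check closure, identity and inverses for a set of permutations.

    Associativity is automatic for composition of maps.
    """
    elements = set(elements)
    if not elements:
        return False
    n = len(next(iter(elements)))
    e = tuple(range(n))
    closed = all(compose(g, h) in elements for g in elements for h in elements)
    has_inverses = all(inverse(g) in elements for g in elements)
    return closed and e in elements and has_inverses

def conjugate_subgroup(g, H):
    """Return the set g H g^{-1}."""
    g_inv = inverse(g)
    return {compose(compose(g, h), g_inv) for h in H}

S3 = set(permutations(range(3)))
H = {(0, 1, 2), (1, 0, 2)}   # subgroup generated by the swap of 0 and 1
g = (0, 2, 1)                # swap of 1 and 2
K = conjugate_subgroup(g, H)  # again a subgroup, generated by the swap of 0 and 2
```

Conjugating the swap of 0 and 1 by the swap of 1 and 2 gives the swap of 0 and 2, so K is a different subgroup of the same order.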


Group Action 


In physics, one is usually interested in a realization
of an abstract group as a group of transformations
in some set X; in physical applications, this is
usually a space (possibly a function space) or a manifold,
and we refer to elements of X as “points.” That is,
there is a map $\rho : G \to \mathrm{End}(X)$ from G to the group
of endomorphisms of X, such as to preserve the
composition law:


$\rho(g) \cdot \rho(h) = \rho(g \circ h) \quad \forall g, h \in G$


In this case, we say that we have a “representation”
of the abstract group G acting in the “carrier” space
or manifold X; we also say that X is a G-space or
G-manifold. We often denote by the same letter the
abstract element and its representation, that is, write
simply g for $\rho(g)$ and G for $\rho(G)$. (In many
physically relevant cases, but not necessarily, X has
a linear structure and we consider linear endomorphisms.
In this case, we sometimes write $T_g$ for
the linear operator representing g.)

If $x \in X$ is a point in X, the G-orbit G(x) is the set
of points to which x is mapped under G, that is,


$G(x) = \{y \in X : y = gx, \ g \in G\} \subseteq X$


Belonging to the same orbit is obviously an
equivalence relation, and partitions X into equivalence
classes. The “orbit space” for the G action on
X, also denoted as $\Omega = X/G$, is the set of these
equivalence classes. It corresponds, in physical
terms, to considering X modulo identification of
elements related by the group action.

For any point $x \in X$, the “isotropy (sub)group”
$G_x$ is the set of elements leaving x fixed,


$G_x = \{g \in G : gx = x\} \subseteq G$


Points on the same G-orbit have conjugated isotropy
subgroups: indeed, $y = gx$ implies immediately that
$G_y = g G_x g^{-1}$.
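
A minimal sketch of these notions, assuming as a hypothetical example the symmetry group of the square acting on points of the plane by 2x2 integer matrices (the generators `R`, `S` and all helper names are illustrative choices, not taken from the article). It computes orbits and isotropy subgroups and compares the isotropy subgroup of y = gx with the conjugate of that of x.

```python
R = ((0, -1), (1, 0))   # rotation by 90 degrees
S = ((1, 0), (0, -1))   # reflection y -> -y
IDENTITY = ((1, 0), (0, 1))

def mat_mul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def act(g, x):
    """Apply the matrix g to the point x."""
    return (g[0][0] * x[0] + g[0][1] * x[1], g[1][0] * x[0] + g[1][1] * x[1])

def generate(generators):
    """Close a set of generators under multiplication (finite group assumed)."""
    group, frontier = {IDENTITY}, [IDENTITY]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = mat_mul(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def inverse(g, group):
    """Find the inverse of g by search inside the group."""
    return next(h for h in group if mat_mul(h, g) == IDENTITY)

def orbit(x, group):
    return {act(g, x) for g in group}

def isotropy(x, group):
    return {g for g in group if act(g, x) == x}

def conjugate(g, H, group):
    g_inv = inverse(g, group)
    return {mat_mul(mat_mul(g, h), g_inv) for h in H}

G = generate([R, S])
x = (1, 0)
y = act(R, x)            # a second point on the same orbit
```

In this example the product of the orbit size and the isotropy order equals the group order for every point, so points with larger isotropy lie on smaller orbits.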

When a topology is defined on X, the question
arises whether the G-action preserves it; if this is the case,
we say that the G-action is “regular.” In the case of
a compact Lie group (and a fortiori for a finite
group) we are guaranteed the action is regular.
(A physically relevant example of nonregular action 
is provided by the irrational flow on a torus. In this 
case G = R, realized as the time t irrational flow on 
the torus X= T*.) 
Spontaneous Symmetry Breaking


Let us now consider the case of physical systems
whose state is described by a point x in the G-space
or G-manifold X, with G a group acting by smooth
mappings $g : X \to X$. In physical problems, G quite
often acts by linear and orthogonal transformations.
(If this is not the case, the Palais-Mostow theorem
guarantees that, for suitable groups (including in
particular the finite ones), we can reduce to this case
upon embedding X into a suitably larger carrier
space Y.)

Usually, G represents physical equivalence of
states, and G-orbits are collections of physically
equivalent states. A point which is G-invariant, that
is, such that $G_x = G$, is called “symmetric” for short.

Let $\Phi$ be a scalar function (potential) defined on
X, $\Phi : X \to \mathbf{R}$, possibly depending on some parameter
$\mu$, such that the physical state corresponds to
critical points, usually the (local) minima, of $\Phi$.

A concrete example is provided by the case
where $\Phi$ is the Gibbs free energy; more generally,
this is the framework met in the Landau theory of
phase transitions (Landau 1937, Landau and
Lifshitz 1958).

We are interested in the case where $\Phi$ is invariant
under the group action, or briefly G-invariant, that
is, where


$\Phi(gx) = \Phi(x) \quad \forall x \in X, \ \forall g \in G$ [1]

A critical point x such that $G_x = G$ is a symmetrical
critical point. If $G_x$ is strictly smaller than G,
then x is a symmetry-breaking critical point.

If a physical system corresponds to a nonsymmetric
critical point, we have a spontaneous
symmetry breaking: although the physical laws (the
potential function $\Phi$) are symmetric, the physical
state (the critical point for $\Phi$) breaks the symmetry
and chooses one of the G-equivalent critical points.

It follows from [1] that the gradient of $\Phi$ is
covariant under G. If $y = g(x)$, then the differential
(Dg) of the map $g : X \to X$ is a linear map between
the corresponding tangent spaces, $(Dg) : T_x X \to T_y X$.
The covariance amounts, with $\eta$ the Riemannian
metric in X, to $(\eta^{ij} \partial_j \Phi)(gx) = [(Dg)^i{}_j \eta^{jk} \partial_k \Phi](x)$; this
is also written compactly, with obvious notation, as


$(\nabla\Phi)(gx) = (Dg)[(\nabla\Phi)(x)]$ [2]


(in the case of Euclidean spaces ($\eta = \delta$) and linear
actions described by matrices $T_g$, the covariance
condition reduces to $(\nabla\Phi)(T_g x) = T_g [(\nabla\Phi)(x)]$).
As (Dg) is a linear map, $(\nabla\Phi)(x) = 0$ implies the
vanishing of $\nabla\Phi$ at all points on the G-orbit of x.
We conclude that critical points of a G-invariant
potential come in G-orbits: if x is a critical point for
$\Phi$, then each $y \in G(x)$ is also a critical point for $\Phi$.
We speak therefore of critical orbits for $\Phi$.
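
The covariance of the gradient and the grouping of critical points into orbits can be checked on a concrete case. The sketch below is only an illustration under stated assumptions: the group is the symmetry group of the square acting on the plane, and the invariant quartic `phi` is a hypothetical potential chosen so that its critical points are rational.

```python
from fractions import Fraction

R = ((0, -1), (1, 0))   # rotation by 90 degrees
S = ((1, 0), (0, -1))   # reflection y -> -y

def mat_mul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def act(g, x):
    return (g[0][0] * x[0] + g[0][1] * x[1], g[1][0] * x[0] + g[1][1] * x[1])

def generate(generators):
    """Close the generators under multiplication (finite group assumed)."""
    identity = ((1, 0), (0, 1))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = mat_mul(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def phi(p):
    """Hypothetical invariant potential -2 r^2 + r^4 + 4 x^2 y^2."""
    x, y = p
    r2 = x * x + y * y
    return -2 * r2 + r2 ** 2 + 4 * x * x * y * y

def grad_phi(p):
    """Euclidean gradient of phi, computed by hand."""
    x, y = p
    r2 = x * x + y * y
    return (x * (-4 + 4 * r2 + 8 * y * y), y * (-4 + 4 * r2 + 8 * x * x))

G = generate([R, S])
samples = [(1, 2), (3, -1), (Fraction(1, 3), Fraction(2, 5))]
axis_point = (1, 0)                              # critical point with nontrivial isotropy
diagonal_point = (Fraction(1, 2), Fraction(1, 2))
axis_orbit = {act(g, axis_point) for g in G}
diagonal_orbit = {act(g, diagonal_point) for g in G}
```

Both orbits consist of four critical points each, and neither point is fixed by the whole group, so both are symmetry-breaking critical orbits in the sense used above.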

It is thus possible (thanks to the regularity of the
G-action), and actually convenient, to study spontaneous
symmetry breaking in the orbit space
$\Omega = X/G$ rather than in the carrier manifold X
(Michel 1971).

If G describes physical equivalence, physical states
whose symmetries are G-conjugated should be seen
as physically equivalent. An equivalence class of
isotropy types under conjugation will be said to be a
symmetry type. We are thus interested, given a
G-invariant polynomial $\Phi$, in knowing the symmetry
types of its critical points. We denote symmetry
types as $[H] = \{gHg^{-1}\}$, and say that $[H] < [K]$ if a
group conjugated to H is strictly contained in
a group conjugated to K.

As we have seen, points on the same G-orbit have
the same symmetry type. On the other hand, points
on different G-orbits can have the same isotropy
type (e.g., for the standard action of $O(n)$ in $\mathbf{R}^n$, all
collinear nonzero points will have the same isotropy
subgroup but will lie on distinct group orbits).


G-Invariant Polynomials 


Consider a finite group G acting in X. (Many of the
notions and results mentioned in this section have a
much wider range of applicability.) We look at the
ring of G-invariant scalar polynomials in $x^1, \ldots, x^n$.

By the Hilbert basis theorem, there is a set
$\{J_1(x), \ldots, J_k(x)\}$ of G-invariant homogeneous polynomials
of degrees $\{d_1, \ldots, d_k\}$ such that any
G-invariant polynomial $\Phi(x)$ can be written as a
polynomial in the $\{J_1, \ldots, J_k\}$, that is,


$\Phi(x) = \Psi[J_1(x), \ldots, J_k(x)]$ [3]


with $\Psi$ a polynomial. (A similar theorem holds for
smooth functions.)

The algebra of G-invariant polynomials is finitely
generated, that is, we can choose k finite. When the
$J_a$ are chosen so that none of them can be written as
a polynomial of the others and k has the smallest
possible value (this value depends on G), we say that
they are a minimal integrity basis (MIB). (Note that
some of the $J_a$ could be written as nonpolynomial
functions of the others, and the $J_a$ could satisfy
polynomial relations. For example, consider the
group $Z_2$ acting in $\mathbf{R}^2$ via $g : (x, y) \to (-x, -y)$; an
MIB is made of $J_1(x, y) = x^2$, $J_2(x, y) = y^2$, and
$J_3(x, y) = xy$. None of these can be written as a
polynomial function of the others, but $J_1 J_2 = J_3^2$.) In
this case, we say that the $\{J_a\}$ are a set of basic
invariants for G. There is obviously some arbitrariness
in the choice of the $J_a$ in an MIB, but the
degrees $\{d_1, \ldots, d_k\}$ of $\{J_1, \ldots, J_k\}$ are fixed by G. (In
mathematical terms, they are determined through
the Poincaré series of the graded algebra $P_G$ of
G-invariant polynomials.)
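
The example of the two-element group acting by a simultaneous sign change can be made concrete. The sketch below is illustrative only: the sample invariant polynomial `phi` and the two candidate expressions `psi_a`, `psi_b` are hypothetical choices, used to show that the relation among the basic invariants makes the representation of an invariant polynomial non-unique.

```python
def J(x, y):
    """Basic invariants of the sign-change action (x, y) -> (-x, -y)."""
    return (x * x, y * y, x * y)

def phi(x, y):
    """A sample invariant polynomial: every monomial has even total degree."""
    return x**4 + 3 * x**2 * y**2 + x * y**3

def psi_a(J1, J2, J3):
    """One way of writing phi in terms of the invariants."""
    return J1**2 + 3 * J3**2 + J2 * J3

def psi_b(J1, J2, J3):
    """Another way, differing from psi_a by a multiple of J1*J2 - J3**2."""
    return J1**2 + 3 * J1 * J2 + J2 * J3

points = [(x, y) for x in range(-3, 4) for y in range(-3, 4)]
invariant = all(J(-x, -y) == J(x, y) for x, y in points)
relation_holds = all(J(x, y)[0] * J(x, y)[1] == J(x, y)[2] ** 2 for x, y in points)
```

The two expressions differ as polynomials in the invariants but agree on every point of the plane, which is exactly the effect of a relation of the first kind.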

We will henceforth assume that we have chosen
an MIB, with elements $\{J_1, \ldots, J_k\}$ of degrees
$\{d_1, \ldots, d_k\}$ in x, say with $d_1 \le d_2 \le \cdots \le d_k$.

When the elements of an MIB for G are
algebraically independent, we say that the MIB is
regular; if G admits a regular MIB we say that G is
coregular.

An algebraic relation between elements $J_a$ of the
MIB is said to be a relation of the first kind. The
algebraic relations among the $J_a$ are a set of
polynomials in $\{J_1, \ldots, J_k\}$, which are identically
zero when seen as polynomials in x. If there are
algebraic relations among these, they are called
relations of the second kind, and so on. A theorem
by Hilbert guarantees that the chain of relations has
finite maximal length. (This is the homological
dimension of the graded algebra $P_G$ mentioned
above.)

In the following, we will consider a matrix built
with the gradients of basic invariants, the P-matrix
(Sartori). This is defined as


$P_{ab}(x) = \langle \nabla J_a(x), \nabla J_b(x) \rangle$ [4]


with $\langle \cdot, \cdot \rangle$ the scalar product in $T_x X$.

The gradient of an invariant is necessarily a 
covariant quantity; the scalar product of two 
covariant quantities is an invariant one, and thus 
can be expressed again in terms of the basic 
invariants. Thus, the P-matrix can always be written 
in terms of the basic invariants themselves. 
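
For the sign-change example with invariants x^2, y^2 and xy, the P-matrix can be written out by hand and checked numerically. The sketch below assumes the Euclidean scalar product; the closed form in `p_matrix_from_invariants` is a hand computation for this example and is not taken from the article.

```python
def J(x, y):
    """Basic invariants of the sign-change action (x, y) -> (-x, -y)."""
    return (x * x, y * y, x * y)

def grad_J(x, y):
    """Euclidean gradients of the three basic invariants."""
    return [(2 * x, 0), (0, 2 * y), (y, x)]

def p_matrix(x, y):
    """P_ab = <grad J_a, grad J_b>, evaluated at the point (x, y)."""
    g = grad_J(x, y)
    return [[g[a][0] * g[b][0] + g[a][1] * g[b][1] for b in range(3)]
            for a in range(3)]

def p_matrix_from_invariants(J1, J2, J3):
    """The same matrix written purely in terms of the invariants."""
    return [[4 * J1, 0, 2 * J3],
            [0, 4 * J2, 2 * J3],
            [2 * J3, 2 * J3, J1 + J2]]

points = [(x, y) for x in range(-3, 4) for y in range(-3, 4)]
agree = all(p_matrix(x, y) == p_matrix_from_invariants(*J(x, y)) for x, y in points)
```

Every entry of the matrix is a polynomial in the invariants, as the general argument above requires.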


Geometry of Group Action 


The use of an MIB allows one to introduce a map $J : x \to
\{J_1(x), \ldots, J_k(x)\}$ from X to a subset P of $\mathbf{R}^k$. If
the MIB is regular, $P = \mathbf{R}^k$, while if the $J_a$ satisfy
some relation then $P \subset \mathbf{R}^k$ is the submanifold
satisfying the corresponding relations. The manifold
P is isomorphic to the orbit space $\Omega = X/G$ (the
isomorphism being realized by the J map) and
provides a more convenient framework to study $\Omega$.

As mentioned above, in physical terms we are
mainly interested in the orbit space up to equivalence
of symmetry type. The set of points in X (of
orbits in $\Omega$) with the same symmetry type will be
called a G-stratum in X (a G-stratum in $\Omega$); the
G-stratum of the point x will be denoted as $\sigma(x) \subset X$
(the G-stratum of the orbit $\omega$ as $\Sigma(\omega) \subset \Omega$). (The
notion of stratum was introduced by Whitney in
topology; a stratified manifold is a set which can
be decomposed as the disjoint union of smooth
manifolds of different dimensions, the topological
(or Whitney) strata: $M = \bigcup_k M^k$, with $M^k \subset \partial M^j$ for
all $k < j$.)

It turns out that the G-stratification is compatible
with the topological stratification. Indeed, P is a
semialgebraic (i.e., it is defined by algebraic equalities
and inequalities) stratified manifold in $\mathbf{R}^k$; the
image of any G-stratum in $\Omega$ belongs to a single
topological stratum in P, and topological strata in P
are the union of images of G-strata in $\Omega$.

Moreover, the subgroup relations correspond to
bordering relations between G-strata: if $[G_x] <
[G_y]$, then $\sigma(y) \subset \partial\sigma(x)$ and (with $\omega_x$ the orbit of x)
$\Sigma(\omega_y) \subset \partial\Sigma(\omega_x)$.

There is a stratum, called the principal stratum $\sigma_0$,
which corresponds to minimal isotropy, open and
dense in X; similarly, the principal stratum $\Sigma_0$ is
open and dense in $\Omega$.


Landau Polynomial 


In the Landau (1937) theory of phase transitions,
the state of the system under study is described by a
G-invariant polynomial $\Phi : X \to \mathbf{R}$ having a critical
point in the origin, with at least some of its
coefficients (in particular those controlling the
stability of the zero critical point) depending on
external control parameters (usually, $X = \mathbf{R}^n$ and
$G \subseteq O(n)$; in particular, in solid-state physics G is a
crystallographic group). This should be chosen as
the most general G-invariant polynomial of the
lowest degree $\ell$ sufficient to ensure thermodynamic
stability; in mathematical terms, this amounts to the
requirement that there is some open set B containing
the origin and such that, for all values of the
control parameters, $\nabla\Phi$ points inwards at all points
of $\partial B$ (i.e., B is invariant under the gradient flow
of $\Phi$). If the polynomials in the MIB are of degree
$d_1 \le d_2 \le \cdots \le d_k$, then usually $\ell = 2 d_k$.

The G-invariance of $\Phi$ and the results recalled
above mean that we can always write it in terms of
the polynomials in an MIB for G as in [3],
$\Phi(x) = \Psi[J(x)]$.

The discussion of previous sections shows that we
can study symmetry breakings for $\Phi : X \to \mathbf{R}$ by
studying critical points of $\Psi : P \to \mathbf{R}$; in other words,
Landau theory can be worked out in the G-orbit
space $\Omega := X/G$. The polynomial $\Psi$, which provides
a representation of the Landau polynomial in the
orbit space, will also be called the Landau-Michel
polynomial. (Louis Michel (1923-1999) pioneered
the use of orbit space techniques in physics and
nonlinear dynamics, originally motivated by the
study of hadronic interactions.)




In this way, the evaluation of the map $\Phi : X \to \mathbf{R}$
is, in principle, substituted by evaluation of two
maps, $J : X \to P$ and $\Psi : P \to \mathbf{R}$. However, if, as in
Landau theory, we have to consider the most
general G-invariant polynomial on X, we can just
consider the most general polynomial on P.
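
The simplest instance of this construction is the real line with the two-element group acting by a sign change, where the single basic invariant is J = x^2 and the orbit space is the half-line J >= 0. The sketch below assumes a hypothetical quartic Landau polynomial -mu x^2 + x^4 (so that the orbit-space polynomial is -mu J + J^2) and minimizes it directly on the orbit space; function names are illustrative.

```python
import math

def psi(J, mu):
    """Orbit-space (Landau-Michel) polynomial for the assumed potential."""
    return -mu * J + J * J

def orbit_space_minimum(mu):
    """Minimize psi over the half-line J >= 0.

    The interior critical point is J = mu/2; if it is negative, the minimum
    sits at the boundary point J = 0 (the symmetric state).
    """
    return max(0.0, mu / 2.0)

def critical_orbit(mu):
    """Lift the minimizing J back to the line: the orbit {+sqrt(J), -sqrt(J)}."""
    r = math.sqrt(orbit_space_minimum(mu))
    return {r, -r}

broken = critical_orbit(2.0)       # two equivalent symmetry-breaking minima
symmetric = critical_orbit(-1.0)   # the single symmetric minimum at the origin
```

For positive mu the minimum lies in the interior of the orbit space and lifts to a two-point orbit, while for negative mu it sits on the boundary and lifts to the fixed point of the action.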


Critical Points of the Landau Polynomial 
and Geometry of Orbit Space 


The G-invariance has consequences on the critical
points of $\Phi$. We have already seen one such
consequence: critical points come in G-orbits.

However, this is not all. Indeed, G-invariance
enforces the presence of a certain set $\chi(G) \subseteq X$ of
critical points, and conversely if we look for points
which are critical under any G-invariant potential,
these are precisely the points in $\chi(G)$; the critical
points on $\chi(G)$ correspond to critical orbits which
we call principal critical orbits.

The set $\chi(G)$ can be determined on the basis
of the geometry of the G-action. (A trivial
example is provided by $X = \mathbf{R}$ and $G = Z_2$ acting
via $g : x \to -x$; any even function has a critical point
in zero, and although even functions can, and in general
will, have nonzero critical points, this is the only
critical point common to all the even functions.)
Indeed (Michel 1971): an orbit $\omega$ is a principal
critical orbit if and only if it is isolated in its
stratum.

For the linear orthogonal group actions in $\mathbf{R}^n$
often occurring in physics, no nonzero point or
orbit can be isolated in its stratum. However,
we can quotient out the radial degeneracy and
work on $X = S^{n-1} \subset \mathbf{R}^n$. In this case, a G-orbit $\omega_1$
in $S^{n-1}$ which is isolated in its stratum corresponds
to a one-dimensional family $\{\omega_r\}$ of G-orbits in
$\mathbf{R}^n$ (call $X_0$ the corresponding submanifold in X);
the gradient of $\Phi$ at $x \in X_0$ points along $T_x X_0$.
We can thus reduce to considering the restriction
$\Phi_0$ of the potential $\Phi$ to $X_0$. (See also the
reduction lemma of Golubitsky and Stewart in this
context.)

Correspondingly, if $P_0 \subset P$ is the submanifold in
P image of $X_0$, that is, $P_0 = J(X_0)$, we can reduce to
considering the restriction $\Psi_0$ of $\Psi$ to $P_0$.

As these become one-dimensional problems,
general results are available. In particular, one
can provide general conditions ensuring the
existence of one-dimensional branches of symmetry-breaking
solutions bifurcating from zero along
any such $X_0$ or $P_0$; this is also known as the
equivariant branching lemma of Cicogna and
Vanderbauwhede.
Reduction of the Landau Potential


In realistic problems, $\Phi$ quickly becomes extremely
complicated, that is, it includes a high number of
terms and therefore of coefficients. A thorough
study of different symmetry-breaking patterns, that
is, of the symmetry type of minima of $\Phi$ for different
values of these coefficients and of the external
control parameter, is in this case a prohibitive task.
It is possible to simplify the Landau
polynomial with no loss of generality for the
corresponding physical problem. Indeed, a change
of coordinates in the X space will produce a
formally different, but obviously equivalent,
Landau polynomial; it is convenient to use coordinates
in which the Landau polynomial is simpler.

A systematic and algorithmic reduction procedure,
based on perturbative expansion near the origin, is well
known in dynamical systems theory (Poincaré-Birkhoff
normal forms), and can be adapted to the reduction
of Landau polynomials. (An alternative and more
general, but also much more demanding, approach
is provided by the spectral sequence approach, also
originating in normal-form theory.)

We work near the origin, so that we can assume
$X = \mathbf{R}^n$ (with metric $\eta$), and for simplicity we also
take the case where G acts via a linear representation
$T_g$. We consider changes of coordinates of the
(Poincaré) form


$x^i = y^i + h^i(y)$ [5]

generated by a G-invariant function H: $h^i(y) = \eta^{ij}
(\partial H(y)/\partial y^j)$; this guarantees that [5] preserves the
G-invariance of $\Phi$. The action of [5] on $\Phi$ can be read
from its action on the basic invariants $J_a$. The result is


$J_a(x) = J_a(y) + (\delta J_a)(y)$
$\delta J_a := P_{ab} (\partial H / \partial J_b)$ [6]

Let us now consider the reduction of an invariant
polynomial $\Phi(x) = \Psi(J)$. We write $D_a := \partial/\partial J_a$, and
understand that summation over repeated indices is
implied. In general,


$\Psi(J) \to \Psi(J + \delta J) = \Psi(J) + \frac{\partial \Psi(J)}{\partial J_a} \delta J_a + \cdots$


where the ellipsis means higher-order terms.
Disregarding higher-order terms and using [6] and
[4], we get


$\Psi \to \Psi + \frac{\partial \Psi}{\partial J_a} P_{ab} \frac{\partial H}{\partial J_b} \equiv \Psi + (D_a \Psi) P_{ab} (D_b H)$ [7]
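
The first-order change of the basic invariants under a Poincaré change of coordinates can be checked exactly on the sign-change example with invariants x^2, y^2 and xy. The sketch below is an illustration under stated assumptions: the metric is Euclidean, the generator H is a hypothetical linear combination of the invariants with coefficients `C`, and a bookkeeping parameter `eps` (not present in the article) multiplies the shift so that the linear term can be isolated by a symmetric difference.

```python
from fractions import Fraction

def J(p):
    """Basic invariants x^2, y^2, xy at the point p = (x, y)."""
    x, y = p
    return (x * x, y * y, x * y)

def p_matrix(p):
    """P_ab = <grad J_a, grad J_b> with the Euclidean scalar product."""
    x, y = p
    grads = [(2 * x, 0), (0, 2 * y), (y, x)]
    return [[ga[0] * gb[0] + ga[1] * gb[1] for gb in grads] for ga in grads]

# Hypothetical generator H = c1*J1 + c2*J2 + c3*J3, so that dH/dJ_b = c_b.
C = (1, -2, 3)

def grad_H(p):
    """Gradient of H(x, y) = c1*x^2 + c2*y^2 + c3*x*y."""
    x, y = p
    return (2 * C[0] * x + C[2] * y, 2 * C[1] * y + C[2] * x)

def first_order_shift(p, eps=Fraction(1, 7)):
    """Exact coefficient of eps in J_a(p + eps*h), with h = grad H.

    Since each J_a is quadratic, the symmetric difference removes the
    eps^2 term and returns the linear coefficient without error.
    """
    h = grad_H(p)
    plus = J((p[0] + eps * h[0], p[1] + eps * h[1]))
    minus = J((p[0] - eps * h[0], p[1] - eps * h[1]))
    return tuple((a - b) / (2 * eps) for a, b in zip(plus, minus))

def predicted_shift(p):
    """delta J_a = P_ab * dH/dJ_b, as in [6]."""
    P = p_matrix(p)
    return tuple(sum(P[a][b] * C[b] for b in range(3)) for a in range(3))
```

At every test point the directly computed linear shift coincides with the prediction built from the P-matrix, which is the content of [6] for this example.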


We expand $\Phi$ as a sum of homogeneous polynomials,
and write $\Phi(x) = \sum_k \Phi_k(x)$, where
$\Phi_k(ax) = a^{k+1} \Phi_k(x)$. Also, write $\Psi = \sum_k \Psi_k$, where
$\Phi_k(x) := \Psi_k[J(x)]$.

Under a change of coordinates [5]
generated by $H = H_m$, homogeneous of degree $m + 1$,
the terms $\Psi_k$ with $k \le m$ are not changed, while the
terms $\Psi_{m+p}$ change according to


$\Psi_{m+p} \to \widetilde{\Psi}_{m+p}
= \Psi_{m+p} + (D_a \Psi_p) P_{ab} (D_b H_m) + \cdots$ [8]


We can then operate sequentially with $H_m$ of
degree 3, 4, ...; at each stage (generator $H_m$), we are
not affecting the terms $\Psi_k$ with $k < m$. Moreover,
we can just consider [8], as higher-order terms are
generic and will be taken care of in subsequent
steps. (This procedure requires determining suitable
generating functions $H_m$; these are obtained as
solutions to homological equations.)

In the above, we disregarded the dependence on the
control parameters, such as temperature, pressure,
magnetic field, etc.; that is, we implicitly considered
fixed values for these. However, they have to change
for a phase transition to take place. If we consider a full
range of values (including in particular the critical
ones) for the control parameters, say $\lambda \in \Lambda$, we should
take care that the concerned quantities and operators
are nonsingular uniformly in $\Lambda$.

This leads to reduction criteria for the Landau and
Landau-Michel polynomials (Gufan). Define, for
$a = 1, \ldots, k$, the quantities $U_a(J_1, \ldots, J_k) := (\partial \Psi / \partial J_b) P_{ba}$.


Reduction Criterion 


For $\Phi(x) = \Psi(J_1, \ldots, J_k) : \mathbf{R}^n \to \mathbf{R}$ a G-invariant potential
depending on parameters $\lambda \in \Lambda$, there is a
sequence of Poincaré changes of coordinates such that
$\Phi$ is expressed in the new coordinates y as $\Phi(y) = \Psi(J)$,
where terms which can be written (up to higher-order
terms) uniformly in $\Lambda$ as $\sum_a Q_a(J_1, \ldots, J_k)
U_a(J_1, \ldots, J_k)$, with $Q_a$ polynomials in $J_1, \ldots, J_k$
satisfying the compatibility condition $(\partial Q_a / \partial J_b) =
(\partial Q_b / \partial J_a)$, are not present in $\Psi$.


Nonstationary and Nonvariational 
Problems 


So far we have considered stationary physical states.
In some cases, one is not satisfied with such a
description, and wants to study time evolution. A
model framework for this is provided by the
Ginzburg-Landau equation


$\dot{x} = f(x)$ [9]


where $f = \eta(\nabla\Phi) : X \to TX$ (see above for notation).
In this case, G-invariance of $\Phi$ implies equivariance
of [9]. More generally, we can consider [9] for an
equivariant smooth f (not necessarily a gradient),
that is, $f^i(gx) = (Dg)^i{}_j f^j(x)$.

In this case, one shows that


$f(x) \in T_x \sigma(x)$ [10]


so that closures of G-strata are dynamically invariant,
and the dynamics can be reduced to them. This
is of special interest for the “most singular” strata,
that is, those of lower dimension. The reduction
lemma and the equivariant branching lemma mentioned
above also hold (and were originally formulated)
in this context.

The relation [10] also implies that one can project
the dynamics [9] in X to a smooth dynamics $\dot{p} = F(p)$
in the orbit space; this satisfies $F[J(x)] = (DJ)[f(x)]$.
In the gradient case, this (together with initial
conditions) embodies the full dynamics in X, while
in the generic case one loses all information about
motions along group orbits (note that these correspond
to phonon modes).

An orbit $\omega$ isolated in its stratum is still an orbit of
fixed points for any G-equivariant dynamics in X in
the gradient case, while in the generic case it
corresponds to a fixed point for F and to relative
equilibria (dynamical orbits which belong to a single
group orbit) in X. In this case, time averages of
physical quantities can be G-invariant for nontrivial
relative equilibria.


Extensions and Physical Applications 


We have discussed finite group symmetry breaking 
and focused on polynomial potentials (which can be 
thought of as Taylor expansions around critical 
points). For nonfinite groups, and in particular 
noncompact ones, the situation can be considerably 
more complicated. 


1. An extension of the theory sketched here is 
provided by Palais’ theory, and in particular by 
his “symmetric criticality principle,” which 
applies in Hilbert or Banach spaces of sections 
of a fiber bundle satisfying certain conditions. 
This is especially relevant in connection with field 
theory and gauge groups. 

2. We focused on the situation discussed in classical 
physics. Finite group symmetry breaking is of 
course also relevant in quantum mechanics; 
this is discussed, for example, in the classical 
books by Weyl (1931) and Wigner (1959), and in 
the review by Michel et al. (2004). 

3. One speaks of “explicit symmetry breaking”
when a nonsymmetric perturbation is introduced
in a symmetric problem. In the Hamiltonian
case (or in the Lagrangian one for Noether
symmetries), Hamiltonian symmetries correspond
to conserved quantities, and nonsymmetric
perturbations make these become approximate
constants of motion.

4. The symmetry of differential equations — as well 
as symmetric and symmetry-breaking solutions for 
symmetric equations — can be studied in general 
mathematical terms (see, e.g., Olver (1986)). 

5. Physical applications of the theory discussed here
abound in the literature, in particular through the
Landau theory of phase transitions. A number of
these, together with a deeper discussion of the
underlying theory, are given in the monumental
review paper by Michel et al. (2004).


See also: Central Manifolds, Normal Forms; Compact 
Groups and Their Representations; Electroweak Theory; 
Finite Group Symmetry Breaking; Phase Transitions in 
Continuous Systems; Quasiperiodic Systems; Symmetry 
and Symmetry Breaking in Dynamical Systems; 
Symmetry Breaking in Field Theory. 
Introduction


Finite Weyl systems have their applications in
various branches of quantum information theory.
They are helpful to tame the growth of complexity
for a large class of quantum systems: a key
discrepancy between classical and quantum systems
is the difference in the growth of complexity as one
goes to larger and larger systems. This is encountered
when simulating a quantum spin system on a
computer, for example, with the aim to determine
the ground state of a solid-state model of magnetism.
For a model of N classical spins, this involves
checking the energy for $2^N$ different configurations,
but for a model with quantum spins it requires the
solution of an eigenvalue equation in a Hilbert space
of dimension $2^N$, which is a vastly more difficult
problem for large N. For a three-dimensional lattice,
three sites each way ($N = 27$), this is a problem in
$10^8$ dimensions, and lattice size 4 leads to utterly
intractable $10^{19}$ dimensions.
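
The orders of magnitude quoted above follow from simple arithmetic, sketched here for a cubic lattice with one two-level spin per site (the function name is a hypothetical helper).

```python
def hilbert_space_dimension(sites_per_side):
    """Return (number of spins, Hilbert space dimension) for a cubic lattice
    with one two-level spin per site."""
    n_spins = sites_per_side ** 3
    return n_spins, 2 ** n_spins

small = hilbert_space_dimension(3)   # 27 spins
large = hilbert_space_dimension(4)   # 64 spins
```

Going from three to four sites per side multiplies the dimension by a factor of 2 to the power 37, which is the exponential growth the text refers to.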

It is therefore highly desirable to find ways of 
treating at least some aspects of large, complex 
quantum systems without actually having to write 
out state vectors component by component. States 
which are invariant under a suitable discrete abelian 
symmetry group satisfy this condition. They can be 
characterized by simple combinatorial data, which 
do not grow exponentially with the system size N. 
At the same time, the class of these so-called 
stabilizer states is sufficiently complex to capture 
some of the key features needed for computation, 
especially the quantum correlation (entanglement) 
between subsystems. They have also been shown to 
be sufficient to generate large quantum error 
correcting codes. 

A further motivation for finite Weyl systems is
directly based on constructing quantum error correcting
codes from classical coding procedures (see
Quantum Error Correction and Fault Tolerance).
The “quantization” technique which is used there
naturally leads to the structure of finite Weyl
systems.

Finite Weyl systems precisely represent quantum 
versions of discrete abelian symmetry groups. It is a 
standard procedure to build the quantum version of 
a symmetry group by an appropriate central exten- 
sion, or equivalently, to study all its projective 


representations: the composition of two symmetry 
transformations is only preserved up to a phase on 
the representation Hilbert space. The unitary opera- 
tors which represent the symmetry transformations 
are called Weyl operators. 

The simplest and most prominent example for a 
finite Weyl system is given by the three Pauli 
matrices and the identity. These four unitary 
operators build a projective representation of the 
symmetry group of binary vectors (0,0), (0,1), 
(1,0), (1, 1), where the group law is the addition 
modulo two. The null-vector (0,0) corresponds to 
the identity, the vector (0, 1) is assigned to X, (1,0) 
corresponds to Z, and (1,1) is mapped to iY. It is 
not difficult to verify that the product of two Pauli 
operators preserves the addition of binary vectors up 
to a phase. 
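
The statement can be verified by direct multiplication. In the sketch below the four operators are stored as 2x2 integer matrices (the matrix iY has real entries), and the phase relating each product to the operator assigned to the sum of the labels is found by search; the helper names and the dictionary `W` are illustrative.

```python
I2 = ((1, 0), (0, 1))
X = ((0, 1), (1, 0))
Z = ((1, 0), (0, -1))
iY = ((0, 1), (-1, 0))    # i times the Pauli matrix Y

# Assignment of binary vectors to operators, as described in the text.
W = {(0, 0): I2, (0, 1): X, (1, 0): Z, (1, 1): iY}

def mat_mul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def scale(c, a):
    return tuple(tuple(c * v for v in row) for row in a)

def add(a, b):
    """Addition of binary vectors modulo two."""
    return ((a[0] + b[0]) % 2, (a[1] + b[1]) % 2)

def factor(a, b):
    """Return the phase f with W[a + b] == f * W[a] W[b]."""
    prod = mat_mul(W[a], W[b])
    for phase in (1, -1, 1j, -1j):
        if scale(phase, prod) == W[add(a, b)]:
            return phase
    raise ValueError("product is not a phase multiple of a Weyl operator")

FACTOR = {(a, b): factor(a, b) for a in W for b in W}
```

With this assignment every phase turns out to be +1 or -1, and the phase depends on the order of the two factors.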

Discrete Weyl systems are deeply related to
symplectic geometry for vector spaces over finite
fields. The additive structure of the vector space is
the underlying abelian symmetry group. The
exchange of two Weyl operators within a product
produces a phase that is the exponential of an
antisymmetric bilinear form, as explained in the
next section. For irreducible Weyl systems, this
antisymmetric form must be symplectic because the
Weyl operators generate a full matrix algebra. In
particular, this requires that the dimension of the
underlying vector space is even. The Pauli matrices
are also an example for this more special structure:
the binary vectors $(p, q)_{p,q=0,1}$ are a two-dimensional
vector space over the field with two elements {0, 1}.
The commutation relations for Pauli operators
imply that the symplectic form can be evaluated
for two binary vectors $(p, q), (p', q')$ according to
$pq' - qp' \bmod 2$. It is natural to interpret the
binary vectors (p, q) as points in a discrete phase
space, where the first entry corresponds to the
momentum and the second to the position. In view
of this, discrete Weyl systems serve as a finite-dimensional
analog of the canonical commutation
relations.
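
A short check of this relation for the Pauli case, written as a sketch: the sign picked up when two Weyl operators are exchanged is compared with minus one raised to the value of the symplectic form. The operator table `W` repeats the assignment from the text; the helper names are hypothetical.

```python
I2 = ((1, 0), (0, 1))
X = ((0, 1), (1, 0))
Z = ((1, 0), (0, -1))
iY = ((0, 1), (-1, 0))    # i times the Pauli matrix Y

W = {(0, 0): I2, (0, 1): X, (1, 0): Z, (1, 1): iY}

def mat_mul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def scale(c, a):
    return tuple(tuple(c * v for v in row) for row in a)

def symplectic(a, b):
    """The form p q' - q p' modulo two for a = (p, q), b = (p', q')."""
    (p, q), (p2, q2) = a, b
    return (p * q2 - q * p2) % 2

def commutation_sign(a, b):
    """Return s in {+1, -1} such that W[a] W[b] == s * W[b] W[a]."""
    ab = mat_mul(W[a], W[b])
    ba = mat_mul(W[b], W[a])
    if ab == ba:
        return 1
    if ab == scale(-1, ba):
        return -1
    raise ValueError("operators neither commute nor anticommute")

consistent = all(commutation_sign(a, b) == (-1) ** symplectic(a, b)
                 for a in W for b in W)
```

Two operators commute exactly when the form vanishes on their labels, which is the property later used to define isotropic subgroups.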

For the generic situation in quantum information 
theory, an irreducible Weyl system is represented on 
the Hilbert space describing a system of several 
single particles. Stabilizer states are left unchanged 
under the action of a so-called isotropic subgroup 
which consists of mutually commuting Weyl opera- 
tors: this kind of invariance is precisely the type of 
constraint that reduces the complexity for the 
parametrization of the state. For an efficient
description of such states, there are combinatorial
techniques available, e.g., graph theory.


Operations that preserve the class of stabilizer 
states (for a particular symmetry group) must be 
covariant with respect to this symmetry. These
operations are called Clifford channels, which have
far-reaching applications in the theory of quantum
error correction. They also make it possible to take classical
coding procedures and turn them into quantum
codes: on the classical level, the encoding operation 
acts on classical phase space as a linear map 
(additive code). Up to a choice of phases, this 
induces a quantum channel that preserves the 
structure of Weyl systems. These codes are called 
stabilizer codes and have been investigated by many 
authors (Calderbank et al. 1997, Cleve and Gottesman 
1996, 1997) (see Quantum Error Correction and 
Fault Tolerance). In particular, the first quantum 
error correcting codes belong to this class. 

This article is organized as follows. In the next 
section, the basic mathematical notions are provided, 
like projective representations, Weyl systems, and 
irreducibility. Moreover, statements on the main 
structure of Weyl systems are presented. Next, the 
notion of Weyl covariant channels (Clifford channels) 
is introduced and their basic properties are stated. In 
particular, stronger results for the reversible case are 
given. The relation between symplectic geometry and 
reversible Clifford operations on finite Weyl systems 
is explained. Results on the general structure of 
stabilizer states and stabilizer codes are given in the 
penultimate section. Finally, the representation of 
stabilizer codes in terms of graphs is described. 


Finite Weyl Systems 


A projective representation of a group $\Xi$ assigns to
each group element $\xi$ a unitary operator $w(\xi)$ on a
Hilbert space $\mathcal{H}$ such that the group law is preserved
up to a phase, that is, the relation


$w(\xi_1 + \xi_2) = f(\xi_1, \xi_2) w(\xi_1) w(\xi_2)$ [1]


is fulfilled for a phase-valued function f on $\Xi^2$. In the
following, we denote a projective representation by a
triple $(w, f, \mathcal{H})$. A finite Weyl system is a projective
representation of a finite abelian group. The operators
$w(\xi)$ are called Weyl operators and the function f
is called the factor system. We refer to the work by
Zmud (1971, 1972) for an analysis of projective
representations for general abelian groups.

The Weyl algebra $\mathfrak{A}(w, f, \mathcal{H})$ associated with a
Weyl system $(w, f, \mathcal{H})$ is the smallest norm-closed
subalgebra in the space of bounded operators $B(\mathcal{H})$
which contains all Weyl operators. If the Weyl
algebra coincides with the algebra of all bounded
operators, then the Weyl system is called irreducible.
This is equivalent to the statement that each operator that
commutes with all Weyl operators must be a
multiple of the identity.

In order to analyze the properties of factor
systems systematically, we introduce here a few
pieces of the cohomology theory of groups. For each
positive integer $k = 1, 2, 3, \ldots$ we introduce the
abelian group $C^k(\Xi)$ of $k$-cochains, which consists
of all phase-valued functions on $\Xi^k$. The product
and the inverse of $k$-cochains are defined pointwise.
Factor systems are special 2-cochains. Namely, if we
consider a Weyl system $(w, f, \mathcal{H})$, then associativity
implies that the so-called 2-cocycle condition,

$$f(\xi_1 + \xi_2, \xi_3)\, f(\xi_2, \xi_3)^{-1}\, f(\xi_1, \xi_2 + \xi_3)^{-1}\, f(\xi_1, \xi_2) = 1 \qquad [2]$$

holds. This property can also be expressed by a
coboundary map $\delta$, which is a group homomorphism
from $k$-cochains to $(k+1)$-cochains. We consider here
the action of the coboundary map on a 1-cochain $\varphi$
and a 2-cochain $f$:

$$(\delta\varphi)(\xi_1, \xi_2) := \varphi(\xi_1 + \xi_2)\, \varphi(\xi_1)^{-1}\, \varphi(\xi_2)^{-1} \qquad [3]$$

$$(\delta f)(\xi_1, \xi_2, \xi_3) := f(\xi_1 + \xi_2, \xi_3)\, f(\xi_2, \xi_3)^{-1}\, f(\xi_1, \xi_2 + \xi_3)^{-1}\, f(\xi_1, \xi_2) \qquad [4]$$

The group of 2-cocycles $Z^2(\Xi)$ consists of all
2-cochains $f$ with $\delta f = 1$, and the group of all
2-coboundaries $B^2(\Xi)$ contains all 2-cochains of
the form $f = \delta\varphi$. The 2-fold concatenation of the
coboundary map is the trivial homomorphism
$\delta \circ \delta = 1$, which implies that each 2-coboundary is
a 2-cocycle. The converse is in general not the case
and the 2-cohomology group $H^2(\Xi) := Z^2(\Xi)/B^2(\Xi)$
is nontrivial.
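
The cocycle and coboundary conditions can be checked by brute force on a
small group. The following is a minimal sketch, not part of the original
article: it assumes the group $\mathbb{Z}_3 \times \mathbb{Z}_3$, the illustrative 2-cochain
chi((p,q),(p',q')) = exp(2 pi i p q'/3), and the sign conventions
(delta phi)(x,y) = phi(x+y)/(phi(x) phi(y)) and
(delta f)(x,y,z) = f(x+y,z) f(x,y) / (f(y,z) f(x,y+z)). All function names
are hypothetical.

```python
import cmath
import random
from itertools import product

D = 3                                    # assumed prime order of the cyclic group
XI = list(product(range(D), repeat=2))   # group elements xi = (p, q), n = 1

def add(x, y):
    """Group law on Z_D x Z_D."""
    return tuple((a + b) % D for a, b in zip(x, y))

def chi(x, y):
    """Illustrative 2-cochain chi((p,q),(p',q')) = exp(2*pi*i*p*q'/D)."""
    return cmath.exp(2j * cmath.pi * x[0] * y[1] / D)

def delta1(phi):
    """Coboundary of a 1-cochain: (x, y) -> phi(x+y) / (phi(x) * phi(y))."""
    return lambda x, y: phi(add(x, y)) / (phi(x) * phi(y))

def delta2(f):
    """Coboundary of a 2-cochain (assumed convention, see lead-in)."""
    return lambda x, y, z: (f(add(x, y), z) * f(x, y)
                            / (f(y, z) * f(x, add(y, z))))

def is_trivial(c3, tol=1e-9):
    """True if the 3-cochain c3 equals 1 on all triples of group elements."""
    return all(abs(c3(x, y, z) - 1) < tol
               for x in XI for y in XI for z in XI)

# A random phase-valued 1-cochain, stored as a lookup table.
rng = random.Random(0)
table = {x: cmath.exp(2j * cmath.pi * rng.random()) for x in XI}

def phi(x):
    return table[x]

chi_is_cocycle = is_trivial(delta2(chi))                 # chi satisfies delta chi = 1
delta_squared_is_one = is_trivial(delta2(delta1(phi)))   # delta(delta phi) = 1
```

Both flags evaluate to `True` for this choice, which illustrates that every
2-coboundary is a 2-cocycle; the sketch says nothing about whether a given
cocycle is a coboundary.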

The analysis of Zmud (1971, 1972) shows that the
set of Weyl systems is characterized by elements of
the 2-cohomology group $H^2(\Xi)$. The multiplication of a
Weyl system $(w, f, \mathcal{H})$ by a 1-cochain $\varphi$ yields a new
family of Weyl operators $(\varphi w)(\xi) = \varphi(\xi) w(\xi)$. The
2-cocycle $f$ is altered by multiplication with the
2-coboundary $\delta\varphi$, and the new Weyl system is given
by $(\varphi w, \delta\varphi f, \mathcal{H})$. This kind of transformation does
not change the cohomology class of the factor system,
and the corresponding Weyl algebras coincide:
$\mathfrak{A}(w, f, \mathcal{H}) = \mathfrak{A}(\varphi w, \delta\varphi f, \mathcal{H})$. Thus, the fundamental
properties of a Weyl system depend only on the
cohomology class of the factor system. In particular,
if the factor system $f = \delta\varphi$ is a 2-coboundary, then
we can trivialize the Weyl system $(w, \delta\varphi, \mathcal{H})$ by
multiplying with the inverse 1-cochain $\varphi^{-1}$, and we obtain
a true unitary representation $(\varphi^{-1} w, 1, \mathcal{H})$. The
corresponding Weyl algebra $\mathfrak{A}(w, \delta\varphi, \mathcal{H})$ is abelian.
The relation between cohomology and Weyl systems
can be made even more precise by the following
theorem:


Theorem 1 (Zmud 1971, 1972). Let $\theta$ be the group
homomorphism on 2-cochains that exchanges the
variables: $(\theta f)(\xi_1, \xi_2) = f(\xi_2, \xi_1)$.

(i) The antisymmetric part $f^{-1}(\theta f)$ of a factor
system (2-cocycle) is an antisymmetric bicharacter,
that is, a group homomorphism in both
arguments keeping the other variable fixed.

(ii) Each symmetric 2-cocycle $f = \theta f$ is a
2-coboundary $f = \delta\varphi$.

(iii) The group of antisymmetric bicharacters on $\Xi$ is
isomorphic to the 2-cohomology group $H^2(\Xi)$.
For each antisymmetric bicharacter $\sigma$ the corresponding
2-cohomology class is uniquely determined
by $\sigma = f^{-1}(\theta f)$ for some representative
$f \in Z^2(\Xi)$.


Example 2 The following Weyl system describes
$n$ quantum digits (qudits for short). The system's
Hilbert space is spanned by orthonormal vectors
$|a\rangle = |a_1, a_2, \ldots, a_n\rangle$ which are labeled by vectors $a$
of the additive group $F^n$, where $F = \mathbb{Z}_d$ is the cyclic
field of prime order $d$. A projective representation
$(w, \chi, \mathbb{C}^{d^n})$ of the additive group $F^{2n}$ is given by

$$w(p, q)|a\rangle = e^{-(2\pi i/d)\, p^t a}\, |a + q\rangle \qquad [5]$$

where $p^t$ is the transposed vector. The factor system
$\chi$ assigns to each pair $(p, q)$, $(p', q')$ the phase

$$\chi(p, q|p', q') := e^{(2\pi i/d)\, p^t q'} \qquad [6]$$

The finite vector space $F^{2n}$ is interpreted as a finite
phase space with a multiplicative symplectic form $\sigma$,
the antisymmetric part $\chi^{-1}(\theta\chi)$ of the factor system.
It assigns to a pair of vectors $(p, q)$, $(a, b)$ the phase

$$\sigma(p, q|a, b) = e^{(2\pi i/d)(q^t a - p^t b)} \qquad [7]$$

The commutation relations for Weyl operators
involve the symplectic form:

$$w(p, q)\, w(a, b) = \sigma(p, q|a, b)\, w(a, b)\, w(p, q) \qquad [8]$$

The $d^{2n}$ Weyl operators $w(p, q)$ are a basis of
the algebra of all operators acting on the Hilbert
space $\mathbb{C}^{d^n}$, hence $(w, \chi, \mathbb{C}^{d^n})$ is irreducible. In
particular, this Weyl system is a nice error basis in
the sense of Klappenecker and Roetteler (2002,
2005). Namely, the Weyl operators form a projective
representation, on the one hand, and a unitary
basis (Werner 2001) on the other.

For $d = 2$ and $n = 4$, we obtain a system of four
qubits and the Weyl operators are tensor products of
four Pauli matrices including the identity. For
instance, the Weyl operator of the binary vector
$(p, q) = (0011, 1010)$ can be expressed in terms of
Pauli matrices (see Introduction) as follows:

$$w(0011, 1010) = w(0,1) \otimes \mathbb{1} \otimes w(1,1) \otimes w(1,0) = -i\, X \otimes \mathbb{1} \otimes Y \otimes Z \qquad [9]$$
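
The relations of Example 2 lend themselves to a direct numerical check. The
sketch below is an illustration only and rests on stated assumptions: the
Weyl operators are taken in the convention
w(p,q)|a> = exp(-2 pi i p.a/d)|a+q>, with factor system
chi = exp(2 pi i p.q'/d) and symplectic form
sigma(p,q|a,b) = exp(2 pi i (q.a - p.b)/d). The helper names (`weyl`,
`check_weyl_relations`, and so on) are hypothetical, and matrices are plain
nested lists to keep the example dependency-free.

```python
import cmath
from itertools import product

def weyl(p, q, d):
    """Matrix of w(p, q) on n qudits: w(p,q)|a> = exp(-2*pi*i*p.a/d) |a+q>."""
    basis = list(product(range(d), repeat=len(p)))
    index = {a: i for i, a in enumerate(basis)}
    mat = [[0j] * len(basis) for _ in basis]
    for a in basis:
        shifted = tuple((ai + qi) % d for ai, qi in zip(a, q))
        dot = sum(pi * ai for pi, ai in zip(p, a))
        mat[index[shifted]][index[a]] = cmath.exp(-2j * cmath.pi * dot / d)
    return mat

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def close(A, B, tol=1e-9):
    return all(abs(x - y) < tol for ra, rb in zip(A, B) for x, y in zip(ra, rb))

def kron(A, B):
    """Kronecker product, first factor most significant."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def check_weyl_relations(d, n):
    """Check the composition law and the commutation relation for all pairs."""
    vecs = list(product(range(d), repeat=n))
    for p, q, a, b in product(vecs, repeat=4):
        w1, w2 = weyl(p, q, d), weyl(a, b, d)
        w12 = matmul(w1, w2)
        total = weyl(tuple((x + y) % d for x, y in zip(p, a)),
                     tuple((x + y) % d for x, y in zip(q, b)), d)
        chi = cmath.exp(2j * cmath.pi * dot(p, b) / d)
        sigma = cmath.exp(2j * cmath.pi * (dot(q, a) - dot(p, b)) / d)
        if not close(total, scale(chi, w12)):
            return False
        if not close(w12, scale(sigma, matmul(w2, w1))):
            return False
    return True

# Four-qubit example: (p, q) = (0011, 1010) compared with a Pauli string.
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
pauli_string = scale(-1j, kron(kron(kron(X, I2), Y), Z))
four_qubit_weyl = weyl((0, 0, 1, 1), (1, 0, 1, 0), 2)
```

With these conventions `check_weyl_relations(3, 1)` returns `True`, and
`four_qubit_weyl` agrees with `pauli_string`, reflecting that w(1,1) = XZ = -iY
for a single qubit.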


Clifford Channels 


Weyl systems can be seen as quantized symmetries
corresponding to finite abelian groups. In the
Heisenberg picture the symmetry transformations
act on operators $A \in B(\mathcal{H})$ of the observable algebra
by automorphisms (reversible quantum channels):

$$\mathrm{Ad}[w(\xi)](A) := w(\xi)\, A\, w(\xi)^* \qquad [10]$$

Since a projective representation preserves the group
law up to a phase, the corresponding automorphisms
preserve the group law:

$$\mathrm{Ad}[w(\xi)] \circ \mathrm{Ad}[w(\eta)] = \mathrm{Ad}[w(\xi + \eta)] \qquad [11]$$

A quantum channel $T$ is called a Clifford channel if
it is covariant with respect to Weyl systems
$(w_1, f_1, \mathcal{H}_1)$ and $(w_2, f_2, \mathcal{H}_2)$, that is, the intertwiner
relation

$$T \circ \mathrm{Ad}[w_2(\xi)] = \mathrm{Ad}[w_1(\xi)] \circ T \qquad [12]$$

holds. It is required that the antisymmetric parts of
the factor systems $f_1$ and $f_2$ coincide, that is,
$f_1^{-1}(\theta f_1) = f_2^{-1}(\theta f_2)$. We call $(w_1, f_1, \mathcal{H}_1)$ the input
and $(w_2, f_2, \mathcal{H}_2)$ the output system. We refer to the
article by Scutaru (1979), which is concerned with
the general properties of covariant channels.

It is a natural question to ask how Clifford
channels act on Weyl operators. As shown by
Holevo (n.d.), a Clifford channel maps Weyl
operators of the output system to multiples of
Weyl operators of the input system, provided the
input system is irreducible.


Theorem 3 (Holevo (n.d.)) Let $T$ be a Clifford
channel such that the input system $(w_1, f_1, \mathcal{H}_1)$ is
irreducible. Then there exists a function $\varphi: \Xi \to \mathbb{C}$
such that

$$T(w_2(\xi)) = \varphi(\xi)\, w_1(\xi) \qquad [13]$$

holds for all $\xi \in \Xi$. The function $\varphi$ is of positive
type, that is, for all complex functions $f$ on $\Xi$ the
inequality

$$0 \le \sum_{\xi, \eta \in \Xi} \varphi(\xi - \eta)\, \overline{f(\xi)}\, f(\eta) \qquad [14]$$

holds. Conversely, if the factor systems $f_1 = f_2$
coincide, then a well-defined channel is determined
by [13] for any function $\varphi$ of positive type with
$\varphi(0) = 1$.


We apply Theorem 3 to a reversible Clifford
channel $T$. Each output Weyl operator $w_2(\xi)$ is
mapped to a multiple of an input Weyl operator

$$T(w_2(\xi)) = \varphi(\xi)\, w_1(\xi) \qquad [15]$$

where $\varphi$ is phase-valued (a 1-cochain) according to
the reversibility of $T$. We focus now on the converse
problem: construct all reversible Clifford channels
for irreducible Weyl systems that have a common
antisymmetric part of the factor system. The
following theorem gives a useful characterization
of reversible Clifford channels.


Theorem 4 (Schlingemann and Werner 2001). If
$(w_1, f_1, \mathcal{H}_1)$ and $(w_2, f_2, \mathcal{H}_2)$ are irreducible Weyl
systems with $f_1^{-1}(\theta f_1) = f_2^{-1}(\theta f_2)$, then there exists a
1-cochain $\varphi$ with coboundary $\delta\varphi = f_1^{-1} f_2$, and a
reversible Clifford channel $T_\varphi$ is determined by

$$T_\varphi(w_2(\xi)) = \varphi(\xi)\, w_1(\xi) \qquad [16]$$

If $\tau$ is a 1-cochain that also satisfies $\delta\tau = f_1^{-1} f_2$, then
there exists $\eta \in \Xi$ such that

$$\tau(\xi) = \sigma(\eta|\xi)\, \varphi(\xi) \qquad [17]$$

$$T_\tau = \mathrm{Ad}[w_1(\eta)] \circ T_\varphi = T_\varphi \circ \mathrm{Ad}[w_2(\eta)] \qquad [18]$$

holds. In other words, two irreducible Weyl systems
determine a reversible Clifford channel up to a
“phase space translation $\eta$.”


We consider the Weyl system $(w, f, \mathcal{H})$ over a
discrete phase space $F^{2n}$, where $F$ is a finite field of
prime order. The group of symplectic transformations
$\mathrm{Sp}(n, F)$ consists of all $F$-linear maps $s$ on the
phase space $F^{2n}$ that preserve the symplectic form
$\sigma = f^{-1}(\theta f)$. A further Weyl system $(w \circ s, f \circ s, \mathcal{H})$ is
obtained for each symplectic transformation $s$. Here
the factor system $f \circ s$ is defined according to
$(f \circ s)(\xi, \eta) := f(s\xi, s\eta)$ and the corresponding Weyl
operators are $(w \circ s)(\xi) = w(s\xi)$. Obviously, the
antisymmetric part of the factor system $f \circ s$ is the
symplectic form $\sigma \circ s = \sigma$. The following statement
is a direct consequence of Theorem 4.


Corollary 5 For each symplectic transformation
$s \in \mathrm{Sp}(n, F)$ there exists a 1-cochain $\varphi$ with
coboundary $\delta\varphi = f\,(f \circ s)^{-1}$ and the corresponding
reversible Clifford channel $T_{[\varphi, s]}$ is given by

$$T_{[\varphi, s]}(w(\xi)) = \varphi(\xi)\, w(s\xi) \qquad [19]$$

with $\xi \in F^{2n}$.


Example 6 We consider a finite field $F$. To a
symmetric matrix $\Gamma \in M_n(F)$ we associate the
symplectic transformation on $F^{2n}$ that maps a
phase space vector $(p, q)$ to $(p - \Gamma q, q)$. This shear
transformation is viewed as one elementary step of a
discrete dynamics. The quantized version of this
dynamics is given by the unitary multiplication
operator

$$u(\Gamma)|q\rangle = \zeta_d^{\, q^t \Gamma q}\, |q\rangle \qquad [20]$$

with the root of unity $\zeta_d = \exp(i\pi(d+1)/d)$ for
$d \neq 2$ and $\zeta_2 = i$. The unitary operator $u(\Gamma)$
implements a reversible Clifford operation for the
symplectic transformation $(p, q) \mapsto (p - \Gamma q, q)$
since the relation

$$u(\Gamma)\, w(p, q)\, u(\Gamma)^* = \zeta_d^{\, q^t \Gamma q}\, w(p - \Gamma q, q) \qquad [21]$$

holds. The symmetric matrix $\Gamma$ describes a pattern
of two-qudit interactions. This can be visualized by
a graph $\Gamma$ whose vertices are the positions
$x, y = 1, \ldots, n$. Two vertices $x, y$ are connected by
an edge if the matrix element $\Gamma_{xy} \neq 0$ is nonvanishing.
The value of the matrix element $\Gamma_{xy}$ is
interpreted as the strength of the interaction.


Example 7 The second type of symplectic transformations,
which is relevant here, is determined by
an invertible matrix $C \in M_n(F)$. It induces a
symplectic transformation which maps the vector
$(p, q)$ to $(Cq, -\hat{C}p)$, where $\hat{C}$ is the inverse of the
transpose of $C$. This is implemented by a unitary
transformation $F_{[C]}$. It is called the Fourier transform
associated with the invertible matrix $C$:

$$F_{[C]}|p\rangle = \frac{1}{\sqrt{d^n}} \sum_{q \in F^n} e^{-(2\pi i/d)\, q^t C p}\, |q\rangle \qquad [22]$$

By construction, the relation

$$F_{[C]}\, w(p, q)\, F_{[C]}^* = e^{(2\pi i/d)\, p^t q}\, w(Cq, -\hat{C}p) \qquad [23]$$

follows. If $C = \mathrm{diag}(c_1, \ldots, c_n)$ is a diagonal matrix,
then $F_{[C]}$ is a local unitary transformation. In fact,
the Fourier transform is a tensor product

$$F_{[C]} = F_{[c_1]} \otimes F_{[c_2]} \otimes \cdots \otimes F_{[c_n]} \qquad [24]$$

with $c_x \in F \setminus \{0\}$, where the tensor product structure is
determined by $|q\rangle = |q_1\rangle \otimes \cdots \otimes |q_n\rangle$.
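
The two kinds of symplectic transformation in Examples 6 and 7 can be tested
numerically for a small system. The following sketch is illustrative and makes
several assumptions that are not fixed by the text: Weyl operators in the
convention w(p,q)|a> = exp(-2 pi i p.a/d)|a+q>, the Fourier kernel
exp(-2 pi i b^T C a/d), two qutrits (d = 3, n = 2), and the particular matrices
`G`, `C` and `C_HAT`. All helper names are hypothetical.

```python
import cmath
from itertools import product

def weyl(p, q, d):
    """Matrix of w(p, q): w(p,q)|a> = exp(-2*pi*i*p.a/d) |a+q> (assumed convention)."""
    basis = list(product(range(d), repeat=len(p)))
    index = {a: i for i, a in enumerate(basis)}
    mat = [[0j] * len(basis) for _ in basis]
    for a in basis:
        shifted = tuple((ai + qi) % d for ai, qi in zip(a, q))
        phase = sum(pi * ai for pi, ai in zip(p, a))
        mat[index[shifted]][index[a]] = cmath.exp(-2j * cmath.pi * phase / d)
    return mat

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dagger(A):
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A[0]))]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def close(A, B, tol=1e-9):
    return all(abs(x - y) < tol for ra, rb in zip(A, B) for x, y in zip(ra, rb))

def form(x, M, y):
    """Integer bilinear form x^T M y (no reduction mod d)."""
    return sum(x[i] * M[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))

def apply(M, x, d):
    """Matrix-vector product over Z_d."""
    return tuple(sum(M[i][j] * x[j] for j in range(len(x))) % d for i in range(len(M)))

def zeta(d):
    return 1j if d == 2 else cmath.exp(1j * cmath.pi * (d + 1) / d)

def shear_unitary(G, d):
    """Diagonal unitary u(G)|a> = zeta^(a^T G a) |a>."""
    basis = list(product(range(d), repeat=len(G)))
    return [[zeta(d) ** form(a, G, a) if i == j else 0j
             for j in range(len(basis))] for i, a in enumerate(basis)]

def fourier(C, d):
    """F|a> = d^(-n/2) sum_b exp(-2*pi*i*b^T C a/d) |b> (assumed kernel)."""
    basis = list(product(range(d), repeat=len(C)))
    norm = d ** (-len(C) / 2)
    return [[norm * cmath.exp(-2j * cmath.pi * form(b, C, a) / d) for a in basis]
            for b in basis]

D = 3
G = [[1, 2], [2, 0]]          # symmetric interaction matrix (illustrative)
C = [[1, 1], [0, 1]]          # invertible over Z_3 (illustrative)
C_HAT = [[1, 0], [2, 1]]      # inverse of the transpose of C, mod 3
VECS = list(product(range(D), repeat=2))

def check_shear():
    U = shear_unitary(G, D)
    for p, q in product(VECS, repeat=2):
        lhs = matmul(matmul(U, weyl(p, q, D)), dagger(U))
        p_new = tuple((pi - gi) % D for pi, gi in zip(p, apply(G, q, D)))
        if not close(lhs, scale(zeta(D) ** form(q, G, q), weyl(p_new, q, D))):
            return False
    return True

def check_fourier():
    F = fourier(C, D)
    for p, q in product(VECS, repeat=2):
        lhs = matmul(matmul(F, weyl(p, q, D)), dagger(F))
        q_new = tuple(-x % D for x in apply(C_HAT, p, D))
        phase = cmath.exp(2j * cmath.pi * sum(a * b for a, b in zip(p, q)) / D)
        if not close(lhs, scale(phase, weyl(apply(C, q, D), q_new, D))):
            return False
    return True
```

Under these assumptions both `check_shear()` and `check_fourier()` return
`True`. A different sign convention for the Weyl operators or the Fourier
kernel would change the signs in the transformed phase space vector, so the
sketch should be read as one consistent choice rather than the only one.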


The Stabilizer Formalism 


This section is dedicated to the stabilizer formalism,
which has been widely discussed in the literature
(Calderbank et al. 1997, Gottesman 1996, 1997).
We investigate here stabilizer codes from the point of
view of symmetries and show how they can be
characterized by Clifford channels. We verify that
stabilizer codes are specific Clifford channels in the
sense described in the last section. To begin with, we
consider an irreducible Weyl system $(w, f, \mathcal{H})$ of an
even-dimensional $F$-vector space $\Xi$ such that the
antisymmetric part of the factor system $\sigma := f^{-1}(\theta f)$ is
a symplectic form on $\Xi$. Furthermore, we need to
introduce the following notions:

The symplectic complement of a subspace $Q \subset \Xi$
is the subspace

$$Q^\sigma = \{\xi \in \Xi \mid \sigma(\xi|q) = 1 \;\; \forall q \in Q\} \qquad [25]$$

Furthermore, a subspace $Q$ of $\Xi$ is isotropic if it is
contained in its symplectic complement, $Q^\sigma \supset Q$. In
other words, for all pairs of vectors $q, q' \in Q$ we
have $\sigma(q|q') = 1$.
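
For small phase spaces, symplectic complements and isotropy can be determined
by enumeration. The sketch below is only an illustration: it works with the
additive form s((p,q),(a,b)) = q.a - p.b mod d, so that sigma = 1 exactly when
s = 0, stores a phase space vector as the tuple (p_1,...,p_n,q_1,...,q_n), and
uses hypothetical helper names. The two-qubit subspace spanned by (11,00) and
(00,11) is an assumed example.

```python
from itertools import product

def symp(x, y, d):
    """Additive symplectic form q.a - p.b mod d for x = (p, q), y = (a, b)."""
    n = len(x) // 2
    p, q, a, b = x[:n], x[n:], y[:n], y[n:]
    return (sum(qi * ai for qi, ai in zip(q, a))
            - sum(pi * bi for pi, bi in zip(p, b))) % d

def span(generators, d):
    """All linear combinations of the generators over Z_d (brute force)."""
    size = len(generators[0])
    return {tuple(sum(c * g[i] for c, g in zip(coeffs, generators)) % d
                  for i in range(size))
            for coeffs in product(range(d), repeat=len(generators))}

def complement(Q, d):
    """Symplectic complement: all vectors with trivial form against every q in Q."""
    size = len(next(iter(Q)))
    return {x for x in product(range(d), repeat=size)
            if all(symp(x, q, d) == 0 for q in Q)}

def is_isotropic(Q, d):
    """Q is isotropic if it is contained in its symplectic complement."""
    return Q <= complement(Q, d)

# Two qubits: p = 11 (Z on both sites) and q = 11 (X on both sites).
Q_bell = span([(1, 1, 0, 0), (0, 0, 1, 1)], 2)
# A non-isotropic subspace: Z and X on the same site do not commute.
Q_bad = span([(1, 0, 0, 0), (0, 0, 1, 0)], 2)
```

Here `Q_bell` is isotropic and equals its own complement, so it is maximally
isotropic, while `Q_bad` fails the isotropy test.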

We consider an isotropic subspace $Q$ and we
denote by $(w|_Q, f|_Q, \mathcal{H})$ the corresponding restriction
of the Weyl system $(w, f, \mathcal{H})$. Since $Q$ is isotropic, it
follows that the restriction $f|_Q$ is symmetric. Hence,
the Weyl algebra for the restricted system
$\mathfrak{A}_Q := \mathfrak{A}(w|_Q, f|_Q, \mathcal{H})$ is an abelian subalgebra of $B(\mathcal{H})$. As
a consequence, all the operators in $\mathfrak{A}_Q$ can be
diagonalized simultaneously. To obtain the joint
spectral resolution for all operators in $\mathfrak{A}_Q$, we employ
some facts from the theory of finite-dimensional
abelian C*-algebras:


1. $\mathfrak{A}_Q$ is a finite-dimensional abelian C*-algebra and
can be identified with the algebra of complex
functions $C(Q^\wedge)$ on a finite set $Q^\wedge$.

2. Each element $\varpi \in Q^\wedge$ is a character (pure state),
that is, a linear functional such that
$\varpi(AB) = \varpi(A)\varpi(B)$ and $\varpi(A^*) = \overline{\varpi(A)}$.

3. For each operator $A \in \mathfrak{A}_Q$ there exists precisely
one function $f_A$ on $Q^\wedge$ which is uniquely
determined by $\varpi(A) = f_A(\varpi)$. The isomorphism
$A \mapsto f_A$ is called the Gelfand isomorphism.

4. A character $\varpi \in Q^\wedge$ is an irreducible representation
of $\mathfrak{A}_Q$ and there is a unique projection $e_\varpi$
onto the subspace in $\mathcal{H}$ which carries this
irreducible representation.


From these facts we derive a joint spectral
resolution for all operators in $\mathfrak{A}_Q$. Namely, each
$A \in \mathfrak{A}_Q$ can be written as

$$A = \sum_{\varpi \in Q^\wedge} e_\varpi\, \varpi(A) \qquad [26]$$


We are now prepared to introduce the notion of
stabilizer codes in accordance with Calderbank et al.
(1997) and Gottesman (1996, 1997): Let $Q$ be an
isotropic subspace in $\Xi$ and let $\varpi \in Q^\wedge$ be a character
of $\mathfrak{A}_Q$. The projection $e_\varpi$ is called a stabilizer code.
The abelian group that is generated by the Weyl
operators $w(q)$, $q \in Q$, is called the stabilizer group. The
abelian C*-algebra $\mathfrak{A}_Q$ is called the stabilizer algebra.
According to the following theorem, each stabilizer
code is uniquely associated with a Clifford channel:


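Theorem 8 (Schlingemann 2002, 2004). Let $Q$ be
an isotropic subspace of $\Xi$ and let $e_\varpi$ be the
stabilizer code of a character $\varpi$. Then there exists
a unique Clifford channel $E_\varpi$ with input system
$(w_\varpi, f_\varpi, \mathcal{H}_\varpi)$ and output system $(w|_{Q^\sigma}, f|_{Q^\sigma}, \mathcal{H})$ such
that the following is true:

(i) For each $\xi \in \Xi$ the identity

$$E_\varpi(w(\xi)) = \delta_{Q^\sigma}(\xi)\, w_\varpi(\xi) \qquad [27]$$

is fulfilled.

(ii) Let $v_\varpi: \mathcal{H}_\varpi \to \mathcal{H}$ be the isometry which embeds
$\mathcal{H}_\varpi$ into $\mathcal{H}$, then

$$E_\varpi(A) = v_\varpi^*\, A\, v_\varpi \qquad [28]$$

holds for all $A \in B(\mathcal{H})$.

(iii) The channel $E_\varpi$ is invariant under translations
in the isotropic subspace $Q$, that is, the identity

$$E_\varpi \circ \mathrm{Ad}[w(q)] = E_\varpi \qquad [29]$$

holds for all $q \in Q$.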


Stabilizer codes for maximally isotropic subspaces
$Q = Q^\sigma$ are special, since the projection $e_\varpi$ onto the
eigenspace of the character $\varpi$ is one-dimensional.
Thus, $e_\varpi$ is the density matrix of a pure state which is
called a stabilizer state. In view of Theorem 8, the
expectation value of a Weyl operator $w(\xi)$ is given by

$$\mathrm{tr}(e_\varpi\, w(\xi)) = \varpi(w(\xi))\, \delta_Q(\xi) \qquad [30]$$


Representation by Graphs 


As described in the previous section (Theorem 8),
each stabilizer code is a pure Clifford channel
which is completely determined by an isotropic
subspace and a character of the corresponding
stabilizer algebra. A constructive characterization
of isotropic subspaces can be given in terms of
graphs, as has been shown in Schlingemann (2002,
2004). The complete description of a stabilizer code
requires in addition the choice of a character of the
stabilizer algebra. Both data, the isotropic subspace
and the character, can be encoded in a single graph
$\Lambda$. The set of vertices $N$ is partitioned into four
different types, the input vertices $I$, the output
vertices $J$, the measurement vertices $K$ and the
syndrome vertices $L$ (see Figure 1). The edges of
the graph are undirected, and a pair of vertices can
be connected by at most $d - 1$ edges, where self-links
are also allowed. The adjacency matrix (also
denoted by $\Lambda$) is a symmetric matrix with entries
$\Lambda_{xy} = 0, 1, \ldots, d-1$ according to the number of
edges between $x$ and $y$. Thus, the adjacency matrix
can be seen as a linear operator on $F^N$ with cyclic
field $F = \mathbb{Z}_d$. Each subset $A \subset N$ corresponds to a
linear projection onto the subspace $F^A \subset F^N$, which
we denote by $\pi_A$. For a convenient description we
introduce the following notation: the union of two
sets of vertices is written without the symbol $\cup$, that
is, instead of $I \cup J$ we write $IJ$.

Figure 1 (a) A graphical representation of a Weyl operator
$-Y \otimes Y \otimes Z \otimes Z \otimes \mathbb{1}$ of the stabilizer algebra of a quantum error
correcting code, encoding one qubit into five (see 00273). The
input vertex is gray, the output vertices are black. Each binary
vector represents a Pauli matrix sitting at a tensor position of the
output system. (b) The expectation values which are products
over all edges, where to each edge with labels $g, g'$ the value
$(-1)^{g g'}$ is assigned. The character corresponds to the syndrome
configuration (1110) (blank vertices).


Theorem 9 (Schlingemann 2002, 2004). Let $Q \subset
F^J \oplus F^J$ be an isotropic subspace and let $\varpi$ be a
character of the stabilizer algebra $\mathfrak{A}_Q$. Then there
exists a graph $\Lambda$ with input vertices $I$, output vertices
$J$, measurement vertices $K$ and syndrome vertices $L$
such that the following holds:


(i) The linear operator v y At x, is invertible. 
(ii) The isotropic subspace Q consists of the vectors 
(T5 À7 kq, Tq) with qE ker(a,A7 yx). 
(iii) There is a unique vector a in the syndrome 
subspace F™ such that the expectation values of 
the character w are given by 


w(w(riAmmq,m;q))—- CA 4*9 ^ — [31] 


with qe ker( 71, A7 jx). 


Theorems 8 and 9 provide different useful 
characterizations of stabilizer codes, namely in 
terms of eigenspaces, Clifford channels, and graphs. 


• The original definition of stabilizer codes in terms of
eigenspaces goes back to Calderbank, Gottesman,
Rains, Shor, and Sloane (see, e.g., Calderbank et al.
(1997), Gottesman (1996, 1997)). They have developed
an approach to derive quantum codes from
classical binary codes.

• Stabilizer codes can also be characterized by
specific Clifford channels (see Theorem 8). The
condition for a channel to be a stabilizer code is
the covariance with respect to a subgroup of
phase space translations. This describes stabilizer
codes in terms of symmetries.

• Theorem 9 yields a characterization of stabilizer
codes in terms of graphs, providing an explicit
expression for the isotropic subspace and the
character of the stabilizer code. This graphical
representation provides a suggestive encoding of
various properties like error-correcting capabilities,
multipartite entanglement, and the effects of
specific local operations. In fact, it has been
shown in Briegel and Raussendorf (2001), Dür
et al. (2003), and Hein et al. (2004) that the
entanglement present in a graph state can be
derived from its shape.


See also: Capacities Enhanced by Entanglement;
Quantum Error Correction and Fault Tolerance.


Introduction


A typical problem of quantum statistical mechanics is 
to compute equilibrium states of quantum dynamical 
systems. However, there is a strange difficulty inherent 
in this task, which is to describe the solution: if we try 
to describe the quantum state by specifying all matrix 
elements of all local density operators, we have a job 
which grows exponentially with the system size. This 
approach is obviously out of the question for the large 
systems statistical mechanics is interested in. Luckily, 
in practice nobody really wants to see all those 
numbers anyway, and one is content with determining 
a few correlation functions, or other easily parame- 
trized characteristics of the state. But for computing a 
state in the first place, we cannot restrict the state
description to such parameters. So the problem
is again: how can we efficiently parametrize the states
of interest?

In this article we collect some results on a 
particular way of addressing this problem. It 
originated in the early 1990s (Fannes et al. 1992b) 
in ideas for quantizing the notion of Markov chains 
(Accardi and Frigerio 1983). Recently, there has 
been a new surge of interest in such ideas, because 
they turned out to be very useful for numerical work 
on quantum spin chains. 

Its typical feature is that one does not directly 
describe expectation values of the state, but instead 
generates the state from a description of its correlations
between neighboring sites. In the language of
quantum information theory, it could be said that 
the method focuses on the entanglement between 
different parts of the system. 


Zmud EM (1971) Symplectic geometries over finite abelian
groups. Mathematics of the USSR Sbornik 15: 7-29.

Zmud E (1972) Symplectic geometry and projective representations
of finite abelian groups. Mathematics of the USSR
Sbornik 16: 1-16.


The Basic Construction
Notation


We consider a quantum spin chain, that is, a system
of infinitely many subsystems, labeled by the integers, each
of which is a quantum-mechanical $d$-level system.
Let us denote the observable algebra at site $x \in \mathbb{Z}$ by
$\mathcal{A}_x$. Each $\mathcal{A}_x$ is hence isomorphic to the $d \times d$
matrices. The observables of the whole (infinite)
system lie in the infinite tensor product
$\mathcal{A}_{\mathbb{Z}} = \bigotimes_{x \in \mathbb{Z}} \mathcal{A}_x$. This is defined as a quasilocal
algebra (Bratteli and Robinson 1987, 1997), which
is to say that it is the algebra generated by all finite
tensor products of elements of the $\mathcal{A}_x$, say $\bigotimes_{x \in \Lambda} A_x$
with $A_x \in \mathcal{A}_x$ and $\Lambda$ finite. Such an element is said
to be localized in $\Lambda$, and we denote by $\mathcal{A}_\Lambda$ the
corresponding algebra. For $\Lambda_1 \subset \Lambda_2$, we identify $\mathcal{A}_{\Lambda_1}$
with a subalgebra of $\mathcal{A}_{\Lambda_2}$ by tensoring with the
identity operator on all sites in $\Lambda_2 \setminus \Lambda_1$. $\mathcal{A}_{\mathbb{Z}}$ is the
completion of the union of all $\mathcal{A}_\Lambda$, with $\Lambda$ finite,
under the C*-norm.

A state $\omega$ on $\mathcal{A}_{\mathbb{Z}}$ is uniquely specified by its
expectations on the subalgebras $\mathcal{A}_\Lambda$. Since these are
finite-dimensional matrix algebras, we can write
$\omega(A) = \mathrm{tr}(\rho_\Lambda A)$ for $A \in \mathcal{A}_\Lambda$, with a “local density
operator" $\rho_\Lambda$. The system of local density operators
must be consistent with respect to restrictions
(partial traces).

So far we have not used the structure of the
underlying lattice $\mathbb{Z}$ in any way. This enters via the
translation automorphisms $\tau_x$ of $\mathcal{A}_{\mathbb{Z}}$, which identify
$\mathcal{A}_y$ with $\mathcal{A}_{x+y}$. A state is called translationally
invariant if $\omega \circ \tau_x = \omega$. The translationally invariant
states form a weakly compact convex subset of the
state space of $\mathcal{A}_{\mathbb{Z}}$, whose extreme points are called
ergodic states.


How to Generate Correlations 


Correlations between parts of a system typically
have their origin in an interaction in the past. Even
if the subsystems are dynamically separated later on,
the correlation persists, and one can take this as a
motivation to model correlations from two ingredients:
a simplified prototype of a correlated system,
and some evolution taking the parts of the simplified
system to the parts of the given system. Let us
consider a composite system, whose parts have
observable algebras $\mathcal{A}_1$ and $\mathcal{A}_2$, respectively, so
that the whole system has algebra $\mathcal{A}_1 \otimes \mathcal{A}_2$. We can
build a state $\omega$ on this system from a simpler one,
say a state $\eta$ on some $\mathcal{B}_1 \otimes \mathcal{B}_2$, and two completely
positive unit preserving maps $T_i: \mathcal{A}_i \to \mathcal{B}_i$ such that

$$\omega(A_1 \otimes A_2) = \eta(T_1(A_1) \otimes T_2(A_2))$$

Some features of $\eta$ are inherited by $\omega$. For example,
when $\eta$ is separable (a convex combination of
products), which is always the case if either $\mathcal{B}_1$ or
$\mathcal{B}_2$ is classical (i.e., an abelian algebra), then the
same holds for $\omega$. Hence, if we want to describe
quantum correlated “entangled" states, we have to
build the correlations on an entangled state $\eta$.
Similarly, the “size” of the model system $\mathcal{B}_1 \otimes \mathcal{B}_2$
limits the strength of correlations in $\omega$. As for every
correlated state, we can look at the linear functionals
on $\mathcal{A}_2$, which are of the form $A \mapsto \omega(A_1 \otimes A)$
with fixed $A_1 \in \mathcal{A}_1$. The dimension of the space of
such functionals might be called the correlation
dimension of $\omega$. This dimension is 1 for product
states, and can clearly not increase by passing from
$\eta$ to $\omega$. Hence, it is bounded by the dimensions of $\mathcal{B}_1$
and $\mathcal{B}_2$, even if $\mathcal{A}_1$ and $\mathcal{A}_2$ are infinite dimensional.
“Finite correlation" in the sense of the title of this
article refers to the finiteness of the correlation
dimension between the two halves of a spin chain.


The VBS Construction, and Matrix Product States 


The so-called valence bond solid (VBS) states on a
chain are constructed by applying these ideas to the
correlations across every link of a spin chain. Let us
introduce a correlated model state $\eta_x$ on some
algebra $\mathcal{B}_x^+ \otimes \mathcal{B}_x^-$ for every bond $(x, x+1)$. Then
the state at site $x$ is a function of contributions from
both bonds connecting it, and we express this by a
completely positive map $T_x: \mathcal{A}_x \to \mathcal{B}_{x-1}^- \otimes \mathcal{B}_x^+$. Then
an observable $A_1 \otimes \cdots \otimes A_L$ on a chain piece of
length $L$ is first mapped by $\bigotimes_{x=1}^{L} T_x$ to an element
of $\mathcal{B}_0^- \otimes \mathcal{B}_1^+ \otimes \mathcal{B}_1^- \otimes \cdots \otimes \mathcal{B}_{L-1}^- \otimes \mathcal{B}_L^+$. Evaluating with the
states $\eta_1 \otimes \cdots \otimes \eta_{L-1}$, we are left with an element of
$\mathcal{B}_0^- \otimes \mathcal{B}_L^+$, which we can evaluate with yet another
state, describing the boundary conditions for the
construction (see Figure 1).

Clearly, if we take the algebras $\mathcal{B}_x^\pm$ large enough,
and the model states $\eta_x$ sufficiently highly entangled,
we can generate every state on the finite chain.
However, we can get an interesting class of states,
even for fixed finite dimensions of the $\mathcal{B}_x^\pm$. By
restricting this correlation dimension, we can set a
level of complexity for the state description. We can
then try to handle a given physical problem first
with simple states of low correlation dimension, and
increase this parameter only as needed. A typical
problem here is to determine the ground state of a
finite-range Hamiltonian. We can then optimize
each $T_x$ and $\eta_x$ separately, minimizing the ground
state energy with all other elements fixed. This is a
semidefinite programming problem, for which very
efficient methods are known. The global minimization
is then done by letting the optimization site $x$
sweep over the whole chain as often as needed.

Figure 1

In a ground-state problem one is looking for a
pure state, and it is therefore sufficient to choose
both the model states $\eta_x$ and the operations $T_x$ to
be pure, that is, without decomposition into sums of
similar objects. The scheme is thus run at the vector
level rather than the operator level: we take the
algebras $\mathcal{B}_x^+ = \mathcal{B}_x^-$ as the operators on a Hilbert space
$\mathcal{K}_x$, and $\eta_x = (\dim \mathcal{K}_x)^{-1} |\Omega_x\rangle\langle\Omega_x|$ with the
(unnormalized) maximally entangled vector

$$\Omega_x = \sum_j |jj\rangle \in \mathcal{K}_x \otimes \mathcal{K}_x \qquad [1]$$


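The maps $T_x$ will be implemented by a single
operator $V_x: \mathcal{K}_{x-1} \otimes \mathcal{K}_x \to \mathcal{H}$ as $T_x(A) = V_x^* A V_x$.
Then the vectors $\Psi \in \mathcal{H}^{\otimes L}$ contributing to the state
on the chain of length $L$ are of the form

$$\Psi = (V_1 \otimes \cdots \otimes V_L)\bigl(|j_0\rangle \otimes \Omega_1 \otimes \cdots \otimes \Omega_{L-1} \otimes |j_L\rangle\bigr) = \sum_{j_1, \ldots, j_{L-1}} (V_1 \otimes \cdots \otimes V_L)\, |j_0 j_1, j_1 j_2, \ldots, j_{L-1} j_L\rangle$$

where $j_0, j_L$ are labels for bases in $\mathcal{K}_0$ and $\mathcal{K}_L$,
describing the possible choices at the boundary, and
we have used the special form of $\Omega_x$. We write out
the operators $V_x$ in components, so that

$$V_x |j' j\rangle = \sum_\mu |\mu\rangle\, V^{\mu}_{x, j' j}$$

with suitable $\dim \mathcal{K}_{x-1} \times \dim \mathcal{K}_x$ dimensional
matrices $V_x^\mu$, in terms of which the above expression
can be interpreted as a matrix product. The
components of $\Psi$ in a product basis $|\mu_1, \ldots, \mu_L\rangle$ become

$$\langle \mu_1, \ldots, \mu_L | \Psi \rangle = \langle j_0 | V_1^{\mu_1} V_2^{\mu_2} \cdots V_L^{\mu_L} | j_L \rangle \qquad [2]$$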


Due to this form the states generated in this way
have also been called “matrix product states"
(Klümper 1991). If one wants to consider periodic
boundary conditions, the indices $j_0$ and $j_L$ can also
be contracted, and the expression becomes a trace.
For some simulations it is also convenient to choose
$\dim \mathcal{K}_0 = \dim \mathcal{K}_L = 1$, so there is only one matrix
element to be considered.
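
To make the matrix product form concrete, the following sketch evaluates the
components of such a vector for a short chain. It is an illustration under
assumptions that are not in the text: site-independent 2 x 2 matrices chosen
so that the periodic (trace) version yields an unnormalized GHZ-type vector,
and hypothetical function names.

```python
from itertools import product

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def chain_product(site_matrices, mus):
    """Ordered product V_1^{mu_1} V_2^{mu_2} ... V_L^{mu_L}."""
    dim = len(site_matrices[0][0])
    result = [[1 if i == j else 0 for j in range(dim)] for i in range(dim)]
    for mats, mu in zip(site_matrices, mus):
        result = matmul(result, mats[mu])
    return result

def amplitude_open(site_matrices, mus, j0, jL):
    """Component <mu_1..mu_L|Psi> = <j0| V_1^{mu_1} ... V_L^{mu_L} |jL>."""
    return chain_product(site_matrices, mus)[j0][jL]

def amplitude_periodic(site_matrices, mus):
    """Periodic boundary conditions: contract j0 with jL, giving a trace."""
    P = chain_product(site_matrices, mus)
    return sum(P[j][j] for j in range(len(P)))

# Illustrative choice: V^0 and V^1 project onto the two bond basis states.
V = [[[1, 0], [0, 0]],   # V^0
     [[0, 0], [0, 1]]]   # V^1
L = 3
sites = [V] * L          # the same matrices at every site

periodic = {mus: amplitude_periodic(sites, mus)
            for mus in product(range(2), repeat=L)}
```

For this choice the only nonvanishing periodic components are those of
`(0, 0, 0)` and `(1, 1, 1)`, while the open-boundary component with
`j0 = jL = 0` is nonzero only for `(0, 0, 0)`. Storing L small matrices
instead of all 2^L components is what keeps the description efficient.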

The scheme for getting ground-state vectors
described here is essentially the same as the density
matrix renormalization group method (Verstraete
et al. 2004). However, the version given here
appears to be more transparent, more flexible, and
in some cases (e.g., periodic boundary conditions)
vastly more efficient. Still, it may be too early
for such a judgment, since this is very much work in
progress (Verstraete et al. 2005).

In the sequel, we will focus not so much on the
numerical aspects, but on the possibility this
construction offers to explicitly construct nontrivial
translationally invariant states on the infinite chain.
Numerically, even in a translation invariant situation
the matrices $V_x^\mu$ obtained by optimization may
turn out to depend on $x$ (Wolf, Private Communication),
that is, one has to admit the possibility of
a spontaneous symmetry breaking. However, for the
construction of states on the infinite chain we will
simply fix all $V_x$ to be equal. In some sense this
turns the matrix product into a matrix power, which
could be analyzed by methods familiar from the
transfer matrix formalism of statistical mechanics.
In eqn [2] this does not work, because of the
$\mu$-dependence of the matrices involved. Nevertheless,
a slight reorganization of the construction will
lead to a transfer-matrix-like formalism.


The Evolution Operator Construction 


Fixing all $T_x$ to be the same in Figure 1 still does not
fix the state uniquely, since both in the mixed state
version and in the pure state version of the
construction some boundary information enters, as
well. This boundary information then has to be
chosen in such a way that a consistent family of
local density operators is generated. It turns out that
by rearranging the construction a little bit one can
trivially solve one boundary condition, and reduce
the other to finding a fixed point of a linear
operator. This rearrangement was first carried out
in Fannes et al. (1992b), where the term “finitely
correlated state" was also coined.

The basic element of the VBS construction was
the operator $T: \mathcal{A} \to \mathcal{B}^- \otimes \mathcal{B}^+$ (here already taken
independent of $x$). This is specified by $\dim \mathcal{A} \cdot
\dim \mathcal{B}^- \cdot \dim \mathcal{B}^+$ matrix elements. However, assuming
we can identify the algebras $\mathcal{B}^\pm$, we can also
consider these matrix elements as those of an
“evolution operator" $E: \mathcal{A} \otimes \mathcal{B} \to \mathcal{B}$. This operator
is once again taken to be completely positive and
unit preserving. We introduce its $n$th iterate
$E^{(n)}: \mathcal{A}^{\otimes n} \otimes \mathcal{B} \to \mathcal{B}$ by the recursion

$$E^{(1)} = E, \qquad E^{(n+1)} = E\,(\mathrm{id}_{\mathcal{A}} \otimes E^{(n)}) \qquad [3]$$

Clearly, these operators are again completely positive
and unit preserving. Another way to express this
iteration is to look at $E$ as a family of maps on $\mathcal{B}$,
parametrized by $A \in \mathcal{A}$: We set $E_A(B) = E(A \otimes B)$,
and find

$$E^{(n)}(A_1 \otimes \cdots \otimes A_n \otimes B) = E_{A_1} \cdots E_{A_n}(B) \qquad [4]$$

An important special role is played by the operator
$\hat{E} = E_{\mathbb{1}}$, which is again completely positive and unit
preserving.

Now given any state $\eta$ on $\mathcal{B}$, we get a state $\omega_n$ on
$\mathcal{A}^{\otimes n}$, by setting

$$\omega_n(A_1 \otimes \cdots \otimes A_n) = \eta\bigl(E^{(n)}(A_1 \otimes \cdots \otimes A_n \otimes \mathbb{1})\bigr) \qquad [5]$$

Since $\hat{E}(\mathbb{1}) = \mathbb{1}$, this family of states is consistent
with respect to increasing $n$, by adding sites on the
right, that is, $\omega_{n+1}(A \otimes \mathbb{1}) = \omega_n(A)$. In other words,
the family $\omega_n$ defines a state on the infinite right
half-chain. This state can be extended to the full
chain, as a translationally invariant state, if and
only if consistency also holds for adding sites
on the left, that is, if $\omega_{n+1}(\mathbb{1} \otimes A) = \omega_n(A)$ for all
$A \in \mathcal{A}^{\otimes n}$. For this we need a condition on the state
$\eta$: it must be invariant under the map $\hat{E}$ (i.e.,
$\eta(\hat{E}(B)) = \eta(B)$ for all $B \in \mathcal{B}$). This is the only
requirement, and we call $\omega$ the state on $\mathcal{A}_{\mathbb{Z}}$ generated
by $E$ and $\eta$. Note that since $\hat{E}$ has the invariant
vector $\mathbb{1}$, its transpose also has an invariant vector,
which can also be chosen as a state. We will often
look at the case of a unique invariant state, in which
case we can call $\omega$ the state generated by $E$, without
having to mention $\eta$.
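
The evolution operator construction can be tried out on a toy model. The
sketch below is an assumption-laden illustration rather than the general
construction: it takes $E$ in a Kraus-type form
E_A(B) = sum over mu, nu of A[mu][nu] V_mu^* B V_nu with
sum over mu of V_mu^* V_mu = 1, represents the state eta by a density matrix
rho with sum over mu of V_mu rho V_mu^* = rho, and picks the specific matrices
V_0 = sqrt(3/4) 1 and V_1 = (1/2) X. All names are hypothetical.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dagger(A):
    return [[complex(A[j][i]).conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def evolve(A, B, kraus):
    """E_A(B) = sum_{mu,nu} A[mu][nu] * V_mu^* B V_nu (assumed form of E)."""
    dim = len(B)
    out = [[0j] * dim for _ in range(dim)]
    for mu, V_mu in enumerate(kraus):
        for nu, V_nu in enumerate(kraus):
            term = matmul(matmul(dagger(V_mu), B), V_nu)
            out = [[o + A[mu][nu] * t for o, t in zip(row_o, row_t)]
                   for row_o, row_t in zip(out, term)]
    return out

def omega(observables, kraus, rho):
    """omega_n(A_1 x ... x A_n) = tr(rho E_{A_1} ... E_{A_n}(1))."""
    B = identity(len(rho))
    for A in reversed(observables):   # E_{A_n} acts first, E_{A_1} last
        B = evolve(A, B, kraus)
    return trace(matmul(rho, B))

P = 0.75                                     # illustrative mixing weight
ONE = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
KRAUS = [[[math.sqrt(P) * x for x in row] for row in ONE],
         [[math.sqrt(1 - P) * x for x in row] for row in X]]
RHO = [[0.5, 0], [0, 0.5]]                   # invariant under sum V rho V^*

single_site = omega([Z], KRAUS, RHO)
right_extension = omega([Z, ONE], KRAUS, RHO)
left_extension = omega([ONE, Z], KRAUS, RHO)
xx_correlation = omega([X, X], KRAUS, RHO)
```

In this toy model the three one-site expectations agree, which is the
consistency under adding sites on the right and on the left, and
`xx_correlation` differs from the product of the single-site values of X,
so the generated state is correlated.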

The valence bond picture was very much suggested
by trying to describe correlations in a
spatially distributed quantum system (the chain).
The construction given here is perhaps more readily
suggested by a process in time, rather than space. In
fact, the paper by Fannes et al. (1992b) was partly
motivated by an attempt to define a quantum analog
of Markov processes (Accardi and Frigerio 1983).
Indeed, we can think of the construction as a general
form for a repeated measurement in quantum
theory. The object on which the measurements are
performed has observable algebra $\mathcal{B}$, whereas
$\mathcal{A}$ describes the successive outputs. Choosing $\mathcal{A}$ to
be classical (abelian) we would find in $\omega$ the joint
probability distribution of the sequence of measured
values, when the initial state of the object is $\eta$ (not
necessarily invariant). Allowing nonabelian $\mathcal{A}$ would
then correspond to a family of delayed choice
experiments: while $E$ describes the interaction of
the system with the measurement apparatus (including
the overall state change $\hat{E}$), we are still free to
make correlated and even entangled measurements
on the successive output systems. This interpretation
suggests many extensions, in particular, to continuous
time (where the case of abelian outputs is
discussed extensively in the classic book by Davies
1976), or to cases allowing an external quantum
input in each step, in which case we are looking at a
quantum channel with memory $\mathcal{B}$ (Kretschmann and
Werner 2005).

In spite of the different natural interpretations,
however, the constructions in this and previous
paragraphs give exactly the same class of translationally
invariant states on the chain, as was shown
by Fannes et al. (1992b).


Ergodic Decomposition 


A state on $\mathcal{A}_{\mathbb Z}$ is called ergodic if it is an extreme
point of the compact convex subset of translationally
invariant states. Often in statistical mechanics,
one finds states which may be ergodic, but never- 
theless contain a breaking of translation symmetry. 
Such states can be decomposed into periodic states, 
that is, states which are invariant with respect 
to some power of the shift. In general, new 
decompositions may become possible for any 
period. If no decomposition into periodic states is 
possible, the state is called completely ergodic. 

In this section we consider the question of how to 
decompose a finitely correlated state into ergodic 
components, using a well-established connection 
between ergodicity and clustering properties 
(Bratteli and Robinson 1987, 1997), that is, the 
decay of correlation functions. 

Correlation functions are very easily evaluated for
finitely correlated states: let $A_\pm$ be two observables
localized on $n$ sites, and suppose that these sites are
separated by $L$ sites. Then eqn [5] gives


$$\omega(A_-\otimes\mathbf 1^{\otimes L}\otimes A_+) = \eta\bigl(E^{(n)}_{A_-}\,\hat E^{L}\,E^{(n)}_{A_+}(\mathbf 1)\bigr) \qquad [6]$$


where $E^{(n)}_A(B)=E^{(n)}(A\otimes B)$ and $\hat E = E_{\mathbf 1}$. The $L$-dependence of this expression is clearly
governed by the matrix powers of $\hat E$. By assumption
this operator always has the eigenvalue 1, because
$\hat E(\mathbf 1)=\mathbf 1$, and has norm $\le 1$, because it is also
completely positive. The spectrum is hence
contained in the closed unit disk. Each eigenvalue with
modulus $<1$ thus contributes exponentially decaying
terms to the correlation function [6]. From
eigenvalues of modulus 1, which make up the
so-called peripheral spectrum, we may get constant
or periodic contributions. This distinction is directly
reflected in the ergodic properties (Fannes et al.
1992b):


• When the eigenvalue 1 of $\hat E = E_{\mathbf 1}$ is simple, there is a unique
invariant state $\eta$, and $\lim_{N\to\infty} N^{-1}\sum_{n=1}^{N}\hat E^{n}(B) = \eta(B)\mathbf 1$.
This implies, by [6] and (Bratteli and
Robinson 1987, 1997, theorem 4.3.22), that $\omega$ is
ergodic.

• When the eigenvalue 1 is simple, the peripheral
spectrum consists precisely of the $p$th roots of
unity for some $p\ge 1$. The state $\omega$ is then the
equal-weight convex combination of $p$ periodic
states with period $p$, which are translates of each
other.

• In particular (i.e., for $p=1$), a peripheral spectrum
consisting only of the simple eigenvalue 1
implies that $\omega$ is exponentially clustering in the
sense that


$$\bigl|\omega(A_-\otimes\mathbf 1^{\otimes L}\otimes A_+) - \omega(A_-)\,\omega(A_+)\bigr| \le \mathrm{poly}(L)\, r^{L}\, \|A_-\|\,\|A_+\| \qquad [7]$$


where $r$ is the largest modulus of eigenvalues other
than 1, and poly is a polynomial obtained from the
Jordan normal form of $\hat E$. By the previous item, the
state $\omega$ is then completely ergodic.

• Conversely, if a state is finitely correlated, and is
ergodic (resp. completely ergodic), it has a
representation such that 1 is a simple eigenvalue
(resp. the peripheral spectrum is trivial).
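
The exponential clustering bound can be watched at work in a small classical example. The sketch below assumes an abelian toy model (a two-state hidden Markov chain with illustrative weights `T`, not taken from the text) whose memory transition matrix has the eigenvalues 1 and $-1/4$, so that $r = 1/4$ and the connected correlation should shrink by exactly that factor per extra site.

```python
from fractions import Fraction as F

# T[b][a][b2] = probability of output a while the memory moves b -> b2
# (illustrative numbers). Summing over a gives the memory transition matrix
# [[1/2, 1/2], [3/4, 1/4]], with trace 3/4 and determinant -1/4,
# hence eigenvalues 1 and -1/4.
T = [
    [[F(1, 2), F(0)], [F(0), F(1, 2)]],
    [[F(1, 4), F(0)], [F(1, 2), F(1, 4)]],
]
eta = (F(3, 5), F(2, 5))  # invariant distribution of the memory

def step(v, a=None):
    """Propagate weights v through one site: output a observed, or summed out if a is None."""
    outs = range(2) if a is None else (a,)
    return [sum(v[b] * T[b][x][b2] for b in range(2) for x in outs)
            for b2 in range(2)]

def corr(L):
    """P(output 0, then L unobserved sites, then output 0) minus P(output 0)^2."""
    v = step(list(eta), 0)
    for _ in range(L):
        v = step(v)
    joint = sum(step(v, 0))
    single = sum(step(list(eta), 0))
    return joint - single * single

for L in range(5):
    print(L, corr(L))
```

In this two-dimensional example there is only one eigenvalue besides 1, so the decay is a pure geometric sequence; with larger memory or nontrivial Jordan blocks the polynomial prefactor in [7] becomes relevant.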


Purity 
Pure States 


As in the case of the VBS construction, there is a 
version of the evolution operator construction, 
which is especially suited to produce pure states. 
Pure states are those which cannot be decomposed 
into a weighted sum of other states. For a 
translationally invariant state, this is a much
stronger property than ergodicity and even complete
ergodicity: not only is a decomposition into
periodic states impossible, but so is any decomposition
whatsoever. Nevertheless, this is what one expects
from a ground state of a translationally invariant
interaction.

From the formula [5] it is clear that if we
decompose the $E$-operator entering for a site $x$ into
a sum of two completely positive terms, we will have
decomposed $\omega$ into two positive terms. These might
still be equal, but it is certainly suggestive to look at
states generated with an $E$ which cannot be
decomposed nontrivially into a sum of other
completely positive maps. Such maps are called
pure, and are characterized by the form


$$E(A\otimes B) = V^{*}(A\otimes B)V, \qquad V:\mathbb C^{k}\to\mathbb C^{d}\otimes\mathbb C^{k}\ \text{isometric} \qquad [8]$$


where $\mathcal{A}$ and $\mathcal{B}$ are the algebras of $d\times d$ and $k\times k$
matrices, respectively. Finitely correlated states
generated from such a pure evolution operator are
called purely generated. These are the candidates for
pure finitely correlated states.

The form of a pure map is reminiscent of the
Stinespring dilation of a general completely positive
map: for a general $E$, we can set


$$E(A\otimes B) = V^{*}(\mathbf 1_{\tilde{\mathcal A}}\otimes A\otimes B)V \qquad [9]$$


where $\tilde{\mathcal A}$ is some auxiliary matrix algebra. Since the
invariance condition for $\eta$ does not involve the one-site
algebras, we get a purely generated state $\tilde\omega$ with
one-site algebra $\tilde{\mathcal A}\otimes\mathcal A$, whose restriction to the
original chain is $\omega$. Hence, purely generated states
are the prototypes from which all other finitely
correlated states are obtained by sitewise restriction.

But are such states pure? Since $\hat E = E_{\mathbf 1}$ need not have a
trivial peripheral spectrum, the previous section tells us
that a purely generated state may have a nontrivial
decomposition into other, perhaps periodic states. But
this is the only restriction we have to make. Indeed, the
following statements about a finitely correlated state $\omega$
are equivalent (Fannes et al. 1994, theorem 1.5):


• $\omega$ is pure;

• $\omega$ is purely generated, and the operator $\hat E$ has
trivial peripheral spectrum;

• the mean entropy of $\omega$ vanishes, and $\omega$ is
clustering, that is, [7] holds; and

• $E$ has the form [8], and no proper subalgebra of $\mathcal{B}$
containing $\mathbf 1$ is invariant under all operators $E_A$.


The Asymptotic Form of the Local Support 


Let us now fix an isometry $V$, such that $\hat E = E_{\mathbf 1}$ has trivial
peripheral spectrum, and let $\rho$ denote the unique
invariant state of $\hat E$. Then the vectors $\Psi\in\mathcal{H}^{\otimes n}$ in
the support of $\omega$ are of the form [2] and depend,
apart from the fixed choice of the $V=(V^{\mu})$, on the
boundary indices $j_0, j_n = 1,\dots,k$. We can consider
this as a map $\Gamma_n$ from $k\times k$ matrices to $\mathcal{H}^{\otimes n}$:


$$\langle \mu_1,\dots,\mu_n \,|\, \Gamma_n(B)\rangle = \mathrm{tr}\bigl(B\,V^{\mu_1}V^{\mu_2}\cdots V^{\mu_n}\bigr) \qquad [10]$$


and denote the range of $\Gamma_n$ by $\mathcal{G}_n$. Then $\mathcal{G}_n$ is at most
$k^2$-dimensional. Moreover, this family of subspaces
is nested, that is, $\mathcal{G}_{n+m}\subset\mathcal{G}_n\otimes\mathcal{H}^{\otimes m}$ and
$\mathcal{G}_{n+m}\subset\mathcal{H}^{\otimes n}\otimes\mathcal{G}_m$. Using that $\hat E^{n}(B)-\rho(B)\mathbf 1$ converges
to zero exponentially fast, we also find that $\Gamma_n$ is asymptotically
an isometry between $\mathcal{G}_n$ and the Hilbert
space of $k\times k$ matrices with scalar product
$\langle B, C\rangle_\rho = \mathrm{tr}(\rho B^{*}C)$. Hence, all the spaces $\mathcal{G}_n$ are
asymptotically identified, even though they are
contained in each other. This “self-similarity” is
the source of many further properties. For example,
for any density matrix $\tilde\rho$ on $\ell+m+r$ sites supported
by $\mathcal{G}_{\ell+m+r}$, and any observable $A$ localized on $m$
sites in the middle of this interval (with $\ell$ sites to the
left and $r$ to the right), we get the expectation
$\mathrm{tr}(\tilde\rho A)\approx\omega(A)$, up to exponentially small terms
depending only on $\ell$ and $r$.


Ground States and Gaps 


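Suppose we fix some interval length $\ell$, and let $h$ be
the projection onto the complement of $\mathcal{G}_\ell$ in $\mathcal{H}^{\otimes\ell}$.
We now consider $h$ as the interaction term of a
lattice interaction, that is, we consider the formal
Hamiltonian


$$H = \sum_{x}\tau_x(h) \qquad [11]$$

where $\tau_x$ denotes translation by $x$ sites.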


Then in the finitely correlated state $\omega$, each term in
this sum has expectation zero, which is the absolute
minimum for such expectations, because $h\ge 0$. In
this sense $\omega$ is the ground state for this Hamiltonian.
Usually, ground states are not characterized in this
way: one can only require that the average energy is
minimized with respect to all translationally invariant
states (Bratteli and Robinson 1987, 1997,
theorem 6.2.58). Hence, one can usually perturb a
ground state locally such that some terms in [11]
have less than average expectation, at the expense of
others. For $\omega$ this is clearly impossible. Moreover,
any state $\omega'$ with $\omega'(\tau_x(h))=0$ for all $x$ must coincide
with $\omega$, even if we do not impose translation
invariance. This follows from the previous section:
the local density operators of $\omega'$ must all be
supported in $\mathcal{G}_n$ by the nesting property; hence, if
we compare density operators on intervals of length
$\ell+m+r$ on observables localized on the middle $m$
sites, we get $\omega'(A)\approx\omega(A)$, up to errors exponentially
small in $\ell$ and $r$.

The Hamiltonian [11] involves an infinite sum,
which can be mathematically understood as a
quadratic form in the GNS-representation associated
with $\omega$ (Bratteli and Robinson 1987, 1997). This is
the Hilbert space spanned by vectors written as $A\Omega$,
with the scalar products $\langle A\Omega, B\Omega\rangle=\omega(A^{*}B)$, for
local operators $A$, $B$. The ground-state property then
implies $H\Omega=0$, and $H\ge 0$, because $h\ge 0$. It can be
seen that $H$ generates a well-defined dynamics, and
is essentially self-adjoint on the domain of such
vectors. Thus also the spectrum of $H$ is a well-defined
concept. This suggests a strengthening of the
ground-state property: not only is $\Omega$ the unique
eigenvector of $H$ for eigenvalue 0, but there is a gap
$\gamma>0$ between zero and the rest of the spectrum. This
property is of considerable interest for models in
solid-state physics and statistical mechanics. It was
shown for all ergodic pure finitely correlated states
in (Fannes et al. 1992b).


Density 
Density of Finitely Correlated Pure States 


The natural topology in which to consider the
approximation between states on the chain is the
weak topology. A sequence $\omega_n$ converges weakly to
$\omega$ if for all local $A$ the expectations converge, that is,
$\omega_n(A)\to\omega(A)$.

Let us start from an arbitrary translationally
invariant state $\omega$, and see how we can approximate
it. First, we can split the chain into intervals of
length $L$, and replace $\omega$ by the tensor product of the
restrictions of $\omega$ to each of these intervals. This state
is not translationally invariant, so we average it over
the $L$ translations, and call the resulting state $\omega_L$.
Consider a local observable $A$, whose localization
region has length $R$. Then for $L-R$ out of the $L$
translates contributing to $\omega_L$, the expectation will be
the same as for $\omega$, and we get


$$\omega_L(A) = \Bigl(1-\frac{R}{L}\Bigr)\,\omega(A) + \frac{R}{L}\,\tilde\omega(A),$$


where the error term $\tilde\omega$ is again a state. Hence, $\omega_L$
converges weakly to $\omega$ as $L\to\infty$. One can show
easily that $\omega_L$ is finitely correlated, with an algebra
$\mathcal{B}$ essentially equal to $\mathcal{A}^{\otimes L}$. Hence, the finitely
correlated states are weakly dense in the set of
translationally invariant states.
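
The weak convergence can be made quantitative. As a short worked step, assume the decomposition $\omega_L(A) = (1-R/L)\,\omega(A) + (R/L)\,\tilde\omega(A)$ with $\tilde\omega$ a state, and use only that every state satisfies $|\omega(A)|\le\|A\|$:

```latex
\[
  |\omega_L(A)-\omega(A)|
    \;=\; \frac{R}{L}\,\bigl|\tilde\omega(A)-\omega(A)\bigr|
    \;\le\; \frac{2R}{L}\,\|A\|
    \;\longrightarrow\; 0 \qquad (L\to\infty).
\]
```

The rate $1/L$ is that of this particular block construction; nothing here indicates that it is optimal.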

We can make the approximating states purer by a
very simple trick. In the previous construction we
always take two intervals together, and replace the
tensor product of the two restrictions by a purification,
that is, by a pure state on an interval of length
$2L$, whose restrictions to the two length-$L$ subintervals
coincide with those of $\omega$. We average this over $2L$
translates, and call the result $\eta_L$. The estimates
showing that $\eta_L\to\omega$ weakly are exactly the same as
before. Moreover, one can show (Fannes et al.
1992a) that $\eta_L$ is purely generated.

Being defined as a convex combination of other
states, $\eta_L$ is not pure, and the peripheral spectrum of
$\hat E$ will contain all the $2L$th roots of unity. However,
we can use that such a rich peripheral spectrum is
not generic for $\hat E$ constructed from an isometry $V$.
Therefore, if we choose an isometry $V_\varepsilon$ close to the
isometry $V$ generating $\eta_L$, we obtain a purely
generated state $\eta_L^{\varepsilon}$ with trivial peripheral spectrum.
Since the expression for expectations of such states
depends continuously on the generating isometry,
we have that $\eta_L^{\varepsilon}\to\eta_L$ as $\varepsilon\to 0$. But we know from
the previous section that such states are pure.
Hence, the pure finitely correlated states are weakly
dense in the set of all translationally invariant states
(Fannes et al. 1992a).

This has implications for the geometry of the 
compact convex set of translationally invariant 
states, which are rather counter-intuitive for the 
intuitions trained on finite-dimensional convex 
bodies. To begin with, the extreme points (the 
ergodic states) are dense in the whole body. This is 
not such a rare occurrence in infinite-dimensional
convex sets, and is shared, for example, by the set of
operators $F$ with $0\le F\le\mathbf 1$ on an infinite-dimensional
Hilbert space (Davies 1976). Together with
the property that the translationally invariant states 
form a simplex, it actually fixes the structure of this 
compact convex set to be the so-called Poulsen 
simplex. This was known also without looking at 
finitely correlated states. The rather surprising result 
of the above density argument is that even the small 
subclass of states which are extremal, not only in the 
translationally invariant subset but even in the 
whole state space, is still dense. 


Finitely Correlated Pure States with 
Bounded Memory Dimension 


It is clear in the above construction that the dimension 
of the algebra B goes to infinity for an approximating 
sequence. How many states can we get with a fixed 
memory algebra B? The dimension of this manifold 
can be estimated easily from the number of parameters 
needed to describe the map E, and this dimension is 
certainly small compared to the dimension of the state
space of the length $L$ piece of the chain as $L\to\infty$.
However, since this is an infinite set, and not a linear 
subspace, we do not get an immediate bound on the 
dimension of the linear span of these states. What we 
want to show in this section is that the space of finitely 
correlated states with fixed B nevertheless generates a 
low-dimensional subspace of states on any large 
interval of the chain. To this end we will have to 
exhibit many observables A, localized on L sites, 
whose expectation is the same for all finitely correlated
states with given $\mathcal{B}$.

Let us look first at the case of purely generated
states, or rather at the vectors $\Psi\in\mathcal{H}^{\otimes L}$ which can
be written in the form [2], which in the translation
invariant case becomes


$$\Psi(\mu_1,\mu_2,\dots,\mu_L) = \langle\phi_{j_0}\,|\,V^{\mu_1}V^{\mu_2}\cdots V^{\mu_L}\,\phi_{j_L}\rangle \qquad [12]$$


for some collection $V^{1},\dots,V^{d}$ of $k\times k$ matrices, and
some basis labels $j_0, j_L\in\{1,\dots,k\}$. The span of all
such vectors will be denoted by $\mathcal{V}_L(k,d)$, and we would
like to analyze the growth of $\dim\mathcal{V}_L(k,d)$ as $L\to\infty$.
Now a vector with components $a(\mu_1,\dots,\mu_L)$ lies in the
orthogonal complement of $\mathcal{V}_L(k,d)$ if and only if


$$\sum_{\mu_1,\dots,\mu_L}\overline{a(\mu_1,\dots,\mu_L)}\;V^{\mu_1}V^{\mu_2}\cdots V^{\mu_L} = 0$$


for any collection of matrices $V^{\mu}$. In other words, this
expression, considered as a noncommutative polynomial
in $d$ variables, is a polynomial identity for $k\times k$
matrices. The simplest such identity, for $k=2$,
$d=3$, $L=5$, is $[A,[B,C]^2]=0$. (For the proof observe
that $[B,C]$ is traceless, so its square is a multiple of the
identity by the Cayley-Hamilton theorem.) This
identity alone implies the existence of many more
identities. For example, we can substitute higher-order
polynomials for $A$, $B$, $C$, and multiply the identity with
an arbitrary polynomial from the right or from the left.
There is a well-developed theory for such identities,
called the theory of polynomial identity (PI) rings. In
that context, the precise growth we are looking for has
been worked out (Drensky 1998):


$$\lim_{L\to\infty}\frac{\log\dim\mathcal{V}_L(k,d)}{\log L} = (d-1)k^2+1 \qquad [13]$$


Thus, $\dim\mathcal{V}_L(k,d)$ only grows like a polynomial
in $L$, of known degree, and the joint support of all
purely generated finitely correlated states is exponentially
small compared to $\mathcal{H}^{\otimes L}$.
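
The identity $[A,[B,C]^2]=0$ quoted above for $k=2$ is easy to test numerically. The following sketch uses exact integer arithmetic on randomly drawn matrices, so a pass is evidence rather than a proof; the helper names and the $3\times 3$ matrices are illustrative choices, not taken from the text. The second part shows that the same expression does not vanish for $3\times 3$ matrices, so the identity is specific to $k=2$.

```python
import random

def matmul(X, Y):
    """Product of two square integer matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(X, Y):
    """[X, Y] = XY - YX."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    n = len(X)
    return [[XY[i][j] - YX[i][j] for j in range(n)] for i in range(n)]

def identity_lhs(A, B, C):
    """Left-hand side [A, [B, C]^2] of the identity."""
    D = commutator(B, C)
    return commutator(A, matmul(D, D))

def random_matrix(n, rng):
    return [[rng.randint(-9, 9) for _ in range(n)] for _ in range(n)]

rng = random.Random(0)
zero2 = [[0, 0], [0, 0]]
holds_for_2x2 = all(
    identity_lhs(random_matrix(2, rng), random_matrix(2, rng),
                 random_matrix(2, rng)) == zero2
    for _ in range(200)
)
print(holds_for_2x2)  # True

# A 3x3 choice for which the same expression is nonzero.
A3 = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
B3 = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
C3 = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]
print(identity_lhs(A3, B3, C3))  # [[0, -1, 0], [0, 0, 0], [0, 0, 0]]
```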

We can apply the same idea to the set of all finitely
correlated states with $\mathcal{B}$ equal to the $k\times k$ matrices.
The joint support in this case is the full space, since the
trace state on the chain, which is a product state
generated with $k=1$, already has full support. However,
it is still true that all but a polynomial number of
expectation values of $\omega$ are already fixed by specifying
$k$. Indeed, formula [5] for a general state is precisely of
the form [12], with the difference that the arguments $A$
replace $\mu$, and the matrices $E_A$ are now operators on
the $k^2$-dimensional space $\mathcal{B}$. If we only want an upper
bound, we can ignore subtleties coming from Hermiticity
and normalization constraints on $E$, and we get
that the dimension of the span of all finitely correlated states
generated from the $k\times k$ matrices, restricted to a
subchain of length $L$, grows at most like $L^{\alpha}$, with
$\alpha\le(d^2-1)k^4+1$.


See also: Ergodic Theory; Quantum Spin Systems;
Quantum Statistical Mechanics: Overview.


Introduction


Knots belong to sailors and climbers and upon further
reflection, perhaps also to geometers, topologists, or
combinatorialists. Surprisingly, throughout the 1980s,
it became apparent that knots are also closely related
to several other branches of mathematics in general
and mathematical physics in particular. Many of these
connections (though not all!) factor through the
notion of “finite-type invariants” (aka “Vassiliev” or
“Goussarov–Vassiliev” invariants) (Goussarov 1991,
1993, Vassiliev 1990, 1992, Birman-Lin 1993,
Kontsevich 1993, Bar-Natan 1995).

Let $V$ be an arbitrary invariant of oriented knots in
oriented space with values in some abelian group $A$.
Extend $V$ to be an invariant of 1-singular knots, knots
that may have a single singularity that locally looks like
a double point, using the formula


$$V(K_\times) = V(K_+) - V(K_-) \qquad [1]$$


where $K_+$ and $K_-$ are the knots obtained from the singular knot
$K_\times$ by resolving the double point into a positive (over) and a
negative (under) crossing, respectively.
Further extend $V$ to the set $\mathcal{K}^m$ of $m$-singular
knots (knots with $m$ double points) by repeatedly
using [1].


Definition 1 We say that $V$ is of type $m$ if its
extension $V|_{\mathcal{K}^{m+1}}$ to $(m+1)$-singular knots vanishes
identically. We also say that $V$ is of finite type if it is
of type $m$ for some $m$.


Repeated differences are similar to repeated derivatives;
hence, it is fair to think of the definition of $V|_{\mathcal{K}^m}$
as repeated differentiation. With this in mind, the
above definition imitates the definition of polynomials
of degree $m$. Hence, finite-type invariants can be
considered as “polynomials” on the space of knots.
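
The analogy with ordinary polynomials can be made concrete with forward differences on the integers. In the sketch below (the function names and the sample cubic are illustrative), a polynomial of degree 3 has a constant third difference, the counterpart of a top derivative, and a vanishing fourth difference, the counterpart of the type $m$ condition with $m=3$.

```python
def difference(f):
    """Forward difference: (Δf)(x) = f(x + 1) - f(x)."""
    return lambda x: f(x + 1) - f(x)

def iterate(op, f, times):
    """Apply the operator op to f the given number of times."""
    for _ in range(times):
        f = op(f)
    return f

def cubic(x):
    return 2 * x**3 - x + 5

third = iterate(difference, cubic, 3)
fourth = iterate(difference, cubic, 4)

print([third(x) for x in range(-3, 4)])   # constant: 3! times the leading coefficient
print([fourth(x) for x in range(-3, 4)])  # identically zero
```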

As described in the section “Basic facts”, finite-type 
invariants are plenty and powerful and they carry a 
rich algebraic structure and are deeply related to Lie 
algebras. There are several constructions for a 
“universal finite-type invariant” and those are related 
to conformal field theory, the Chern—Simons—Witten 
topological quantum field theory, and Drinfel’d’s 
theory of associators and quasi-Hopf algebras (see
the section “The proofs of the fundamental theorem”).
Finite-type invariants have been studied
extensively (see the section “Some further directions”)
and generalized in several directions (see the section 
“Beyond knots”). But the first question on finite-type 
invariants remains unanswered: 


Problem 2 Honest polynomials are dense in the 
space of functions. Are finite-type invariants dense 
within the space of all knot invariants? Do they 
separate knots? 


In a similar way, one may define finite-type 
invariants of framed knots (and ask the same 
questions). 


Basic Facts 
Classical Knot Polynomials 


The first (nontrivial!) thing to notice is that there are 
plenty of finite-type invariants and they are at least 
as powerful as all the standard knot polynomials 
combined (finite-type invariants are like polynomials 
on the space of knots; the standard phrase “knot 
polynomials” refers to a different thing — knot 
invariants with polynomial values): 


Theorem 3 (Bar-Natan 1995, Birman-Lin 1993).
Let $J(K)(q)$ be the Jones polynomial of a knot $K$ (it is
a Laurent polynomial in a variable $q$). Consider the
power series expansion $J(K)(e^{x})=\sum_{m=0}^{\infty}V_m(K)\,x^{m}$.
Then each coefficient $V_m(K)$ is a finite-type knot
invariant (thus, the Jones polynomial can be
reconstructed from finite-type information).


A similar theorem holds for the Alexander- 
Conway, HOMFLY-PT, and Kauffman polynomials 
(Bar-Natan 1995), and indeed, for arbitrary Reshe- 
tikhin—Turaev invariants (Reshetikhin and Turaev 
1990, Lin 1991), although it is still unknown if the 
signature of a knot can be expressed in terms of its 
finite-type invariants. 
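
The expansion in Theorem 3 is easy to carry out for small knots. The sketch below assumes the commonly tabulated Jones polynomials of the unknot, one trefoil, and the figure-eight knot; these values are supplied here as assumptions and are not quoted in the text, and conventions for the variable and for chirality differ between sources. Since $q^{k}$ becomes $e^{kx}$, the coefficient of $x^{m}$ is $\sum_k c_k k^{m}/m!$.

```python
from fractions import Fraction as F
from math import factorial

def expansion_coefficients(jones, order):
    """Coefficients V_0..V_order of J(e^x), for J = sum_k c_k q^k given as {k: c_k}."""
    return [sum(c * F(k) ** m for k, c in jones.items()) / factorial(m)
            for m in range(order + 1)]

# Assumed standard values (one common convention):
unknot = {0: 1}
trefoil = {1: 1, 3: 1, 4: -1}                         # q + q^3 - q^4
figure_eight = {-2: 1, -1: -1, 0: 1, 1: -1, 2: 1}     # q^-2 - q^-1 + 1 - q + q^2

for name, poly in [("unknot", unknot), ("trefoil", trefoil),
                   ("figure-eight", figure_eight)]:
    print(name, expansion_coefficients(poly, 3))
```

With these inputs the linear coefficient vanishes in all three cases and the quadratic coefficient already distinguishes the three knots; the odd coefficients of the figure-eight knot vanish because its polynomial is symmetric under $q\mapsto q^{-1}$.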


Chord Diagrams and the Fundamental Theorem 


The top derivatives of a multivariable polynomial
form a system of constants which determine
the polynomial up to polynomials of lower
degree. Likewise the $m$th derivative $V^{(m)} := V|_{\mathcal{K}^m}$
of a type $m$ invariant $V$ is a constant: if two $m$-singular knots
differ by a single crossing change, then by [1] the difference of the
values of $V^{(m)}$ on them is the value of $V$ on an $(m+1)$-singular
knot, which is 0, so $V^{(m)}$ is blind to 3D topology. Likewise $V^{(m)}$
determines $V$ up to invariants of lower type. Hence, a
primary tool in the study of finite-type invariants is the
study of the “top derivative” $V^{(m)}$, also known as “the
weight system of $V$.”

Blind to 3D topology, $V^{(m)}$ only sees the combinatorics
of the circle that parametrizes an $m$-singular
knot. On this circle, there are $m$ pairs of points that
are pairwise identified in the image; one indicates
those by drawing a circle with $m$ chords marked (an
“$m$-chord diagram”) (see Figure 1).


Definition 4 Let $\mathcal{D}_m$ denote the space of all formal
linear combinations with rational coefficients of
$m$-chord diagrams. Let $\mathcal{A}^r_m$ be the quotient of $\mathcal{D}_m$
by all 4T and FI relations as drawn in Figure 2 (full
details are given in, e.g., Bar-Natan (1995)), and let
$\hat{\mathcal{A}}^r$ be the graded completion of $\mathcal{A}^r := \bigoplus_m \mathcal{A}^r_m$. Let
$\mathcal{A}_m$, $\mathcal{A}$, and $\hat{\mathcal{A}}$ be the same as $\mathcal{A}^r_m$, $\mathcal{A}^r$, and $\hat{\mathcal{A}}^r$
but without imposing the FI relations.


Figure 1 A 4-singular knot and its corresponding chord diagram. 


Figure 2 The 4T and FI relations.

Theorem 5 (The fundamental theorem)


• (Easy part). If $V$ is a rational valued type $m$
invariant then $V^{(m)}$ defines a linear functional on
$\mathcal{A}^r_m$. If in addition $V^{(m)}=0$, then $V$ is of type $m-1$.

• (Hard part). For any linear functional $W$ on $\mathcal{A}^r_m$,
there is a rational valued type $m$ invariant $V$ so
that $V^{(m)}=W$.


Thus, to a large extent, the study of finite-type
invariants is reduced to the finite (though super-exponential
in $m$) algebraic study of $\mathcal{A}^r_m$. A similar
theorem reduces the study of finite-type invariants
of framed knots to the study of $\mathcal{A}_m$.


The Structure of A 


Knots can be multiplied (the “connected sum” operation)
and knot invariants can be multiplied. This
structure interacts well with finite-type invariants and
induces the following structure on $\mathcal{A}^r$ and $\mathcal{A}$:


Theorem 6 (Kontsevich 1993, Bar-Natan 1995,
Willerton 1996, Chmutov et al. 1994). $\mathcal{A}^r$ and $\mathcal{A}$ are
commutative and cocommutative graded bialgebras
(i.e., each carries a commutative product and a
compatible cocommutative coproduct). Thus, both
$\mathcal{A}^r$ and $\mathcal{A}$ are graded polynomial algebras over their
spaces of primitives, $\mathcal{P}^r=\bigoplus_m\mathcal{P}^r_m$ and $\mathcal{P}=\bigoplus_m\mathcal{P}_m$.


Framed knots differ from knots only by a single
integer parameter (the “self-linking,” itself a type 1
invariant). Thus, $\mathcal{P}^r$ and $\mathcal{P}$ are also closely related.


Theorem 7 (Bar-Natan 1995). $\mathcal{P}=\mathcal{P}^r\oplus\langle\theta\rangle$, where
$\theta$ is the unique 1-chord diagram.


Bounds and Computational Results 


Table 1 shows the number of type $m$ invariants of
knots and framed knots modulo type $m-1$ invariants
($\dim\mathcal{A}^r_m$ and $\dim\mathcal{A}_m$) and the number of
multiplicative generators of the algebra $\mathcal{A}$ in degree
$m$ ($\dim\mathcal{P}_m$) for $m\le 12$. Some further tabulated
results are in Bar-Natan (1996).


Table 1 Some dimensions of spaces of finite type invariants


m            0    1    2    3    4    5
dim A^r_m    1    0    1    1    3    4
dim A_m      1    1    2    3    6   10
dim P_m      0    1    1    1    2    3


Source: Bar-Natan (1995); Kneissler (1997).
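
The three rows of Table 1 are tied together by Theorems 6 and 7, and the relations can be checked mechanically. The sketch below takes all thirteen columns $m = 0,\dots,12$ of the table as input (the list names and the helper function are illustrative) and tests three things: that $\mathcal{A}$ has the graded dimensions of a polynomial algebra on $\dim\mathcal{P}_m$ generators in degree $m$, that $\mathcal{A}^r$ does so after removing the degree 1 generator $\theta$, and that consequently $\dim\mathcal{A}_m$ is the running sum of the $\dim\mathcal{A}^r_j$.

```python
# Rows of Table 1 for m = 0..12.
dim_Ar = [1, 0, 1, 1, 3, 4, 9, 14, 27, 44, 80, 132, 232]
dim_A = [1, 1, 2, 3, 6, 10, 19, 33, 60, 104, 184, 316, 548]
dim_P = [0, 1, 1, 1, 2, 3, 5, 8, 12, 18, 27, 39, 55]

def polynomial_algebra_dims(generators, top):
    """Graded dimensions, up to degree top, of a free commutative algebra
    with generators[m] generators in degree m (coefficients of the product
    of (1 - t^m)^(-generators[m]) over m >= 1)."""
    dims = [1] + [0] * top
    for m in range(1, top + 1):
        for _ in range(generators[m]):
            # Multiply the truncated series by 1 / (1 - t^m).
            for j in range(m, top + 1):
                dims[j] += dims[j - m]
    return dims

top = 12
framed = polynomial_algebra_dims(dim_P, top)
unframed = polynomial_algebra_dims([0, 0] + dim_P[2:], top)  # drop theta in degree 1
running_sums = [sum(dim_Ar[: m + 1]) for m in range(top + 1)]

print(framed == dim_A, unframed == dim_Ar, running_sums == dim_A)
```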


Little is known about these dimensions for large m. 
There is an explicit conjecture in Broadhurst (1997), 
but no progress has been made in the direction of 
proving or disproving it. The best asymptotic bounds 
available are the following. 


Theorem 8 For large $m$, $\dim\mathcal{P}_m > e^{c\sqrt{m}}$ (for any
fixed $c<\pi\sqrt{2/3}$) and $\dim\mathcal{A}_m < 6^{m}\,m!\,\sqrt{m}/\pi^{2m}$
(Stoimenow 1998, Zagier 2001).


Jacobi Diagrams and the Relation 
with Lie Algebras 


Much of the richness of finite-type invariants stems 
from their relationship with Lie algebras. Theorem 9 
below suggests this relationship on an abstract level, 
Theorem 10 makes that relationship concrete, and 
Theorem 12 makes it a bit deeper. 


Theorem 9 (Bar-Natan 1995). The algebra $\mathcal{A}$ is
isomorphic to the algebra $\mathcal{A}^t$ generated by “Jacobi
diagrams in a circle” (chord diagrams that are also
allowed to have oriented internal trivalent vertices)
modulo the AS, STU, and IHX relations (see Figure 3).


Thinking of trivalent vertices as graphical analogs
of the Lie bracket, the AS relation becomes the
anti-commutativity of the bracket, STU becomes
the equation $[x,y]=xy-yx$, and IHX becomes the
Jacobi identity. This analogy is made concrete
within the proof of the following:


Theorem 10 (Bar-Natan 1995). Given a finite-dimensional
metrized Lie algebra $\mathfrak{g}$ (e.g., any semisimple
Lie algebra), there is a map $T_{\mathfrak g}:\mathcal{A}\to U(\mathfrak g)^{\mathfrak g}$
defined on $\mathcal{A}$ and taking values in the invariant part
$U(\mathfrak g)^{\mathfrak g}$ of the universal enveloping algebra $U(\mathfrak g)$ of $\mathfrak g$.
Given also a finite-dimensional representation $R$ of $\mathfrak g$
there is a linear functional $W_{\mathfrak g,R}:\mathcal{A}\to\mathbb{Q}$.

Figure 3 The AS, STU, and IHX relations.


Table 1 (continued)

m            6    7    8    9   10   11   12
dim A^r_m    9   14   27   44   80  132  232
dim A_m     19   33   60  104  184  316  548
dim P_m      5    8   12   18   27   39   55


Figure 4 A free Jacobi diagram. 


The last assertion along with Theorem 5 shows
that associated with any $\mathfrak g$, $R$, and $m$, there is a
weight system and hence a knot invariant. Thus,
knots are unexpectedly linked with Lie algebras.

The hope (Bar-Natan 1995) that all finite-type 
invariants arise in this way was dashed by Vogel 
(1997, 1999) and Lieberum (1999). But finite-type 
invariants that do not arise in this way remain rare 
and not well understood. 

The Poincaré-Birkhoff-Witt (PBW) theorem of
the theory of Lie algebras says that the obvious
“symmetrization” map $\chi_{\mathfrak g}:S(\mathfrak g)\to U(\mathfrak g)$ from the
symmetric algebra $S(\mathfrak g)$ of a Lie algebra $\mathfrak g$ to its
universal enveloping algebra $U(\mathfrak g)$ is a $\mathfrak g$-module
isomorphism. The following definition and theorem
form a diagrammatic counterpart of this theorem:


Definition 11 Let $\mathcal{B}$ be the space of formal linear
combinations of “free Jacobi diagrams” (Jacobi
diagrams as before, but with unmarked univalent
ends (“legs”) replacing the circle; see an example in
Figure 4), modulo the AS and IHX relations of before.
Let $\chi:\mathcal{B}\to\mathcal{A}$ be the symmetrization map which maps
a $k$-legged free Jacobi diagram to the average of the $k!$
ways of planting these legs along a circle.


Theorem 12 (Diagrammatic PBW; Kontsevich
1993, Bar-Natan 1995). $\chi$ is an isomorphism of
vector spaces. Furthermore, fixing a metrized $\mathfrak g$ there
is a commutative square as in Figure 5.


Note that $\mathcal{B}$ can be graded (by half the number of
vertices in a Jacobi diagram) and that $\chi$ respects
degrees so it extends to an isomorphism $\chi:\hat{\mathcal{B}}\to\hat{\mathcal{A}}$
of graded completions.


Figure 5 The diagrammatic PBW isomorphism and its
classical counterpart (a commutative square with $\chi:\mathcal{B}\to\mathcal{A}$ on top,
$\chi_{\mathfrak g}:S(\mathfrak g)\to U(\mathfrak g)$ on the bottom, and vertical maps $T_{\mathfrak g}$).


Proofs of the Fundamental Theorem 


The heart of all known proofs of Theorem 5 is
always a construction of a “universal finite-type
invariant” (see below); it is simple to show that the
existence of a universal finite-type invariant is
equivalent to Theorem 5.


Definition 13 A universal finite-type invariant is a
map $Z:\{\text{knots}\}\to\hat{\mathcal{A}}^r$ whose extension to singular
knots satisfies $Z(K) = D + (\text{higher degrees})$ whenever
a singular knot $K$ and a chord diagram $D$ are
related as discussed before.


The Kontsevich Integral 


The first construction of a universal finite-type
invariant was given by Kontsevich (1993) (see also
Bar-Natan (1995) and Chmutov and Duzhin
(2001)). It is known as “the Kontsevich integral”
and up to a normalization factor it is given by


$$Z_1(K) = \sum_{m=0}^{\infty}\frac{1}{(2\pi i)^{m}}\int_{t_1<\cdots<t_m}\ \sum_{P=\{(z_i,z_i')\}}(-1)^{\#P_{\downarrow}}\,D_P\,\bigwedge_{i=1}^{m}\frac{dz_i-dz_i'}{z_i-z_i'}$$


where the relationship between the knot $K$, the pairing
$P$, the real variables $t_i$, the complex variables $z_i$ and $z_i'$,
and the chord diagram $D_P$ is summarized in Figure 6
(the formula sums over all discrete variables
and integrates over all continuous variables; $\#P_{\downarrow}$ counts the
points of $P$ at which $K$ is oriented downward).

The Kontsevich integral arises from studying the
holonomy of the Knizhnik-Zamolodchikov equation
of conformal field theory (Knizhnik and Zamolodchikov
1984). When evaluating $Z_1$, one encounters
multiple $\zeta$-numbers (Le-Murakami 1995) in a substantial
way, and the proof that the end result is
rational is quite involved (Le-Murakami 1996) and
relies on deep results about associators and quasitriangular
quasi-Hopf algebras (Drinfel'd 1990, 1991).
Employing the same techniques, in Le-Murakami
(1996), it is also shown that the composition $W_{\mathfrak g,R}\circ Z_1$
precisely reproduces the Reshetikhin–Turaev invariants
(Reshetikhin and Turaev 1990).


Figure 6 The key ingredients of the Kontsevich integral.


Perturbative Chern-Simons-Witten Theory 
and Configuration Space Integrals 


Historically, the first approach to the construction
of a universal finite-type invariant was to use
perturbation theory with the Chern-Simons-Witten
topological quantum field theory; this is also how
the relationship with Lie algebras first arose. But
taming the integrals involved turned out to be
difficult and working constructions using this
approach appeared only a bit later.

In short, one writes a perturbative expansion for
the large $k$ asymptotics of the Chern-Simons-Witten
path integral for some metrized Lie algebra $\mathfrak g$ with a
Wilson loop in some representation $R$ of $\mathfrak g$,


$$\int_{\mathfrak g\text{-connections}}\mathcal{D}A\ \mathrm{tr}_R\,\mathrm{hol}_K(A)\,\exp\left[\frac{ik}{4\pi}\int_{\mathbb{R}^3}\mathrm{tr}\left(A\wedge dA+\frac{2}{3}A\wedge A\wedge A\right)\right]$$


The result is of the form

$$\sum_{D:\ \text{Feynman diagram}} W_{\mathfrak g,R}(D)\int\mathcal{E}(D)$$


where $\mathcal{E}(D)$ is a very messy integral expression and
the diagrams $D$ as well as the weights $W_{\mathfrak g,R}(D)$ were
already discussed before. Replacing $W_{\mathfrak g,R}(D)$ by
simply $D$ in the above formula, we get an expression
with values in $\hat{\mathcal{A}}$:


$$Z_2(K) := \sum_{D} D\int\mathcal{E}(D)\ \in\hat{\mathcal{A}}$$


For formal reasons $Z_2(K)$ ought to be a universal
finite-type invariant, and after much work taming
the $\mathcal{E}(D)$ factors and after multiplying by a further
framing-dependent renormalization term $Z^{\mathrm{anomaly}}$,
the result is indeed a universal finite-type invariant.

Upon further inspection, the €(D) factors can be 
reinterpreted as integrals of certain spherical volume 
forms on certain (compactified) configuration spaces 
(Bott-Taubes 1994). These integrals can be further 
interpreted as counting certain “tinker toy construc- 
tions" built on top of K (Thurston 1995). The latter 
viewpoint makes the construction of Z visually 
appealing (Bar-Natan 2000), but there is no satis- 
factory write-up of this perspective yet. 

We note that the precise form of the renormalization
term $Z^{\mathrm{anomaly}}$ remains an open problem. An
appealing conjecture is that $Z^{\mathrm{anomaly}}=\exp(\tfrac{1}{2}\theta)$, with $\theta$ the
1-chord diagram. If
this is true then $Z_2=Z_1$ (Poirier 1999); but the
conjecture is only verified up to degree 6 (Lescop
2001) (there is also an unconfirmed verification to all
orders (Yang 1997)).

The most important open problem about pertur- 
bative Chern-Simons-Witten theory is not directly 
about finite-type invariants, but it is nevertheless 
worthwhile to recall it here: 


Problem 14 Does the perturbative expansion of the 
Chern-Simons-Witten theory converge to (or is it 
asymptotic to) the exact solution due to Witten (1989) and 
Reshetikhin and Turaev (1990) when the parameter 
k converges to infinity? 


Associators and Trivalent Graphs 


There is also an entirely algebraic approach for the 
construction of a universal finite-type invariant Z3. 
The idea is to find some algebraic context within which 
knot theory is finitely presented — that is, presented by 
finitely many generators subject to finitely many 
relations. If the algebraic context at hand is compatible 
with the definitions of finite-type invariants and of 
chord diagrams, one may hope to define Z3 by defining 
it on the generators in such a way that the relations are 
satisfied. Thus, the problem of defining Z3 is reduced 
to finding finitely many elements of A-like spaces 
which solve certain finitely many equations. 

A concrete realization of this idea is in 
Le-Murakami (1996) and Bar-Natan (1997) (following 
ideas from Drinfel'd (1990, 1991) on quasitriangular 
quasi-Hopf algebras). The relevant "algebraic 
context" is a category with certain extra operations, 
and within it, knot theory is generated by just two 
elements, the braiding and the re-association. 
Thus, to define $Z_3$ it is enough to find $R = Z_3(\text{braiding})$ 
and "an associator" $\Phi = Z_3(\text{re-association})$ which satisfy certain 
normalization conditions as well as the pentagon 
and hexagon equations: 

$$\Phi^{123} \cdot (1\Delta 1)(\Phi) \cdot \Phi^{234} = (\Delta 11)(\Phi) \cdot (11\Delta)(\Phi)$$
$$(\Delta 1)(R^{\pm}) = \Phi^{123} (R^{\pm})^{23} (\Phi^{-1})^{132} (R^{\pm})^{13} \Phi^{312}$$

As it turns out, the solution for R is easy and 
nearly canonical. But finding an associator $\Phi$ is rather 
difficult. There is a closed-form integral expression 
$\Phi_{KZ}$ due to Drinfel'd (1990), but one encounters the 
same not-too-well-understood multiple $\zeta$ numbers. 
There is a rather complicated iterative procedure 
for finding an associator (Drinfel'd 1991, Bar-Natan 
1998). On a computer it has been used to find 
an associator up to degree 7. There is also a closed-form 
associator that works only with the Lie superalgebra 
gl(1|1) (Lieberum 2002). But it remains an open 
problem to find a closed-form formula for a rational 
associator (existence by Drinfel'd (1991) and 
Bar-Natan (1998)). 

On the positive side, we should note that the end 
result, the invariant $Z_3$, is independent of the choice 
of $\Phi$ and that $Z_3 = Z_1$. 

There is an alternative (more symmetric and intrinsi- 
cally three dimensional, but less well-documented) 
description of the theory of associators in terms 
of knotted trivalent graphs (Bar-Natan and 
Thurston). There ought to be a perturbative invariant 
associated with knotted trivalent graphs in the spirit of 
the last subsection and such an invariant should lead to 
a simple proof that Z2 = Z3 = Z4. But the $\mathcal{E}(D)$ factors 
remain untamed in this case. 


Step-by-Step Integration 


The last approach for proving the fundamental 
theorem is the most natural and historically the 
first. But here it is last because it is yet to lead to an 
actual proof. A weight system $W: \mathcal{A}_m \to \mathbb{Q}$ is an 
invariant of m-singular knots. We want to show that 
it is the mth derivative of an invariant V of 
nonsingular knots. It is natural to try to integrate 
W step by step, first finding an invariant $V^{m-1}$ of 
(m − 1)-singular knots whose derivative in the sense 
of [1] is W, then an invariant $V^{m-2}$ of (m − 2)-singular 
knots whose derivative is $V^{m-1}$, and so on 
all the way up to an invariant $V^0 = V$ whose mth 
derivative will then be W. If proven, the following 
conjecture would imply that such an inductive 
procedure can be made to work: 


Conjecture 15 (Hutchings 1998). If $V^r$ is a once-integrable 
invariant of r-singular knots, then it is 
also twice integrable. That is, if there is an invariant 
$V^{r-1}$ of (r − 1)-singular knots whose derivative is 
$V^r$, then there is an invariant $V^{r-2}$ of (r − 2)-singular 
knots whose second derivative is $V^r$. 


Hutchings (1998) reduced this conjecture to a 
certain appealing topological statement and further 
to a certain combinatorial-algebraic statement about 
the vanishing of a certain homology group $H^1$ which 
is probably related to Kontsevich's graph homology 
complex (Kontsevich 1994) (Kontsevich's $H^0$ is $\mathcal{A}$, 
so this is all in the spirit of many deformation theory 
problems where $H^0$ enumerates infinitesimal deformations 
and $H^1$ is the obstruction to globalization). 
Hutchings (1998) was also able to prove the 
vanishing of $H^1$ (and hence reprove the fundamental 
theorem) in the simpler case of braids. But no 
further progress has been made along these lines 
since then. 
Some Further Directions 


We would like to touch upon a number of 
significant further directions in the theory of finite- 
type invariants and describe each of those only 
briefly; the reader is referred to the “Further read- 
ing” section for more information. 


The Original “Vassiliev” Perspective 


V A Vassiliev came to the study of finite-type knot 
invariants by studying the infinite-dimensional space 
of all immersions of a circle into $\mathbb{R}^3$ and the topology 
of the "discriminant," the locus of all singular 
immersions within the latter space (Vassiliev 1990, 
1992). Vassiliev studied the topology of the complement 
of the discriminant (the space of embeddings) 
using a certain spectral sequence and found that 
certain terms in it correspond to finite-type invariants. 
This was later related to the Goodwillie calculus 
and back to the configuration spaces discussed in the 
last section. See Volić (2004). 


Interdependent Modifications 


The standard definition of finite-type invariants is 
based on modifying a knot by replacing over (or 
under) crossings with under (or over) crossings. 
Goussarov (1998) generalized this by allowing 
arbitrary modifications done to a knot — just take 
any segment of the knot and move it anywhere else 
in space. The resulting new “finite-type” theory 
turns out to be equivalent to the old one though 
with a factor of 2 applied to the grading (so an 
“old” type m invariant is a “new” type 2m invariant 
and vice versa). (see also Bar-Natan (2001) and 
Conant (2003)). 


n-Equivalence, Commutators, and Claspers 


While little is known about the overall power of finite-type 
invariants, much is known about the power of 
type n invariants for any given n. Goussarov (1993) 
defined the notion of n-equivalence: two knots are 
said to be "n-equivalent" if all their type n invariants 
are the same. This equivalence relation is well understood 
both in terms of commutator subgroups of the 
pure braid group (Stanford 1998, Ng and Stanford 
1999) and in terms of Habiro's calculus of surgery 
over "claspers" (Habiro 2000) (the latter calculus also 
gives a topological explanation for the appearance of 
Jacobi diagrams). In particular, already Goussarov 
(1993) shows that the set of equivalence classes of 
knots modulo n-equivalence is a finitely generated 
abelian group $G_n$ under the operation of connected 
sum, and the rank of that group is equal to the 
dimension of the space of type n invariants. 
Ng (1998) has shown that ribbon knots generate 
an index 2 subgroup of $G_n$. 


Polynomiality and Gauss Sums 


Goussarov (1998) (see also Goussarov-Polyak-Viro 
(2001)) found an intriguing way to compute finite- 
type invariants from a Gauss diagram presentation of 
a knot, showing in particular that finite-type invar- 
iants grow as polynomials in the number of crossings 
n and can be computed in polynomial time in n 
(though actual computer programs are still missing!). 

Gauss diagrams are obtained from knot diagrams 
in much the same way as chord diagrams are 
obtained from singular knots, except all crossings 
are counted and not just the double points, and 
certain over/under and sign information is associated 
with each crossing/chord so that the knot 
diagram can be recovered from its Gauss diagram. 
In the example below (Figure 8), we also dashed a 
subdiagram of the Gauss diagram equivalent to the 
chord diagram shown in Figure 7. 

If G is a Gauss diagram and D is a chord diagram, 
then let $\langle D, G \rangle$ be the number of subdiagrams of 
G equivalent to D, counted with appropriate signs 
(to be precise, we also need to base the diagrams 
involved and count subdiagrams that respect the 
basing). 


Theorem 16 (Goussarov 1998, Goussarov et al. 
2000). If V is a type m invariant, then there are 
finitely many (based) chord diagrams $D_i$ with at 
most m chords and rational numbers $a_i$ so that 
$V(K) = \sum_i a_i \langle D_i, G \rangle$ whenever G is a Gauss diagram 
representing a knot K. 
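The counting in Theorem 16 can be illustrated by a minimal sketch, under stated assumptions. The encoding of a based Gauss diagram as a list of (crossing label, "O" or "U") pairs read from the base point, the function name `count_two_arrow_pattern`, and the choice of the pattern "OUUO" are all assumptions made here for illustration; they are not taken from the article. Formulas of Polyak-Viro type for the type 2 invariant count pairs of interleaved crossings in this manner, but the exact pattern depends on arrow conventions.

```python
from itertools import combinations


def count_two_arrow_pattern(gauss_code, signs, pattern="OUUO"):
    """Signed count of pairs of interleaved crossings whose over/under
    letters, read from the base point, spell `pattern`.

    gauss_code: list of (label, "O" or "U") in the order met along the knot.
    signs: dict mapping each crossing label to +1 or -1.
    """
    positions = {}
    for index, (label, _) in enumerate(gauss_code):
        positions.setdefault(label, []).append(index)

    total = 0
    for a, b in combinations(sorted(positions), 2):
        where = sorted(positions[a] + positions[b])
        labels = [gauss_code[i][0] for i in where]
        # Two chords are interleaved when their endpoints alternate: x, y, x, y.
        if labels[0] == labels[2] and labels[1] == labels[3]:
            letters = "".join(gauss_code[i][1] for i in where)
            if letters == pattern:
                total += signs[a] * signs[b]
    return total


# A standard three-crossing diagram of the trefoil, all crossings of one sign.
trefoil = [(1, "O"), (2, "U"), (3, "O"), (1, "U"), (2, "O"), (3, "U")]
trefoil_signs = {1: 1, 2: 1, 3: 1}
print(count_two_arrow_pattern(trefoil, trefoil_signs))  # 1
```

The double loop over pairs of crossings is what makes such a count polynomial in the number of crossings: a pattern with m chords needs at most a loop over m-element subsets of the crossings.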


Figure 7 A chord diagram. 


Figure 8 A knot and its Gauss diagram. 


Computing the Kontsevich Integral 


While the Kontsevich integral Z1 is a cornerstone of 
the theory of finite-type invariants, it has been 
computed for surprisingly few knots. Even for the 
unknot, the result is nontrivial: 


Theorem 17 ("Wheels," Bar-Natan et al. 2000, 
2003). The framed Kontsevich integral of the unknot, 
$Z_1(\bigcirc)$, expressed in terms of diagrams in $\mathcal{B}$, is given 
by $\Omega = \exp_{\cup} \sum_{n=1}^{\infty} b_{2n}\omega_{2n}$, where the "modified 
Bernoulli numbers" $b_{2n}$ are defined by the power series 
expansion $\sum_{n=0}^{\infty} b_{2n}x^{2n} = \frac{1}{2} \log \frac{\sinh (x/2)}{x/2}$, 
the "2n-wheel" $\omega_{2n}$ is the free Jacobi diagram made 
of a 2n-gon with 2n legs (so, e.g., $\omega_4$ is a square with four legs), and where 
$\exp_{\cup}$ means "exponential in the disjoint union sense." 
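The modified Bernoulli numbers in Theorem 17 can be computed exactly from their defining power series. The following is a small sketch, not part of the original article; the function name `modified_bernoulli` is an assumption. It expands sinh(x/2)/(x/2) as a series in t = x^2 and takes the logarithm with the standard recursion obtained from f' = f (log f)'.

```python
from fractions import Fraction
from math import factorial


def modified_bernoulli(count):
    """Return [b_2, b_4, ..., b_{2*count}] as exact fractions."""
    # sinh(x/2)/(x/2) = sum_k x^(2k) / (4^k (2k+1)!), a series in t = x^2.
    a = [Fraction(1, 4 ** k * factorial(2 * k + 1)) for k in range(count + 1)]
    log_coeffs = [Fraction(0)] * (count + 1)
    for k in range(1, count + 1):
        # From f' = f * (log f)':  k a_k = sum_{j=1..k} j l_j a_{k-j}.
        correction = sum(
            (j * log_coeffs[j] * a[k - j] for j in range(1, k)), Fraction(0)
        )
        log_coeffs[k] = a[k] - correction / k
    # The defining series carries an overall factor 1/2.
    return [c / 2 for c in log_coeffs[1:]]


print(modified_bernoulli(3))
# [Fraction(1, 48), Fraction(-1, 5760), Fraction(1, 362880)]
```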


Closed-form formulas have also been given for the 
Kontsevich integral of framed unknots, the Hopf 
link and Hopf chains. 

Theorem 17 has a companion that utilizes the same 
element $\Omega$, the "wheeling" theorem (Bar-Natan et al. 
2000, 2003). The wheeling theorem "upgrades" the 
vector space isomorphism $\chi: \mathcal{B} \to \mathcal{A}$ to an algebra 
isomorphism and is related to the Duflo isomorphism 
of the theory of Lie algebras. It is amusing to note that 
the wheeling theorem (and hence Duflo's theorem in 
the metrized case) follows using finite-type techniques 
from the "1 + 1 = 2 on an abacus" identity (Figure 9). 


Taming the Kontsevich Integral 


While explicit calculations are rare, there is a nice 
structure theorem for the values of the Kontsevich 
integral, saying that for a knot K and up to any fixed 
number of loops in the Jacobi diagrams, $\chi^{-1} Z_1(K)$ can 
be described by finitely many rational functions (with 
denominators powers of the Alexander polynomial) 
which dictate the placement of the legs. This structure 
theorem was conjectured in Rozansky (2003), proven 
in Kricker (2000), and partially generalized to links in 
Garoufalidis and Kricker (2004). 


The Rozansky-Witten Theory 


One way to construct linear functionals on $\mathcal{A}$ (and 
hence finite-type invariants) is using Lie algebras 
and representations as discussed earlier; much of our 
insight about $\mathcal{A}$ comes this way. But there is another 
construction for such functionals (and hence invariants), 
due to Rozansky and Witten (1997), using 
contractions of curvature tensors on hyper-Kähler 
manifolds. Very little is known about the Rozansky-Witten 
approach; in particular, it is not known if it 
is stronger or weaker than the Lie algebraic 
approach. For an application of the Rozansky-Witten 
theory back to hyper-Kähler geometry 
see Hitchin and Sawon (2001), and for a 
unification of the Rozansky-Witten approach with 
the Lie algebraic approach (albeit at a categorical 
level) see Roberts and Willerton (in preparation). 


Figure 9 A knot theoretic 1 + 1 = 2. 


The Melvin-Morton Conjecture and the 
Volume Conjecture 


The Melvin-Morton conjecture (stated Melvin and 
Morton (1995), proven Bar-Natan and Garoufalidis 
(1996)) says that the Alexander polynomial can be read 
off certain coefficients of the colored Jones polynomial. 
The Kashaev-Murakami-Murakami volume conjec- 
ture (stated Kashaev (1997) and J Murakami and H 
Murakami (2001), unproven) says that a certain 
asymptotic growth rate of the colored Jones polyno- 
mial is the hyperbolic volume of the knot complement. 

Neither conjecture is directly about finite-type 
invariants, but both have ramifications for the theory 
of finite-type invariants. The Melvin-Morton conjecture 
was first proven using finite-type invariants, 
and several later proofs and generalizations (see 
(Bar-Natan)) also involve finite-type invariants. The 
volume conjecture would imply, in particular, that 
the hyperbolic volume of a knot complement can be 
read from that knot's finite-type invariants, and 
hence finite-type invariants would be at least as 
strong as the volume invariant. 

A particularly noteworthy result and direction for 
further research is Gukov's (preprint) recent unifica- 
tion of these two conjectures under the Chern- 
Simons umbrella (along with some relations to 
three-dimensional quantum gravity). 


Beyond Knots 


For lack of space, we have restricted ourselves here to a 
discussion of finite-type invariants of knots. But the 
basic “differentiation” idea of the first section calls for 
generalization, and indeed it has been generalized 
extensively. We will only make a few quick comments. 
Finite-type invariants of homotopy links (links 
where each component is allowed to move across 
itself freely) and of braids are extremely well 
behaved. They separate, they all come from Lie 
algebraic constructions and in the case of braids, 
step-by-step integration as discussed previously works 
(for homotopy links the issue was not studied). 
Finite-type invariants of 3-manifolds and especially 
of integral and rational homology spheres have been 
studied extensively and the picture is nearly a 
complete parallel of the picture for knots. There are 
several competing definitions of finite-type invariants, 
and they all agree up to regrading. There are 
weight systems and they are linear functionals on a 
space $\mathcal{A}(\emptyset)$ which is a close cousin of $\mathcal{A}$ and $\mathcal{B}$ and is 
related to Lie algebras and hyper-Kähler manifolds in 
a similar way. There is a notion of a "universal" 
invariant, and there are several constructions; they all 
agree or are conjectured to agree, and they are related 
to the Chern-Simons-Witten theory. 

Finite-type invariants were studied for several 
other types of topological objects, including knots 
within other manifolds, higher-dimensional knots, 
virtual knots, plane curves and doodles and more 
(see Bar-Natan). 
Introduction 
Physics Background and Motivation 


Suppose G is a semisimple compact Lie group and 
M a closed oriented 3-manifold. Witten (1989) 
defined quantum invariants by the path integral 
over all G-connections A: 

$$Z(M, G; k) = \int \exp\left(2\pi\sqrt{-1}\, k\, CS(A)\right) \mathcal{D}A$$


where k is an integer and CS(A) is the Chern-Simons 
functional, 

$$CS(A) = \frac{1}{8\pi^2} \int_M \mathrm{Tr}\left(A \wedge dA + \frac{2}{3} A \wedge A \wedge A\right)$$


The path integral is not mathematically rigorous. 
According to the stationary-phase approximation in 
quantum field theory, in the limit $k \to \infty$ the path 
integral decomposes as a sum of contributions from 
the flat connections: 

$$Z(M, G; k) \sim \sum_{\text{flat connections } f} Z^{(f)}(M, G; k) \quad \text{as } k \to \infty$$

Each contribution is $\exp(2\pi\sqrt{-1}\, k\, CS(f))$ times a 
power series in 1/k. The contribution from the 
trivial connection is important, especially for 
rational homology 3-spheres, and the coefficients 
of the powers $(1/k)^n$, calculated using (n + 1)-loop 
Feynman diagrams by quantum field theory techniques, 
are known as perturbative invariants. 


Mathematical Theories 


A mathematically rigorous theory of quantum 
invariants Z(M, G; k) was pioneered by Reshetikhin 
and Turaev in 1990 (see Turaev (1994)). 
A number-theoretical expansion of the quantum 
invariants into power series that should correspond 
to the perturbative invariants was given by Ohtsuki 
(in the case of $sl_2$, and for general simple Lie algebras 
by the author) in 1994. This led him to introduce 
finite-type invariant (FTI) theory for 3-manifolds. A 
universal perturbative invariant was constructed by 
Le-Murakami-Ohtsuki (LMO) in 1995; it is universal 
for both finite-type invariants and quantum 
invariants, at least for homology 3-spheres. 
Rozansky in 1996 defined perturbative invariants 
using Gaussian integrals, very close in spirit to 
the original physics point of view. Later Habiro 
(for $sl_2$, and Habiro and the author for all simple 
Lie algebras) found a finer expansion of quantum 
invariants, known as the cyclotomic expansion, but 
no physics origin is known for the cyclotomic 
expansion. The cyclotomic expansion helps to show 
that the LMO invariant dominates all quantum 
invariants for homology 3-spheres. 

The purpose of this article is to give an overview 
of the mathematical theory of finite-type and 
perturbative invariants of 3-manifolds. 


Conventions and Notations 


All vector spaces are assumed to be over the ground 
field $\mathbb{Q}$ of rational numbers, unless otherwise stated. 
For a graded space A, let $\mathrm{Gr}_n A$ be the subspace of 
grading n and $\mathrm{Gr}_{\le n} A$ the subspace of grading $\le n$. 
For $x \in A$, let $\mathrm{Gr}_n x$ and $\mathrm{Gr}_{\le n} x$ be the projections of 
x onto, respectively, $\mathrm{Gr}_n A$ and $\mathrm{Gr}_{\le n} A$. 

All 3-manifolds are supposed to be closed and 
oriented. A 3-manifold M is an integral homology 
3-sphere (ZHS) if $H_1(M, \mathbb{Z}) = 0$; it is a rational 
homology 3-sphere (QHS) if $H_1(M, \mathbb{Q}) = 0$. For a 
framed link L in a 3-manifold M denote by $M_L$ the 
3-manifold obtained from M by surgery along L (see, 
e.g., Turaev (1994)). 


Finite-Type Invariants 


After its introduction by Ohtsuki in 1994, the theory 
of FTIs of 3-manifolds was developed rapidly 
by many authors. Later Goussarov and Habiro 
independently introduced clasper calculus, or 
Y-surgery, which provides a powerful geometric 
technique and deep insight into the theory. Y-surgery, 
corresponding to the commutator in group theory, 
naturally gives rise to 3-valent graphs. 


Generality on FTIs 


Decreasing filtration In a theory of FTIs, one 
considers a class of objects, and a "good" decreasing 
filtration $\mathcal{F}_0 \supset \mathcal{F}_1 \supset \mathcal{F}_2 \supset \cdots$ on the vector space 
$\mathcal{F} = \mathcal{F}_0$ spanned by these objects. An invariant of 
the objects with values in a vector space is of order 
less than or equal to n if its restriction to $\mathcal{F}_{n+1}$ is 0; 
it is of finite type if it is of order $\le n$ for some n. An 
invariant has order n if it is of order $\le n$ but not 
$\le n - 1$. Good here means at least that the space of FTIs 
of each order is finite dimensional. It is desirable to 
have a polynomial-time algorithm to calculate 
every FTI. In addition, one wants the set of FTIs to 
separate the objects (completeness). 

The space of invariants of order $\le n$ can be 
identified with the dual space of $\mathcal{F}_0/\mathcal{F}_{n+1}$; its 
subspace $\mathcal{F}_n/\mathcal{F}_{n+1}$ is isomorphic to the space of 
invariants of order $\le n$ modulo the space of invariants 
of order $\le n - 1$. Informally, one can say that 
$\mathcal{F}_n/\mathcal{F}_{n+1}$ is more or less the set of invariants of 
order n. 


Elementary moves, the knot case Usually the 
filtrations are defined using "independent elementary 
moves." For the class of knots the elementary 
move is given by crossing change. Any two knots 
can be connected by a finite sequence of such moves. 
The idea is that if $K, K' \in \mathcal{F}_n$, the nth term of the 
filtration, then $K - K' \in \mathcal{F}_{n+1}$, where K' is obtained 
from K by an elementary move. The formal definition is 
as follows. Suppose S is a set of double points of a 
knot diagram D. Let 

$$[D, S] = \sum_{S' \subset S} (-1)^{\#S'} D_{S'}$$

where the sum is over all subsets S' of S, including 
the empty set, $D_{S'}$ is the knot obtained by changing 
the crossing at every point in S', and #S' is the 
number of elements of S'. Then $\mathcal{F}_n$ is the vector 
space spanned by all elements of the form [D, S] 
with #S = n. For the knot case, the Kontsevich 
integral is an invariant that is universal for all FTIs 
(see Bar-Natan (1995)). 
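The alternating sum [D, S] can be enumerated mechanically. The sketch below only lists the signed terms; the function name `signed_resolutions` and the representation of double points as arbitrary labels are assumptions made for illustration, and no knot diagrams are actually manipulated.

```python
from itertools import combinations


def signed_resolutions(double_points):
    """List the terms (sign, S') of [D, S] = sum over S' of (-1)^{#S'} D_{S'}."""
    points = list(double_points)
    terms = []
    for size in range(len(points) + 1):
        for subset in combinations(points, size):
            # The sign depends only on how many crossings are changed.
            terms.append(((-1) ** size, frozenset(subset)))
    return terms


terms = signed_resolutions(["p", "q"])
for sign, subset in terms:
    print(sign, sorted(subset))
```

For a set S with n elements there are 2^n terms, and for n > 0 the signs add up to zero, which is why an invariant that is constant on all the resolutions vanishes on [D, S].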


Ohtsuki's Definition of FTIs for ZHS 


An elementary move here is a surgery along a 
knot: $M \to M_K$, where K is a framed knot in a 
ZHS M. A collection of moves corresponds to 
surgery on a framed link. To always remain in the 
class of ZHS we need to restrict ourselves to unit-framed 
and algebraically split links, that is, framed 
links in a ZHS each component of which has 
framing $\pm 1$ and the linking number of every two 
components is 0. It is easy to prove that a link L 
in a ZHS M is unit-framed and algebraically split 
if and only if $M_{L'}$ is a ZHS for every sublink L' of 
L. For a unit-framed, algebraically split link L in a 
ZHS M define 

$$[M, L] = \sum_{L' \subset L} (-1)^{\#L'} M_{L'}$$

which is an element in the vector space $\mathcal{M}$ freely 
spanned by ZHS. 

For a non-negative integer n let $\mathcal{F}^{As}_n$ be the 
subspace of $\mathcal{M}$ spanned by [M, L] with #L = n. 
Then the descending filtration $\mathcal{M} = \mathcal{F}^{As}_0 \supset \mathcal{F}^{As}_1 \supset 
\mathcal{F}^{As}_2 \supset \cdots$ defines a theory of FTIs on the class of 
ZHS. 
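The characterization of unit-framed, algebraically split links can be checked on linking matrices. The sketch below restricts to links in the 3-sphere and relies on the standard fact that surgery on a framed link in the 3-sphere gives an integral homology sphere exactly when the linking matrix (framings on the diagonal) has determinant ±1; this restriction, and the function names, are assumptions made here for illustration.

```python
from itertools import combinations


def det(m):
    """Integer determinant by cofactor expansion (fine for small matrices)."""
    if not m:
        return 1
    if len(m) == 1:
        return m[0][0]
    total = 0
    for col, entry in enumerate(m[0]):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * entry * det(minor)
    return total


def is_unit_framed_algebraically_split(m):
    """Framings are +1 or -1 and all pairwise linking numbers vanish."""
    n = len(m)
    return all(
        (m[i][j] in (1, -1)) if i == j else (m[i][j] == 0)
        for i in range(n)
        for j in range(n)
    )


def every_sublink_gives_zhs(m):
    """Every nonempty sublink has linking matrix of determinant +1 or -1."""
    n = len(m)
    for size in range(1, n + 1):
        for idx in combinations(range(n), size):
            sub = [[m[i][j] for j in idx] for i in idx]
            if abs(det(sub)) != 1:
                return False
    return True


examples = [
    [[1, 0], [0, -1]],   # unit-framed, algebraically split
    [[1, 2], [2, 3]],    # determinant -1, but one component has framing 3
    [[1, 1], [1, 1]],    # unit-framed, linking number 1
]
for matrix in examples:
    print(is_unit_framed_algebraically_split(matrix), every_sublink_gives_zhs(matrix))
```

The second example shows why the condition is imposed on every sublink: the full matrix has determinant −1, yet surgery on one component alone does not give a ZHS.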


Theorem 1 
(i) (Ohtsuki) The dimension of $\mathcal{F}_n(\mathcal{M})$ is finite for 
every n. 
(ii) (Garoufalidis-Ohtsuki) One has $\mathcal{F}_{3n+1}(\mathcal{M}) = 
\mathcal{F}_{3n+2}(\mathcal{M}) = \mathcal{F}_{3n+3}(\mathcal{M})$. 


The orders of FTIs in this theory are multiples of 3. 
The first nontrivial invariant, which is the only (up 
to scalar) invariant of degree 3, is the Casson 
invariant. 


The Goussarov-Habiro Definition 


Y-surgery or clasper surgery Consider the standard 
Y-graph Y and a small neighborhood N(Y) of it in 
the standard $\mathbb{R}^3$ (see Figure 1). Denote by L(Y) the 
six-component framed link diagram in $N(Y) \subset \mathbb{R}^3$, 
each component of which has framing 0 in $\mathbb{R}^3$ 
(see Figure 1). 

A framed Y-graph C in a 3-manifold M is the 
image of an embedding of N(Y) into M. The surgery 
of M along the image of the six-component link 
L(Y) is called a Y-surgery along C, denoted by Mc. 
If one of the leaves bounds a disk in M whose 
interior is disjoint from the graph, then Mc is 
homeomorphic to M. 

Matveev in 1987 proved that two 3-manifolds M 
and M' are related by a finite sequence of 
Y-surgeries if and only if there is an isomorphism 
from $H_1(M, \mathbb{Z})$ onto $H_1(M', \mathbb{Z})$ preserving the 
linking form on the torsion group. It is natural to 
partition the class of 3-manifolds into subclasses of 
the same $H_1$ and the same linking form. 


Figure 1 Y-graph: the Y-graph, its neighborhood N(Y), and the surgery link L(Y). 


Goussarov-Habiro filtrations For a 3-manifold M 
denote by $\mathcal{M}(M)$ the vector space spanned by all 
3-manifolds with $H_1$ and linking form the same as 
those of M. Define, for a set S of Y-graphs in M, 
$[M, S] = \sum_{S' \subset S} (-1)^{\#S'} M_{S'}$, and $\mathcal{F}^Y_n \mathcal{M}(M)$ the vector 
space spanned by all [N, S] such that N is in $\mathcal{M}(M)$ 
and #S = n. The following theorem of Goussarov 
and Habiro (Goussarov 1999, Garoufalidis et al. 
2001, Habiro 2000) shows that the FTI theory 
based on Y-surgery is the same as the one of Ohtsuki 
in the case of ZHS. 


Theorem 2 For the case $\mathcal{M} = \mathcal{M}(S^3)$, one has 
$\mathcal{F}^Y_{2n-1} = \mathcal{F}^Y_{2n} = \mathcal{F}^{As}_{3n}$. 


The Fundamental Theorem of FTIs of ZHS 


Jacobi diagrams A closed Jacobi diagram is 
a vertex-oriented trivalent graph, that is, a graph 
for which the degree of each vertex is equal to 3 and 
a cyclic order of the three half-edges at every vertex 
is fixed. Here, multiple edges and self-loops are 
allowed. In pictures, the orientation at a vertex is 
the clockwise orientation, unless otherwise stated. 
The "degree" of a Jacobi diagram is half the number 
of its vertices. 

Let $\mathrm{Gr}_n \mathcal{A}(\emptyset)$, $n \ge 0$, be the vector space spanned 
by all closed Jacobi diagrams of degree n, modulo 
the antisymmetry (AS) and Jacobi (IHX) relations 
(see Figure 2). 


The universal weight map W Suppose D is a closed 
Jacobi diagram of degree n. Embedding D into $\mathbb{R}^3 \subset S^3$ 
arbitrarily and then projecting down onto $\mathbb{R}^2$ in 
general position, one can describe D by a diagram, 
with over/under-crossing information at every double 
point just as in the case of a link diagram. We 
can assume that the orientation at every vertex of D 
is given by a clockwise cyclic order. From the image 
of D, construct a set G of 2n Y-graphs as in 
Figure 3. Here only the cores of the Y-graphs are 
drawn, with the convention that each framed 
Y-graph is a small neighborhood of its core in $\mathbb{R}^2$. 


Figure 2 The AS and IHX relations. 


Figure 3 The weight map. 

If G' is a proper subset of G, then in G' there is a 
Y-graph, one of the leaves of which bounds a disk, 
hence $S^3_{G'} = S^3$. Thus, $W(D) := [S^3, G] = S^3_G - S^3$. By 
definition, $W(D) \in \mathcal{F}^Y_{2n}$; it might depend on the 
embedding of D into $\mathbb{R}^3$, but one can show that 
W(D) is well defined in $\mathcal{F}^Y_{2n}/\mathcal{F}^Y_{2n+1}$. The map W was 
first constructed by Garoufalidis and Ohtsuki in the 
framework of $\mathcal{F}^{As}$. 
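The collapse of the alternating sum to two terms can be spelled out as follows. This is a worked step added for clarity; it uses only the definition of [M, S] for sets of Y-graphs, the fact that G has 2n elements, and the observation that surgery on any proper subset of G gives back the 3-sphere.

```latex
\begin{aligned}
[S^3, G] &= \sum_{G' \subseteq G} (-1)^{\#G'} S^3_{G'}
          = (-1)^{2n} S^3_G + \Bigl(\sum_{G' \subsetneq G} (-1)^{\#G'}\Bigr) S^3 \\
         &= S^3_G - S^3,
\qquad\text{since}\quad
\sum_{G' \subsetneq G} (-1)^{\#G'}
   = \underbrace{\sum_{G' \subseteq G} (-1)^{\#G'}}_{=\,0} - (-1)^{2n} = -1 .
\end{aligned}
```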


Fundamental theorem 


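Theorem 3 (Lê et al. 1998, Le 1997). The map W 
descends to a well-defined linear map 
$W: \mathrm{Gr}_n \mathcal{A}(\emptyset) \to \mathcal{F}^Y_{2n}/\mathcal{F}^Y_{2n+1}$ and, moreover, is an 
isomorphism between the vector spaces $\mathrm{Gr}_n \mathcal{A}(\emptyset)$ 
and $\mathcal{F}^Y_{2n}/\mathcal{F}^Y_{2n+1}$ for $\mathcal{M} = \mathcal{M}(S^3)$. 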


The theorem essentially says that the set of 
invariants of degree 2n is dual to the space of closed 
Jacobi diagrams $\mathrm{Gr}_n \mathcal{A}(\emptyset)$. The proof is based on the 
LMO invariant (see the next section). 

A $\mathbb{Q}$-valued invariant I of order $\le 2n$ restricts to a 
linear map from $\mathcal{F}^Y_{2n}/\mathcal{F}^Y_{2n+1}$ to $\mathbb{Q}$. The composition 
of I and W is a functional on $\mathrm{Gr}_n \mathcal{A}(\emptyset)$ called the 
"weight system" of I. The theorem shows that every 
linear functional on $\mathrm{Gr}_n \mathcal{A}(\emptyset)$ is the weight system of an 
invariant of order $\le 2n$. 


Relation to knot invariants Under the map that 
sends an (unframed) knot $K \subset S^3$ to the ZHS 
obtained by surgery along K with framing 1, an 
invariant of degree $\le 2n$ (in the $\mathcal{F}^Y$ theory) of ZHS 
pulls back to an invariant of order $\le 2n$ of knots. 
This was conjectured by Garoufalidis and proved by 
Habegger. 


Other classes of rational homology 3-spheres 
Actually, the theorem was first proved in the framework 
of $\mathcal{F}^{As}$. Clasper surgery theory allows Habiro 
(2000) to generalize the fundamental theorem to QHS: 
for M a QHS, the universal weight map 
$W: \mathrm{Gr}_n \mathcal{A}(\emptyset) \to \mathcal{F}^Y_{2n}\mathcal{M}(M)/\mathcal{F}^Y_{2n+1}\mathcal{M}(M)$, defined similarly as 
in the case of ZHS, is an isomorphism, and 
$\mathcal{F}^Y_{2n-1}\mathcal{M}(M) = \mathcal{F}^Y_{2n}\mathcal{M}(M)$. 


Other filtrations and approaches Other equivalent 
filtrations were introduced (and compared) by 
Garoufalidis, Garoufalidis and Levine (1997), and 
Garoufalidis-Goussarov-Polyak (2001). Of impor- 
tance is the one using subgroups of mapping class 
groups in Garoufalidis and Levine (1997). A theory 
of n-equivalence was constructed by Goussarov and 
Habiro that encompasses many geometric aspects of 
FTIs of 3-manifolds (Habiro 2000, Goussarov 
1999). Cochran and Melvin (2000) extended the 
original Ohtsuki definition to manifolds with 
homology, using algebraically split links, but the 
filtrations are different from those of Goussarov- 
Habiro. 


The Le-Murakami-Ohtsuki Invariant 
Jacobi Diagrams 


An open Jacobi diagram is a vertex-oriented uni-trivalent 
graph, that is, a graph with univalent and 
trivalent vertices together with a cyclic ordering of 
the edges incident to the trivalent vertices. A 
univalent vertex is also called "a leg." The degree 
of an open Jacobi diagram is half the number of 
vertices (trivalent and univalent). A Jacobi diagram 
based on X, a compact oriented 1-manifold, is a 
graph D together with a decomposition $D = X \cup \Gamma$, 
such that D is the result of gluing all the legs of an 
open Jacobi diagram $\Gamma$ to distinct interior points of 
X. The degree of D, by definition, is the degree of $\Gamma$. 
In Figure 4 X is depicted by bold lines. Let $\mathcal{A}'(X)$ be 
the space of Jacobi diagrams based on X modulo the 
usual antisymmetry, Jacobi and the new STU 
relations. The completion of $\mathcal{A}'(X)$ with respect to 
degree is denoted by $\mathcal{A}(X)$. 

When X is a set of m ordered oriented intervals, 
denote $\mathcal{A}(X)$ by $\mathcal{P}_m$, which has a natural algebra 
structure where the product DD' of two Jacobi 
diagrams is defined by stacking D on top of D' 
(concatenating the corresponding oriented intervals). 
When X is a set of m ordered oriented circles, 
denote $\mathcal{A}(X)$ by $\mathcal{A}_m$. By identifying the two 
endpoints of each interval, one gets a map $\mathrm{pr}: \mathcal{P}_m \to 
\mathcal{A}_m$, which is an isomorphism if m = 1 (see 
Bar-Natan (1995)). 


Figure 4 The STU relation. 


For $x \in \mathcal{A}_m$ and $y \in \mathcal{A}_1$, the connected sum is 
defined by $x \#_m y := \mathrm{pr}\left((\mathrm{pr}^{-1}x)(\mathrm{pr}^{-1}y)^{\otimes m}\right)$, where 
$(\mathrm{pr}^{-1}y)^{\otimes m}$ is the element in $\mathcal{P}_m$ with $\mathrm{pr}^{-1}y$ on each 
oriented interval. 


Symmetrization maps Let $\mathcal{B}_m$ be the vector space 
spanned by open Jacobi diagrams whose legs are 
labeled by elements of {1, 2, ..., m}, modulo the 
antisymmetry and Jacobi relations. One can define 
an analog of the Poincaré-Birkhoff-Witt isomorphism 
$\chi: \mathcal{B}_m \to \mathcal{P}_m$ as follows. For a diagram D, $\chi(D)$ 
is obtained by taking the average over all possible 
ways of ordering the legs labeled by j and attaching 
them to the jth oriented interval. It is known that $\chi$ 
is a vector space isomorphism (Bar-Natan (1995)). 


The Framed Kontsevich Integral of Links 


For an m-component framed link $L \subset \mathbb{R}^3$, the 
(framed version of the) Kontsevich integral Z(L) is 
an invariant taking values in $\mathcal{A}_m$ (see, e.g., Ohtsuki 
(2002)). Let $\nu := Z(K)$, when K is the unknot with 
framing 0, and $\check{Z}(L) := Z(L) \#_m \nu$. An explicit formula 
for $\nu$ is given in Bar-Natan et al. (2003). 


Removing Solid Loops: The Maps $\iota_n$ 


Suppose $x \in \mathcal{B}_m$ is an open Jacobi diagram with legs 
labeled by {1, ..., m}. If the number of legs of 
any label is different from 2n, or if the degree of 
x is greater than (m + 1)n, we set $\iota_n(x) = 0$. Otherwise, partitioning 
the 2n legs of each label into n pairs and 
identifying points in each pair, from x we get a 
trivalent graph which may contain some isolated loops 
(no vertices) and which depends on the partition. 
Replacing each isolated loop by a factor −2n, 
and summing up over all partitions, we get 
$\iota_n(x) \in \mathrm{Gr}_{\le n} \mathcal{A}(\emptyset)$. 

For $x \in \mathcal{A}_m$, choose $y \in \mathcal{P}_m$ such that pr(y) = x. 
Using the isomorphism $\chi$ we pull back to $\chi^{-1}y \in \mathcal{B}_m$. 
Define $\iota_n(x) := \iota_n(\chi^{-1}y)$. One can prove that $\iota_n(x)$ 
does not depend on the choice of the preimage y of x. 
Note that $\iota_n$ lowers the degree by nm. 
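The sum over partitions in the definition of the maps can be made concrete by enumerating all ways of splitting 2n legs into n pairs. The sketch below does only this combinatorial step; the generator name `pairings` is an assumption, and the gluing of legs and the loop factors are not modeled.

```python
def pairings(points):
    """Yield every partition of an even-sized list into unordered pairs."""
    points = list(points)
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for i, partner in enumerate(rest):
        # Pair the first point with each possible partner, then recurse.
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


for n in (1, 2, 3):
    print(n, len(list(pairings(range(2 * n)))))
```

The number of pairings of 2n legs is (2n − 1)!! = 1, 3, 15, ..., so for m labels the sum has ((2n − 1)!!)^m terms, each weighted by a power of −2n determined by its isolated loops.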


Definition of the Le-Murakami-Ohtsuki 
Invariant ZLMO 


In $\mathcal{A}(\emptyset) := \prod_{n=0}^{\infty} \mathrm{Gr}_n \mathcal{A}(\emptyset)$ let the product of two 
Jacobi diagrams be their disjoint union. In addition, 
define the coproduct $\Delta(D) = 1 \otimes D + D \otimes 1$ for D a 
connected Jacobi diagram. Then $\mathcal{A}(\emptyset)$ is a commutative 
cocommutative graded Hopf algebra. 

For the unknot $U_\pm$ with framing $\pm 1$, one has 
$\iota_n(\check{Z}(U_\pm)) = (\mp 1)^n + (\text{terms of degree} \ge 1)$; hence, 
their inverses exist. Suppose the linking matrix of an 
oriented framed link $L \subset \mathbb{R}^3$ has $\sigma_+$ positive eigenvalues 
and $\sigma_-$ negative eigenvalues. Define 

$$\Omega_n(L) = \frac{\iota_n(\check{Z}(L))}{(\iota_n(\check{Z}(U_+)))^{\sigma_+}\, (\iota_n(\check{Z}(U_-)))^{\sigma_-}} \in \mathrm{Gr}_{\le n}(\mathcal{A}(\emptyset)) \qquad [1]$$


Theorem 4 (Lê et al. 1998). $\Omega_n(L)$ is an invariant 
of the 3-manifold $M = S^3_L$. 

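We can combine all the $\Omega_n$ to get a better 
invariant: 

$$Z^{LMO}(M) := 1 + \mathrm{Gr}_1(\Omega_1(M)) + \cdots + \mathrm{Gr}_n(\Omega_n(M)) + \cdots \in \mathcal{A}(\emptyset)$$

For M a QHS, we also define 

$$\hat{Z}^{LMO}(M) := 1 + \frac{\mathrm{Gr}_1(\Omega_1(M))}{d(M)} + \cdots + \frac{\mathrm{Gr}_n(\Omega_n(M))}{d(M)^n} + \cdots$$

where d(M) is the cardinality of $H_1(M, \mathbb{Z})$. 


Proposition 1 (Lê et al. 1998). Both $Z^{LMO}(M)$ and 
$\hat{Z}^{LMO}(M)$ (when defined) are group-like elements, 
that is, 

$$\Delta(Z^{LMO}(M)) = Z^{LMO}(M) \otimes Z^{LMO}(M)$$
$$\Delta(\hat{Z}^{LMO}(M)) = \hat{Z}^{LMO}(M) \otimes \hat{Z}^{LMO}(M)$$

Moreover, $\hat{Z}^{LMO}(M_1 \# M_2) = \hat{Z}^{LMO}(M_1) \times \hat{Z}^{LMO}(M_2)$. 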


Universality Properties of the LMO Invariant 
Let us restrict ourselves to the case of ZHS. 


Theorem 5 (Lê 1997). The less than or equal to n 
degree part $\mathrm{Gr}_{\le n} Z^{LMO}$ is an invariant of degree 2n. 
Any invariant of degree $\le 2n$ is a composition 
$w(\mathrm{Gr}_{\le n} Z^{LMO})$, where $w: \mathrm{Gr}_{\le n}\mathcal{A}(\emptyset) \to \mathbb{Q}$ is a 
linear map. 


Clasper calculus (or Y-surgery) theory allows 
Habiro to extend the theorem to rational homology 
3-spheres. 


The Århus Integral 


The Århus integral (ca. 1998) of Bar-Natan, 
Garoufalidis, Rozansky and Thurston, based on a 
theory of formal integration, calculates the LMO 
invariant of rational homology 3-spheres. The 
formal integration theory has a conceptual flavor 
and helps to relate the LMO invariant to perturbative 
expansions of quantum invariants. We give here 
the definition for the case when one does surgery on 
a knot K with nonzero framing b. The link case is 
similar (see Bar-Natan et al. (2002a, b)). 

When K is a knot, Z(K) is an element of A; 
Pı = Bı. Note that $B_1$ is an algebra where the
product is the disjoint union $\sqcup$. Since the framing is
$b$, one has

$$Z(K) = \exp_\sqcup(b\,w_1/2) \sqcup Y$$

where $w_1$ is the "dashed interval" (the only
connected open Jacobi diagram without trivalent
vertex), and $Y$ is an element in $B_1$ every term of
which must have at least one trivalent vertex. For
uni-trivalent graphs $C, D \in B_1$ let

$$\langle C, D\rangle = \begin{cases} 0 & \text{if the numbers of legs of } C, D \text{ are different} \\ \text{sum of all ways to glue legs of } C \text{ and } D \text{ together} & \text{otherwise} \end{cases}$$

One defines $\int^{FG} Z(K) := \langle \exp_\sqcup(-w_1/(2b)),\, Y\rangle$. Then


FG x i E
[2o = 3; etm
n=0


Hence,

$$Z^{LMO}(S^3_K) = \frac{\int^{FG} Z(K)}{\int^{FG} Z(U_{\mathrm{sign}(b)})}$$


Other Approaches 


Another construction of a universal perturbative 
invariant based on integrations over configuration 
spaces, closer to the original physics approach but 
harder to calculate because of the lack of a surgery 
formula, was developed by Axelrod and Singer, 
Kontsevich, Bott and Cattaneo, Kuperberg and 
Thurston (see Axelrod and Singer (1992), Bott and
Cattaneo (1998)).


Quantum Invariants and Perturbative 
Expansion 


Fix a simple (complex) Lie algebra $\mathfrak{g}$ of finite
dimension. Using the quantized enveloping algebra
of $\mathfrak{g}$ one can define quantum link and 3-manifold
invariants. We recall here the definition, adapted for
the case of the root lattice (projective group case).

Here our $q$ is equal to $q^2$ in the textbook (Jantzen
1995). Fix a root system of $\mathfrak{g}$. Let $X$, $X_+$, $Y$ denote
respectively the weight lattice, the set of dominant
weights, and the root lattice. We normalize the
invariant scalar product in the real vector space of
the weight lattice so that the length of any short root
is $\sqrt{2}$.


Quantum Link Invariants 


Suppose $L$ is a framed oriented link with $m$ ordered
components, then the quantum invariant
$J_L(\lambda_1, \ldots, \lambda_m)$ is a Laurent polynomial in $q^{1/2D}$,
where $\lambda_1, \ldots, \lambda_m$ are dominant weights, standing
for the simple $\mathfrak{g}$-modules of highest weights
$\lambda_1, \ldots, \lambda_m$, and $D$ is the determinant of the Cartan
matrix of $\mathfrak{g}$ (see, e.g., Turaev (1994) and Lê (1996)).
The Jones polynomial is the case when $\mathfrak{g} = sl_2$ and
all the $\lambda_j$'s are the highest weights of the fundamental
representation. For the unknot $U$ with zero
framing, one has (here $\rho$ is the half-sum of all
positive roots)

$$J_U(\lambda) = \prod_{\text{positive roots } \alpha} \frac{q^{(\lambda+\rho|\alpha)/2} - q^{-(\lambda+\rho|\alpha)/2}}{q^{(\rho|\alpha)/2} - q^{-(\rho|\alpha)/2}}$$

We will also use another normalization of the
quantum invariant:

$$Q_L(\lambda_1, \ldots, \lambda_m) := J_L(\lambda_1, \ldots, \lambda_m) \times \prod_{j=1}^{m} J_U(\lambda_j)$$


This definition is good only for $\lambda_j \in X_+$. Note
that each $\lambda \in X$ is either fixed by a nontrivial element of the
Weyl group under the dot action (see Humphreys
(1978)) or can be moved to $X_+$ by the dot action.
We define $Q_L(\lambda_1, \ldots, \lambda_m)$ for arbitrary $\lambda_j \in X$ by
requiring that $Q_L(\lambda_1, \ldots, \lambda_m) = 0$ if one of the $\lambda_j$'s is
fixed by a nontrivial element of the Weyl group, and that
$Q_L(\lambda_1, \ldots, \lambda_m)$ is component-wise invariant under
the dot action of the Weyl group, that is, for every
$w_1, \ldots, w_m$ in the Weyl group,

$$Q_L(w_1 \cdot \lambda_1, \ldots, w_m \cdot \lambda_m) = Q_L(\lambda_1, \ldots, \lambda_m)$$
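For $\mathfrak{g} = sl_2$ the extension by the dot action can be made concrete. The sketch below is illustrative only and rests on assumptions that are not part of the text: the weight lattice is identified with the integers via the fundamental weight, so that $\rho = 1$ and the nontrivial Weyl group element acts by $s \cdot \lambda = s(\lambda + \rho) - \rho = -\lambda - 2$; the names `dot_reflect`, `extend_by_dot_action` and the sample function `dim` are hypothetical.

```python
RHO = 1  # half-sum of positive roots of sl_2, in units of the fundamental weight


def dot_reflect(lam):
    """Dot action of the nontrivial Weyl group element: s . lam = s(lam + rho) - rho."""
    return -(lam + RHO) - RHO


def extend_by_dot_action(q_dominant):
    """Extend a function on dominant weights (lam >= 0) to all integer weights.

    The extension is 0 on the weight fixed by the dot action and is
    invariant under the dot action elsewhere.
    """
    def q_all(lam):
        if dot_reflect(lam) == lam:   # only lam = -1 is fixed
            return 0
        if lam < 0:                   # move the weight into the dominant chamber
            lam = dot_reflect(lam)
        return q_dominant(lam)
    return q_all


# Hypothetical sample function on dominant weights (dimension of the module).
dim = lambda lam: lam + 1
q = extend_by_dot_action(dim)
print([q(lam) for lam in range(-4, 3)])
```

For a link with several components the same rule would be applied to each argument separately.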
Proposition 2 (Lê 1996). Suppose $\lambda_1, \ldots, \lambda_m$ are in
the root lattice $Y$.


(i) (Integrality) Then $Q_L(\lambda_1, \ldots, \lambda_m) \in \mathbb{Z}[q^{\pm 1}]$ (no
fractional power).

(ii) (Periodicity) When $q$ is an $r$th root of 1, then
$Q_L(\lambda_1, \ldots, \lambda_m)$ is invariant under the action of
the lattice group $rY$, that is, for $y_1, \ldots, y_m \in Y$,
$Q_L(\lambda_1, \ldots, \lambda_m) = Q_L(\lambda_1 + r y_1, \ldots, \lambda_m + r y_m)$.


Quantum 3-Manifold Invariants 


Although the infinite sum $\sum_{\lambda_j \in Y} Q_L(\lambda_1, \ldots, \lambda_m)$
does not have a meaning, heuristic ideas show that
it is invariant under the second Kirby move, and
hence almost defines a 3-manifold invariant. The
problem is to regularize the infinite sum. One
solution is based on the fact that at $r$th roots of
unity, $Q_L(\lambda_1, \ldots, \lambda_m)$ is periodic, so we should use
the sum with the $\lambda_j$'s running over a fundamental set $P_r$ of
the action of $rY$, where

$$P_r := \{x = c_1\alpha_1 + \cdots + c_\ell\alpha_\ell \mid 0 \leq c_1, \ldots, c_\ell < r\}$$

Here $\alpha_1, \ldots, \alpha_\ell$ are basis roots. For a root $\xi$ of
unity of order $r$, let

$$F_L(\xi) := \sum_{\lambda_j \in (P_r \cap Y)} Q_L(\lambda_1, \ldots, \lambda_m)$$

If $F_{U_\pm}(\xi) \neq 0$, define

$$\tau_L(\xi) := \frac{F_L(\xi)}{(F_{U_+}(\xi))^{\sigma_+}\,(F_{U_-}(\xi))^{\sigma_-}}$$


Recall that $D$ is the determinant of the Cartan
matrix. Let $d$ be the maximum of the absolute values
of entries of the Cartan matrix outside the diagonal.


Theorem 6 (Lê 2003)


(i) If the order $r$ of $\xi$ is coprime with $dD$, then
$F_{U_\pm}(\xi) \neq 0$.

(ii) If $F_{U_\pm}(\xi) \neq 0$ then $\tau_M^{P\mathfrak{g}}(\xi) := \tau_L(\xi)$ is an invariant
of the 3-manifold $M = S^3_L$.


Remark 1 The version presented here corresponds to
projective groups. It was defined by Kirby and Melvin
for $sl_2$, Kohno and Takata for $sl_n$, and by Lê (2003) for
arbitrary simple Lie algebras. When $r$ is coprime with
$dD$, there is also an associated modular category that
generates a topological quantum field theory. In most
texts in the literature, say Kirillov (1996) and Turaev
(1994), another version $\tau^{\mathfrak{g}}$ was defined. The reason we
choose $\tau^{P\mathfrak{g}}$ is that it has nice integrality and eventually
perturbative expansion. For relations between the
version $\tau^{P\mathfrak{g}}$ and the usual $\tau^{\mathfrak{g}}$, see Lê (2003).


Examples When $M$ is the Poincaré sphere and
$\mathfrak{g} = sl_2$,

$$\tau_M^{P\mathfrak{g}}(q) = \frac{1}{1-q} \sum_{n=0}^{\infty} q^n\,(1 - q^{n+1})(1 - q^{n+2}) \cdots (1 - q^{2n+1})$$

Here $q$ is a root of unity, and the sum is easily
seen to be finite.


Integrality The following theorem was proved for
$\mathfrak{g} = sl_2$ by Murakami (1995) and for $\mathfrak{g} = sl_n$ by
Takata-Yokota and Masbaum-Wenzl (using ideas
of J Roberts) and for arbitrary simple Lie algebras
by Lê (2003).


Theorem 7 Suppose the order $r$ of $\xi$ is a big enough
prime; then $\tau_M^{P\mathfrak{g}}(\xi)$ is in $\mathbb{Z}[\xi] = \mathbb{Z}[\exp(2\pi i/r)]$.


Perturbative Expansion


Unlike the link case, quantum 3-manifold invariants
can be defined only at certain roots of unity. In
general, there is no analytic extension of the
function $\tau_M^{P\mathfrak{g}}$ around $q = 1$. In perturbative theory,
we want to expand the function $\tau_M^{P\mathfrak{g}}$ around $q = 1$
into power series. For QHS, Ohtsuki (for $\mathfrak{g} = sl_2$)
and then the present author (for all other simple Lie
algebras) showed that there is a number-theoretical
expansion of $\tau_M^{P\mathfrak{g}}$ around $q = 1$ in the following sense.

Suppose $r$ is a big enough prime, and
$\xi = \exp(2\pi i/r)$. By the integrality (Theorem 7),

$$\tau_M^{P\mathfrak{g}}(\xi) \in \mathbb{Z}[\xi] = \mathbb{Z}[q]/(1 + q + q^2 + \cdots + q^{r-1})$$

Choose a representative $f(q) \in \mathbb{Z}[q]$ of $\tau_M^{P\mathfrak{g}}(\xi)$.
Formally substitute $q = (q - 1) + 1$ in $f(q)$:

$$f(q) = c_{r,0} + c_{r,1}(q-1) + \cdots + c_{r,r-2}(q-1)^{r-2}$$

The integers $c_{r,n}$ depend on $r$ and the representative
$f(q)$. It is easy to see that $c_{r,n} \pmod r$ does not depend
on the representative $f(q)$ and hence is an invariant of
QHS. The dependence on $r$ is a big drawback. The
theorem below says that there is a rational number $c_n$,
not depending on $r$, such that $c_{r,n} \pmod r$ is the
reduction of either $c_n$ or $-c_n$ modulo $r$, for sufficiently
large prime $r$. It is easy to see that if such $c_n$ exists, it
must be unique. Let $s$ be the number of positive roots
of $\mathfrak{g}$. Recall that $\ell$ is the rank of $\mathfrak{g}$.
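The independence of $c_{r,n} \pmod r$ from the chosen representative can be checked numerically. The reason is that $1 + q + \cdots + q^{r-1} = \sum_{k=1}^{r} \binom{r}{k}(q-1)^{k-1}$, whose coefficients of $(q-1)^n$ are divisible by $r$ for $n \leq r-2$. The sketch below is only an illustration: the polynomials `f` and `g` are arbitrary sample data, not invariants of any manifold, and the helper names are hypothetical.

```python
from math import comb


def taylor_at_one(coeffs):
    """Coefficients c_n of f(q) = sum_n c_n (q - 1)^n, given f(q) = sum_k coeffs[k] q^k."""
    return [sum(a * comb(k, n) for k, a in enumerate(coeffs))
            for n in range(len(coeffs))]


def poly_mul(a, b):
    """Product of two integer polynomials given by coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_add(a, b):
    """Sum of two integer polynomials given by coefficient lists."""
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
            for i in range(n)]


r = 7
cyclotomic = [1] * r                  # 1 + q + ... + q^(r-1)
f = [3, -1, 4, 1, -5, 9]              # arbitrary sample representative
g = [2, -6, 5]                        # arbitrary sample multiplier
f_other = poly_add(f, poly_mul(g, cyclotomic))   # same class in Z[q]/(1 + q + ... + q^(r-1))

c = taylor_at_one(f)
c_other = taylor_at_one(f_other)
# Compare the coefficients of (q - 1)^n modulo r for n = 0, ..., r - 2.
low = [(c[n] % r, c_other[n] % r) for n in range(r - 1)]
print(low)
```

The two representatives have different integer coefficients, but each pair in `low` agrees.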


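Theorem 8 For every QHS $M$, there is a sequence
of numbers

$$c_n \in \mathbb{Z}\left[\frac{1}{(2n+2s)!\,|H_1(M,\mathbb{Z})|}\right]$$

such that for sufficiently large prime $r$

$$c_{r,n} \equiv \left(\frac{|H_1(M,\mathbb{Z})|}{r}\right)^{\ell} c_n \pmod r$$

where

$$\left(\frac{|H_1(M,\mathbb{Z})|}{r}\right) = \pm 1$$

is the Legendre symbol. Moreover, $c_n$ is an invariant
of order $\leq 2n$.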
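The Legendre symbol in the theorem is easy to evaluate. A minimal sketch, assuming $r$ is an odd prime (the function does not check this) and using Euler's criterion $a^{(r-1)/2} \equiv \left(\frac{a}{r}\right) \pmod r$; the name `legendre` is hypothetical:

```python
def legendre(a, r):
    """Legendre symbol (a/r) for an odd prime r, via Euler's criterion."""
    a %= r
    if a == 0:
        return 0
    return 1 if pow(a, (r - 1) // 2, r) == 1 else -1


# The squares modulo 7 are 1, 2 and 4.
print([legendre(a, 7) for a in range(1, 7)])
```

For a sufficiently large prime $r$ the symbol is $\pm 1$, since $r$ then does not divide the order of the first homology group.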


The series $t_M^{\mathfrak{g}}(q-1) := \sum_{n=0}^{\infty} c_n (q-1)^n$, called
the Ohtsuki series, can be considered as the
perturbative expansion of the function $\tau_M^{P\mathfrak{g}}$ at $q = 1$.
For actual calculation of $t_M^{\mathfrak{g}}(q-1)$, see Lê (2003),
Ohtsuki (2002), and Rozansky (1997).


Recovery from the LMO invariant It is known that
for any metrized Lie algebra $\mathfrak{g}$, there is a linear map
$W_{\mathfrak{g}}: \mathrm{Gr}_n\mathcal{A}(\emptyset) \to \mathbb{Q}$ (see Bar-Natan (1995)).


Theorem 9 One has

$$\sum_{n=0}^{\infty} W_{\mathfrak{g}}(\mathrm{Gr}_n Z^{LMO})\, h^n = t_M^{\mathfrak{g}}(q-1)\big|_{q=e^h}$$

This shows that the Ohtsuki series $t_M^{\mathfrak{g}}(q-1)$ can
be recovered from, and hence totally determined by,
the LMO invariant. The theorem was proved by
Ohtsuki for $sl_2$. For other simple Lie algebras, the
theorem follows from the Århus integral (see Bar-Natan
et al. (2002a, b) and Ohtsuki (2002)).


Rozansky's Gaussian Integral 


Rozansky (1997) gave a definition of the Ohtsuki
series using a formal Gaussian integral in an important
work. The work is only for $sl_2$, but can be
generalized to other Lie algebras; it is closer to the
original physics ideas of perturbative invariants.


Cyclotomic Expansion

The Habiro Ring

Let us define the Habiro ring $\widehat{\mathbb{Z}[q]}$ by

$$\widehat{\mathbb{Z}[q]} := \varprojlim_n\, \mathbb{Z}[q]/((1-q)(1-q^2)\cdots(1-q^n))$$

Habiro (2002) called it the cyclotomic completion
of $\mathbb{Z}[q]$. Formally, $\widehat{\mathbb{Z}[q]}$ is the set of all series of the
form

$$f(q) = \sum_{n=0}^{\infty} f_n(q)\,(1-q)(1-q^2)\cdots(1-q^n), \qquad f_n(q) \in \mathbb{Z}[q]$$

Suppose $U$ is the set of roots of 1. If $\xi \in U$ then
$(1-\xi)(1-\xi^2)\cdots(1-\xi^n) = 0$ if $n$ is big enough;
hence, one can define $f(\xi)$ for $f \in \widehat{\mathbb{Z}[q]}$. One can
consider every $f \in \widehat{\mathbb{Z}[q]}$ as a function with domain $U$.
Note that $f(\xi) \in \mathbb{Z}[\xi]$ is always an algebraic integer.
It turns out that $\widehat{\mathbb{Z}[q]}$ has remarkable properties,
and plays an important role in quantum topology.
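The vanishing of the products at roots of unity, and hence the finiteness of the evaluation, can be seen in exact integer arithmetic. The sketch below works modulo $q^r - 1$, a relation satisfied by every root of unity $\xi$ with $\xi^r = 1$, so a polynomial that reduces to zero there vanishes at $\xi$. It is an illustration under simplifying assumptions: the coefficients $f_n$ are taken to be integer constants rather than general polynomials, and all function names are hypothetical.

```python
def mul_cyclic(a, b, r):
    """Product of two polynomials in q, reduced modulo q^r - 1."""
    out = [0] * r
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[(i + j) % r] += ai * bj
    return out


def q_factorial(n, r):
    """(1 - q)(1 - q^2)...(1 - q^n), reduced modulo q^r - 1."""
    prod = [1] + [0] * (r - 1)
    for k in range(1, n + 1):
        factor = [0] * r
        factor[0] += 1
        factor[k % r] -= 1        # 1 - q^k; identically 0 when r divides k
        prod = mul_cyclic(prod, factor, r)
    return prod


def evaluate_series(f, r):
    """Reduce sum_n f[n] (1-q)...(1-q^n) modulo q^r - 1, for integer constants f[n]."""
    total = [0] * r
    for n, fn in enumerate(f):
        term = q_factorial(n, r)
        total = [t + fn * c for t, c in zip(total, term)]
    return total


r = 5
print(q_factorial(r - 1, r))          # not the zero polynomial
print(q_factorial(r, r))              # zero: the factor 1 - q^r vanishes
print(evaluate_series([1] * 20, 2))   # only the terms n = 0, 1 survive at xi = -1
```

Only the terms with $n < r$ contribute, so a series with infinitely many nonzero $f_n$ still has a well-defined value at each root of unity.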

Note that the formal derivative of $(1-q)(1-q^2)\cdots(1-q^n)$
is divisible by $(1-q)(1-q^2)\cdots(1-q^k)$
with $k \geq (n-1)/2$. This means every
element $f \in \widehat{\mathbb{Z}[q]}$ has a derivative $f' \in \widehat{\mathbb{Z}[q]}$, and hence
derivatives of all orders in $\widehat{\mathbb{Z}[q]}$. One can then
associate to $f \in \widehat{\mathbb{Z}[q]}$ its Taylor series at a root $\xi$ of 1:

$$T_\xi f := \sum_{n=0}^{\infty} \frac{f^{(n)}(\xi)}{n!}\,(q-\xi)^n$$

which can also be obtained by noticing that
$(1-q)(1-q^2)\cdots(1-q^n)$ is divisible by $(q-\xi)^k$ if $n$ is
bigger than $k$ times the order of $\xi$. Thus, one has a
map $T_\xi: \widehat{\mathbb{Z}[q]} \to \mathbb{Z}[\xi][[q-\xi]]$.


Theorem 10 (Habiro 2004)


(i) For each root of unity $\xi$ the map $T_\xi$ is injective,
that is, a function in $\widehat{\mathbb{Z}[q]}$ is determined by its
Taylor expansion at a point in the domain $U$.

(ii) If $f(\xi) = g(\xi)$ at infinitely many roots $\xi$ of prime
power orders, then $f = g$ in $\widehat{\mathbb{Z}[q]}$.


One important consequence is that $\widehat{\mathbb{Z}[q]}$ is an
integral domain, since we have the embedding
$T_1: \widehat{\mathbb{Z}[q]} \to \mathbb{Z}[[q-1]]$.

In general the Taylor series $T_1 f$ has convergence
radius 0. However, one can speak about $p$-adic
convergence to $f(\xi)$ in the following sense. Suppose
the order $r$ of $\xi$ is a power of a prime, $r = p^k$. Then it is
known that $(\xi - 1)^n$ is divisible by $p^m$ if $n > mr$.
Hence, $T_1 f(\xi)$ converges in the $p$-adic topology, and
it can be easily shown that the limit is exactly $f(\xi)$.

The above properties suggest considering $\widehat{\mathbb{Z}[q]}$ as
a class of "analytic functions" with domain $U$.


Quantum Invariants as an Element of $\widehat{\mathbb{Z}[q]}$


It was proved, by Habiro for $sl_2$ and by Habiro with
the present author for general simple Lie algebras,
that quantum invariants of ZHSs belong to $\widehat{\mathbb{Z}[q]}$
and thus have remarkable integrality properties:


Theorem 11


(i) For every ZHS $M$, there is an invariant $I_M^{\mathfrak{g}} \in \widehat{\mathbb{Z}[q]}$
such that if $\xi$ is a root of unity for which
the quantum invariant $\tau_M^{P\mathfrak{g}}(\xi)$ can be defined,
then $I_M^{\mathfrak{g}}(\xi) = \tau_M^{P\mathfrak{g}}(\xi)$.

(ii) The Ohtsuki series is equal to the Taylor series
of $I_M^{\mathfrak{g}}$ at 1.


Corollary 1 Suppose $M$ is a ZHS.


(i) For every root of unity $\xi$, the quantum invariant
at $\xi$ is an algebraic integer, $\tau_M^{P\mathfrak{g}}(\xi) \in \mathbb{Z}[\xi]$. (No
restriction on the order of $\xi$ is required.)

(ii) The Ohtsuki series $t_M^{\mathfrak{g}}(q-1)$ has integer coefficients.
If $\xi$ is a root of order $r = p^k$, where $p$ is
prime, then the Ohtsuki series at $\xi$ converges
$p$-adically to the quantum invariant at $\xi$.

(iii) The quantum invariant $\tau_M^{P\mathfrak{g}}$ is determined by
values at infinitely many roots of prime power
orders and also determined by its Ohtsuki series.

(iv) The LMO invariant totally determines the
quantum invariants $\tau_M^{P\mathfrak{g}}$.

Part (ii) was conjectured by R Lawrence for $sl_2$
and first proved by Rozansky (also for $sl_2$). Part (iv)
follows from the fact that the LMO invariant
determines the Ohtsuki series; it exhibits another
universality property of the LMO invariant.

See also: Finite-Type Invariants; Knot Invariants and
Quantum Gravity; Lie Groups: General Theory; Quantum
3-Manifold Invariants.


Introduction


Morse theory allows one to reconstruct the homology
of a compact manifold $B$ from data obtained from the
gradient flow of a function $f: B \to \mathbb{R}$, the Morse
function. The term "Floer homology" is used to
describe homology groups that arise from carrying
out the same construction, but in a setting where the
space B is replaced by an infinite-dimensional mani- 
fold (a space of maps, or a space of configurations for a 
gauge theory), and where the gradient trajectories of 
the Morse function correspond to solutions of an 
elliptic differential equation. There are two important 
types of such homology theories that have been 
extensively developed, and the study of both was 
initiated in the 1980s by Andreas Floer. In the first 
type, the elliptic equation that arises is a Cauchy- 
Riemann equation, whose solutions are pseudoholo- 
morphic maps from a two-dimensional domain into a 
symplectic manifold. In the second type, the elliptic 
equation is an equation of gauge theory on a 
4-manifold: either the anti-self-dual Yang-Mills 
equations or the Seiberg-Witten equations. Important 
antecedents of Floer's work included work of Conley, 
Zehnder, and others on the symplectic fixed-point 
problem, and Witten's ideas about Morse theory. 

This article describes the background material 
from Morse theory before discussing Floer homol- 
ogy of Cauchy-Riemann type and its application to 
the Arnol'd conjecture in symplectic topology. Floer 
homology in the context of four-dimensional gauge 
theories is discussed more briefly. 


Morse Theory 


Let $B$ be a smooth, compact manifold and $f: B \to \mathbb{R}$
a smooth function. A critical point $p$ of $f$ is said to
be nondegenerate if the Hessian of $f$ is a nonsingular
operator on $T_pB$. The function $f$ is a Morse function
if all its critical points are nondegenerate. In the
presence of a Riemannian metric $g$ on $B$, the
derivative $df$ becomes a vector field, the gradient
$\nabla f$, and we can consider the downward gradient-flow
equation for a path $x(s)$ in $B$:

$$\frac{dx}{ds} = -\nabla f(x)$$

If $p$ and $q$ are nondegenerate critical points, let us
write $M(p,q)$ for the space of solutions $x(s)$
satisfying

$$\lim_{s\to-\infty} x(s) = p, \qquad \lim_{s\to+\infty} x(s) = q$$

To understand the structure of $M(p,q)$, consider the
linearization of the gradient-flow equation at a
solution $x \in M(p,q)$. This is a linear equation for a
vector field $X$ along the path $x$ in $B$, and takes the
form

$$\nabla_{\partial/\partial s} X = -\nabla\nabla f(X) \qquad [1]$$

where $\nabla\nabla f$ is the covariant derivative of the
gradient $\nabla f$, an operator on tangent vectors. Let $e_x$
be the dimension of the space of solutions $X$ to this
linear equation, with the boundary conditions
$\lim_{s\to\pm\infty} X(s) = 0$, and let $e_x^*$ be the dimension of
the space of solutions to the adjoint equation

$$\nabla_{\partial/\partial s} X = +\nabla\nabla f(X)$$

We say that the trajectory $x$ is "regular" if $e_x^* = 0$. In
this case, the trajectory space $M(p,q)$ has the
structure of a smooth manifold near $x$: its dimension
is $e_x$, and its tangent space is the space of solutions $X$
to [1]. The gradient flow is said to be Morse-Smale
if all trajectories between critical points are regular.
If $f$ is any Morse function, one can always choose
the metric $g$ so that the corresponding flow is
Morse-Smale. (It is also the case that one can leave
$g$ fixed and perturb $f$ to achieve the same effect.)


In the Morse-Smale case, each $M(p,q)$ is a
smooth manifold. The dimension of $M(p,q)$ in the
neighborhood of a trajectory $x$ depends only on
$p$ and $q$, not otherwise on $x$. Indeed, even without
the regularity condition, the index of eqn [1],
namely the difference $e_x - e_x^*$, is given by

$$e_x - e_x^* = \mathrm{index}(p) - \mathrm{index}(q)$$

where $\mathrm{index}(p)$ denotes the number of negative
eigenvalues (counting multiplicity) of the Hessian
at $p$. In the Morse-Smale case therefore, the
dimension of $M(p,q)$ is given by $\mathrm{index}(p) - \mathrm{index}(q)$.
If $x(s)$ is a solution of the gradient-flow
equation, then so is the reparametrized trajectory
$x(s + c)$; and this is different from $x(s)$ as long as
$p \neq q$. Let us denote by $\breve{M}(p,q)$ the quotient of
$M(p,q)$ by the action of $\mathbb{R}$ given by these reparametrizations.
We have

$$\dim \breve{M}(p,q) = \mathrm{index}(p) - \mathrm{index}(q) - 1 \qquad (p \neq q)$$

as long as the trajectory space is nonempty.

Let $\mathbb{F}_2$ denote the field with two elements. The
Morse complex of a Morse-Smale gradient flow,
with coefficients in $\mathbb{F}_2$, is defined as follows. For
each $i$, let $C_i(f)$ be the finite-dimensional vector
space over $\mathbb{F}_2$ having a basis

$$e_{p_1}, \ldots, e_{p_m}$$

indexed by the critical points $p_1, \ldots, p_m$ with index $i$.
For each pair of critical points $p$ and $q$ with indices
$i$ and $i-1$ respectively, let $\epsilon_{pq} \in \mathbb{F}_2$ denote the
number of points in the zero-dimensional manifold
$\breve{M}(p,q)$, counted mod 2:

$$\epsilon_{pq} = \#\breve{M}(p,q) \pmod 2$$

The Morse-Smale condition ensures that the zero-dimensional
space $\breve{M}(p,q)$ is finite, so this definition
is satisfactory. Define a differential

$$\partial: C_i(f) \to C_{i-1}(f)$$

by

$$\partial(e_p) = \sum_{\mathrm{index}(q) = i-1} \epsilon_{pq}\, e_q$$

The first important fact is that $\partial$ really is a
differential: as long as the flow is Morse-Smale,
we have

$$\text{the composite } \partial \circ \partial: C_i(f) \to C_{i-2}(f) \text{ is zero} \qquad [2]$$


We can therefore construct the homology of the
complex $(C_*(f), \partial)$. This is the Morse homology:

$$H_i(f) = \frac{\ker(\partial: C_i(f) \to C_{i-1}(f))}{\mathrm{im}(\partial: C_{i+1}(f) \to C_i(f))}$$

The proof of [2] is as follows. Suppose that $p$ has
index $i$ and $r$ is a critical point with index $i - 2$, and
consider $\breve{M}(p,r)$, which has dimension 1. The key
step is to understand that $\breve{M}(p,r)$ is noncompact,
and that its ends correspond to "broken trajectories":
pairs $(x_1, x_2)$ (modulo reparametrization),
where $x_1$ is a gradient trajectory from $p$ to some $q$ of
index $i - 1$, and $x_2$ is a trajectory from $q$ to $r$. The
number of ends is thus $\sum_q \epsilon_{qr}\epsilon_{pq}$. Since the number
of ends of a 1-manifold is even, this sum is zero in
$\mathbb{F}_2$. This sum is also the matrix entry of $\partial \circ \partial$ from $e_p$
to $e_r$; so $\partial \circ \partial = 0$.

The main result about Morse homology in finite 
dimensions is the following: 


Theorem 1 The Morse homology $H_i(f)$ is isomorphic
to the ordinary homology of the compact
manifold $B$ with coefficients $\mathbb{F}_2$: the group $H_i(B; \mathbb{F}_2)$.


This result can be proved by first showing that
$H_i(f)$ depends only on $B$, not on the choice of $f$ or
the metric. (This step can be accomplished by
examining a nonautonomous flow of the form
$dx/ds = -\nabla f(s,x)$.) Then one can examine the
Morse complex in the case of a self-indexing
Morse function (where the value of $f$ at the critical
points is a monotone-increasing function of their
index). In the self-indexing case, the unstable
manifolds of the critical points give rise to a cell
decomposition of the manifold $B$, and the Morse
complex is easily identified with the cellular chain
complex for this cell decomposition.

The sum of the dimensions of the Morse
homology groups cannot be larger than the sum of
the dimensions of the chain groups $C_i(f)$, which is
the total number of critical points. The above
theorem therefore implies the following basic version
of the "Morse inequalities":


Corollary 2 The number of critical points of a
Morse function $f: B \to \mathbb{R}$ cannot be less than
$\sum_i \dim H_i(B; \mathbb{F}_2)$.
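The passage from trajectory counts to homology is a finite linear-algebra computation over the field with two elements. The sketch below is illustrative and rests on stated assumptions: it takes a hypothetical Morse function on the 2-sphere with two local maxima, one saddle and one minimum, and assumes the mod 2 counts are 1 from each maximum to the saddle and 0 from the saddle to the minimum (two trajectories). The function names are hypothetical as well.

```python
def rank_gf2(matrix):
    """Rank over F_2 of a matrix given as a list of rows with 0/1 entries."""
    rows = [list(r) for r in matrix]
    rank = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank


def compose_gf2(a, b):
    """Matrix product a * b over F_2."""
    return [[sum(a[i][k] & b[k][j] for k in range(len(b))) % 2
             for j in range(len(b[0]))] for i in range(len(a))]


def morse_betti(num_critical, boundary):
    """Dimensions of the Morse homology groups.

    num_critical[i] is the number of critical points of index i, and
    boundary[i] is the matrix of the differential C_i -> C_{i-1}
    (rows indexed by q, columns by p), for i = 1, ..., top.
    """
    top = len(num_critical) - 1
    ranks = [0] * (top + 2)                    # ranks[i] = rank of C_i -> C_{i-1}
    for i in range(1, top + 1):
        ranks[i] = rank_gf2(boundary[i])
    return [num_critical[i] - ranks[i] - ranks[i + 1] for i in range(top + 1)]


num_critical = [1, 1, 2]                       # one minimum, one saddle, two maxima
boundary = {1: [[0]], 2: [[1, 1]]}             # assumed counts, mod 2

print(compose_gf2(boundary[1], boundary[2]))   # the composite of two differentials
betti = morse_betti(num_critical, boundary)
print(betti, sum(num_critical) >= sum(betti))
```

Here the four critical points exceed the total dimension of the homology, so the inequality of Corollary 2 is strict for this sample function.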


The Morse complex can be refined in various
ways. For example, one can use integer coefficients
in place of coefficients $\mathbb{F}_2$ by taking account of
orientations of the spaces of trajectories. One can
also introduce Morse theory with coefficients in a
local system, and in both these cases a version of the
above theorem continues to hold. One can also
study the Morse complex of a multivalued Morse
function: that is, one can start with a closed 1-form $\alpha$
on $B$, with nontrivial periods, and study the flow
generated by the corresponding vector field $-g^{-1}\alpha$.
Such a theory was developed by Novikov.

The Morse complex can be generalized in a
different direction, replacing $f$ by a functional
related to a geometric problem. The canonical
example of this (and one of the very few cases in
which the theory works as in the finite-dimensional
case) is the case that $B = LW$ is the space of loops
$u: S^1 \to W$ in a Riemannian manifold $W$ and $f$ is the
"energy function," $f_E(u) = \int |du/dt|^2\,dt$. If the
Morse-Smale condition holds, then the Morse
homology $H_i(f_E)$ computes the homology of $LW$,
as expected. Critical points of $f_E$ are geodesics, and
the relationship between geodesics and the topology
of $LW$, for which Corollary 2 provides a prototype,
is an idea with many applications.

For the energy functional, the downward gradient- 
flow equation is a parabolic equation (the ordinary 
heat equation if the target space is Euclidean), and 
a solution to the flow exists for each choice of 
initial condition. Floer homology can be loosely 
characterized as the Morse theory of certain 
variational problems for which the gradient-flow 
equation is not parabolic, but elliptic of first order: 
the important models are the Cauchy-Riemann
equation in dimension 2, the anti-self-dual Yang-Mills
equations in dimension 4, or the closely
related Seiberg-Witten equations. For an elliptic
equation, one does not expect to solve the Cauchy 
problem with arbitrary initial condition; so with 
Floer homology, one is studying a functional for 
which the gradient flow is not everywhere defined. 
However, to define the Morse complex, the import- 
ant thing is only that we have a good understanding 
of the trajectory spaces M(p, q), which will now be 
solution spaces for an elliptic problem of geometric 
origin. The proof of Theorem 1 depends very much 
on the fact that the flow is everywhere defined: this 
theorem will therefore fail for the Morse complexes 
arising in Floer theory, and one must look else- 
where for a means to compute the Morse homology 
groups. 

Before discussing Floer homology in more specific 
terms, we shall describe the problem in symplectic 
geometry that motivated its development. 


The Arnol’d Conjecture 


A symplectic manifold of dimension $2n$ is a smooth
manifold $W$ equipped with a 2-form $\omega$ which is
closed and nondegenerate. On a symplectic manifold,
one can associate to each smooth function
$H: W \to \mathbb{R}$ a vector field $X_H$ on $W$: the vector field
is characterized by the property that

$$\omega(X_H, V) = dH(V)$$

for all vector fields $V$. In this situation, one refers to
$H$ as the Hamiltonian and $X_H$ as the corresponding
Hamiltonian vector field. If $W$ is compact, or if $X_H$
is otherwise complete, then this vector field generates
a flow $\varphi_t: W \to W$ $(t \in \mathbb{R})$. We also wish to
consider the case that $H$ is time dependent: we
suppose that $H_t: W \to \mathbb{R}$ is a Hamiltonian which
varies smoothly with $t \in \mathbb{R}$ and is periodic, in that
$H_{t+1} = H_t$. In this case, there is a time-dependent
Hamiltonian vector field $X_t$, and we can consider
the flow $\varphi_t$ that it generates: so for $x \in W$, the path
$\varphi_t(x)$ will be the solution to

$$\frac{d}{dt}\varphi_t(x) = X_t(\varphi_t(x)) \qquad [4]$$

with initial condition $\varphi_0(x) = x$. The Arnol'd conjecture,
in one formulation, concerns the 1-periodic
solutions to this equation, or equivalently the fixed
points of $\varphi_1: W \to W$. A fixed point $x$ with $\varphi_1(x) = x$
is called nondegenerate if $d\varphi_1: T_xW \to T_xW$ does not
have 1 as an eigenvalue. With this understood, one
version of the conjecture states:


Conjecture 3 Suppose $W$ is compact and let $H_t$ be
any 1-periodic, time-dependent Hamiltonian. If the
fixed points of $\varphi_1$ are all nondegenerate, then the
number of fixed points is not less than the sum of
the Betti numbers of the manifold $W$.
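The defining property $\omega(X_H, V) = dH(V)$ of the Hamiltonian vector field can be checked directly in the simplest case. The sketch below is an illustration under assumptions that are not in the text: the manifold is the plane with $\omega = dx \wedge dy$, the Hamiltonian is the sample function $H = (x^2 + y^2)/2$, and the function names are hypothetical.

```python
def hamiltonian_vector_field(dH):
    """Return X_H on the plane with omega = dx ^ dy, from the gradient dH = (H_x, H_y).

    Solving omega(X_H, V) = dH(V) for all V, with
    omega(X, V) = X[0]*V[1] - X[1]*V[0], gives X_H = (H_y, -H_x).
    """
    def X(x, y):
        hx, hy = dH(x, y)
        return (hy, -hx)
    return X


def omega(X, V):
    """Standard area form dx ^ dy evaluated on two tangent vectors."""
    return X[0] * V[1] - X[1] * V[0]


# Sample Hamiltonian H = (x^2 + y^2)/2, whose gradient is (x, y).
dH = lambda x, y: (x, y)
X_H = hamiltonian_vector_field(dH)

point, V = (0.3, -1.2), (1.0, 2.0)
lhs = omega(X_H(*point), V)                             # omega(X_H, V)
rhs = dH(*point)[0] * V[0] + dH(*point)[1] * V[1]       # dH(V)
print(lhs, rhs)
```

For this sample $H$ the flow $\varphi_t$ is a rotation through angle $t$, so $\varphi_1$ has the origin as its only fixed point, and that fixed point is nondegenerate because a rotation through 1 radian does not have 1 as an eigenvalue. The plane is not compact, so this illustrates the definitions only, not the conjecture.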


There is another, more general version of this
conjecture. Let $L \subset W$ be a closed Lagrangian
submanifold: that is, an $n$-dimensional submanifold
such that the restriction of $\omega$ to $L$ as a 2-form is
identically zero. Let $L' \subset W$ be another Lagrangian,
obtained from $L$ by a Hamiltonian isotopy: that is,
$L'$ is $\varphi_1(L)$, for some flow $\varphi_t$ generated by a time-dependent
Hamiltonian $H_t$ as above.


Question 4 If $L$ and $L'$ intersect transversely, is it
always true that the number of intersection points of
$L$ and $L'$ is at least the sum of the Betti numbers of
the manifold $L$:

$$\#(L \cap L') \geq \sum_i \mathrm{rank}\, H_i(L)\ ?$$


This is phrased as a question rather than a 
conjecture, because the answer is certainly “no” in 
some cases. For example, L might be a circle 
contained in a small disk in a symplectic 2-manifold, 
in which case there is no reason why $\varphi_1$ should not
move the disk to be completely disjoint from itself.
Nevertheless, with extra hypotheses, it is known 
that the answer is often “yes.” 

We can exhibit Conjecture 3 as a special case of
Question 4, as follows. Given a symplectic manifold
$(V, \omega)$, we can form the product $W = V \times V$, with the
symplectic form $\omega_W = -p_1^*\omega + p_2^*\omega$, where the $p_i$ are
the two projections. The result of this definition is
that the diagonal in $V \times V$ is a Lagrangian
submanifold,

$$L \subset W = V \times V$$

for this symplectic form. Let $H_t$ be a time-dependent
Hamiltonian on $V$, and let $\varphi_t: V \to V$ be the flow.
Then $H_t \circ p_2$ is a time-dependent Hamiltonian
generating a flow on $W$. For the flow on $W$, the
image $L'$ of the diagonal $L \subset W$ at time 1 is the
graph of $\varphi_1: V \to V$. Thus, $L \cap L'$ can be identified
with the set of fixed points of $\varphi_1$ in $V$, and an
affirmative answer to Question 4 for $L \subset W$ implies
Conjecture 3 for $V$.

Conjecture 3 and Question 4 can both be
extended to the case of isolated degenerate fixed
points of $\varphi_1$ for Conjecture 3, or to the case of
isolated, nontransverse intersections for Question 4.
For example, one can ask whether, in the nontransverse
case, the sum of the intersection multiplicities
can ever be less than the sum of the Betti
numbers.


Morse Theory and the Arnol'd Conjecture 


The Arnol'd conjecture, and the related Question 4, 
can both be studied by reformulating them as 
questions about the number of critical points of a 
carefully chosen functional. 

We begin with the situation addressed by Conjecture 3.
For simplicity, we suppose that $\pi_2(W)$ is
zero. Let $B$ be the space of smooth, null-homotopic
loops in $W$:

$$B = \{u: S^1 \to W \mid u \text{ is smooth and null homotopic}\}$$

This is a smooth, infinite-dimensional manifold.
There is a natural functional $f_0: B \to \mathbb{R}$, the symplectic
action, defined as

$$f_0(u) = \int_{D^2} v^*(\omega)$$

where $v: D^2 \to W$ is any extension of the map
$u: S^1 \to W$. The extension $v$ exists because $u$ is null
homotopic, and the value of $f_0$ is independent of the
choice of $v$ because $\pi_2(W) = 0$. This functional can be
modified in the presence of a periodic Hamiltonian.
Introduce a coordinate $t$ on $S^1$ with period 1, and so
regard $u$ as a periodic function of $t$. Write the
Hamiltonian as $H_t$ as before, and define

$$f(u) = f_0(u) + \int_0^1 H_t(u(t))\,dt$$

To compute the first variation of $f$, consider a one-parameter
family of loops $u_s(t) = u(s,t)$ parametrized
by $s \in \mathbb{R}$. We compute

$$\frac{d}{ds} f(u_s) = \int_0^1 \omega\!\left(\frac{\partial u}{\partial s}, \frac{\partial u}{\partial t}\right) dt + \int_0^1 dH_t\!\left(\frac{\partial u}{\partial s}\right) dt = \int_0^1 \omega\!\left(\frac{\partial u}{\partial s}, \frac{\partial u}{\partial t} - X_t(u)\right) dt$$

using the relationship between $dH_t$ and $X_t$. Thus, a
loop $u \in B$ is a critical point of $f: B \to \mathbb{R}$ if and only
if it is a solution of the equation

$$\frac{du}{dt} = X_t(u(t)) \qquad [5]$$

This means that there is a one-to-one correspondence
between these critical points and certain
1-periodic solutions of eqn [4]: these in turn
correspond to fixed points $p$ of $\varphi_1$ with the
additional property that the path $\varphi_t(p)$ from $p$ to
$p$ is null homotopic.

To consider the formal gradient flow of the
functional $f$, one must introduce a metric on $B$. A
Riemannian metric $g$ on the symplectic manifold
$(W, \omega)$ is compatible with $\omega$ if there is an almost-complex
structure $J: TW \to TW$ such that
$\omega(X, Y) = g(JX, Y)$ for all tangent vectors $X$ and $Y$
at any point of $W$. Let $g_t$ be a 1-periodic family of
compatible Riemannian metrics on $W$. Using these,
one can define an inner product on the tangent
bundle of $B$ by the formula

$$\langle U, V\rangle = \int_0^1 g_t(U(t), V(t))\,dt$$

in which $U$ and $V$ are tangent vectors at $u \in B$,
regarded as vector fields along the loop $u$ in $W$. We
can rewrite the above formula for the variation of
$f$ in terms of this inner product:

ee) 


where $J_t$ is the almost-complex structure corresponding
to $g_t$. Formally then, a one-parameter
family of loops $u(s,t)$ is a solution of the downward
gradient-flow equations for the functional $f$ with
respect to this metric, if $u$ satisfies the differential
equation

$$\frac{\partial u}{\partial s} + J_t\left(\frac{\partial u}{\partial t} - X_t(u)\right) = 0 \qquad [6]$$

In the absence of the term $X_t$, and with $W$ replaced
by $\mathbb{C}^n$ with the standard $J$, this equation becomes the
Cauchy-Riemann equation $\partial u/\partial\bar{z} = 0$, for a function
$u$ of the complex variable $z = s + it$, periodic in $t$.

Let us now suppose we are in the situation of
Conjecture 3, so $W$ is closed, and the fixed points of
$\varphi_1$ are nondegenerate. As we have seen, each fixed
point $p$ of $\varphi_1$ corresponds to a 1-periodic solution $u_p$
of eqn [5], a critical point of $f$. For each pair of fixed
points $p$ and $q$, introduce $M(p,q)$ as the space of
solutions of the formal gradient-flow equations of $f$,
running from $p$ to $q$: that is, $M(p,q)$ is the space of
maps $u: \mathbb{R} \times S^1 \to W$ satisfying eqn [6], with

$$\lim_{s\to-\infty} u(s,t) = u_p(t), \qquad \lim_{s\to+\infty} u(s,t) = u_q(t)$$


With these definitions in place, one can follow the
same sequence of steps that we outlined previously
in the context of finite-dimensional Morse theory, to
construct the Morse complex. First, if $u$ belongs to
$M(p,q)$, we can consider the linearization at $u$ of
eqns [6], to obtain the counterpart of eqn [1]. These
are linear equations for a vector field $U(s,t)$ along $u$
in $W$, and take the form

$$\nabla_{\partial/\partial s} U + J_t \nabla_{\partial/\partial t} U + b(U) = 0 \qquad [7]$$

where $b$ is a linear operator of order zero. Let $e_u$
denote the dimension of the space of solutions $U$ which
decay at $s = \pm\infty$, and let $e_u^*$ denote the dimension of
the space of solutions of the formal adjoint
equation. Elliptic theory for the Cauchy-Riemann
equation, and the nondegeneracy condition for $u_p$
and $u_q$, mean that the operator that appears on the
left-hand side of the equation is Fredholm: so both
$e_u$ and $e_u^*$ are finite, and the index $e_u - e_u^*$ is
deformation invariant. This index depends only on
$p$ and $q$: we give it a name,

$$e_u - e_u^* = \mathrm{index}(p, q)$$

As before, $u$ is said to be regular if $e_u^*$ is zero. For
suitable choice of the almost-complex structures $J_t$
(or equivalently the metrics $g_t$), the Morse-Smale
condition will hold: that is, the trajectories in all
spaces $M(p,q)$ are regular. In this case, each $M(p,q)$
is a smooth manifold and has dimension $\mathrm{index}(p,q)$
if it is nonempty.

The "relative index" $\mathrm{index}(p,q)$ plays the role of
the difference of the Morse indices in the finite-dimensional
case. It can be defined whether or not
$M(p,q)$ is empty by considering an equation such as
[7] along an arbitrary path $u(s,t)$. In general, there is
no natural way to define the "index" of $p$: if we
wish, we can select one fixed point $p_0$ and declare it
to have index zero; we can then define $\mathrm{index}(p)$ as
$\mathrm{index}(p, p_0)$. Alternatively, we can regard the critical
points as indexed by an affine copy of $\mathbb{Z}$ (without a
preferred zero).


Imitating the construction of the Morse complex,
we define a vector space $CF_*$ over $\mathbb{F}_2$ as having a
basis consisting of elements $e_p$ indexed by the fixed
points $p$. We then define $\partial: CF_* \to CF_*$ by

$$\partial e_p = \sum_{\mathrm{index}(p,q) = 1} \epsilon_{pq}\, e_q$$

where $\epsilon_{pq}$ is defined by counting points in $\breve{M}(p,q)$ as
before. The vector space $CF_*$ is $\mathbb{Z}$-graded if we make
a choice of critical point $p_0$ to have index zero;
otherwise, $CF_*$ has an "affine" $\mathbb{Z}$-grading. The map
$\partial$ maps $CF_i$ into $CF_{i-1}$.

To show that $\partial$ is well defined, and to show that
$\partial \circ \partial = 0$, one must show that the zero-dimensional
spaces $M(p, q)$ are compact, and that the ends of
the one-dimensional spaces $M(p, r)$ correspond
bijectively to broken trajectories, as in the finite-dimensional
case. Both of these desired properties
hold, under the Morse-Smale conditions; but this is
a very special feature of the specific problem.
Without the hypothesis that $\pi_2(W)$ is zero, additional
noncompactness can arise from the following
“bubbling” phenomenon. There could be a
sequence of solutions $u^i \in M(p, q)$ to eqns [6], and
a point $(s_0, t_0)$ in $\mathbb{R} \times S^1$, such that for suitable
constants $\epsilon_i$ converging to zero, the rescaled
solutions


\[ \tilde{u}^i(\sigma, \tau) = u^i(s_0 + \epsilon_i \sigma,\ t_0 + \epsilon_i \tau) \]


converge on compact subsets of the plane $\mathbb{R}^2$ to a

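nonconstant pseudoholomorphic map $\tilde{u} : \mathbb{CP}^1 \to W$,
or more precisely a solution of the equation

\[ \frac{\partial \tilde{u}}{\partial \sigma} + J_{t_0}(\tilde{u})\, \frac{\partial \tilde{u}}{\partial \tau} = 0 \]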


(In the original coordinates, the derivatives of the $u^i$
would grow like $1/\epsilon_i$ near $(s_0, t_0)$.) A pseudoholomorphic
sphere always has nontrivial homology
class (and therefore nontrivial homotopy class); so
this sort of noncompactness does not occur when
$\pi_2(W) = 0$.

Granted the compactness results, the proof that
$\partial \circ \partial = 0$ runs as before, and we can construct a Floer
homology group,


\[ HF_* = \ker(\partial) / \mathrm{im}(\partial) \]
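
The passage from a complex to its homology can be illustrated with a toy computation over $\mathbb{F}_2$. The sketch below is not Floer's construction: it assumes the chain groups are finite dimensional and that the boundary maps are already given as 0/1 matrices (as they would be for a finite-dimensional Morse complex). The function names `rank_mod2` and `homology_dims` are hypothetical helpers introduced only for this illustration.

```python
def rank_mod2(rows):
    """Rank over F_2 of a 0/1 matrix given as a list of rows."""
    rows = [list(r) for r in rows]
    n_cols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(n_cols):
        # Find a pivot row at or below the current rank.
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        # Clear this column in every other row (addition mod 2 is XOR).
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank


def homology_dims(dims, boundaries):
    """Dimensions of H_j = ker(d_j) / im(d_{j+1}) over F_2.

    dims[j] is dim C_j; boundaries[j] is the matrix of d_j : C_j -> C_{j-1}
    (rows indexed by the basis of C_{j-1}). Missing entries mean d_j = 0.
    """
    ranks = {j: rank_mod2(m) for j, m in boundaries.items()}
    result = []
    for j, dim in enumerate(dims):
        kernel = dim - ranks.get(j, 0)
        image = ranks.get(j + 1, 0)
        result.append(kernel - image)
    return result


# Circle with two maxima and two minima: each maximum flows to both minima.
circle = homology_dims([2, 2], {1: [[1, 1], [1, 1]]})
# Torus with one minimum, two saddles, one maximum: all counts vanish mod 2.
torus = homology_dims([1, 2, 1], {})
```

In both examples the result agrees with the mod 2 Betti numbers of the underlying manifold, which is the finite-dimensional analogue of the statement that the homology does not depend on the auxiliary choices.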


Unlike the Morse homology of the energy functional,
the Floer homology does not yield the
ordinary homology of $B$. To compute it, one first
shows that it depends only on the symplectic
manifold $(W, \omega)$, not on the choice of Hamiltonian
$H_t$ or metrics $g_t$: this step is similar to the proof that
the finite-dimensional Morse homology $H_*(f)$ does
not depend on the Morse function. Once one has
established this independence, $HF_*$ can be computed
by examining a special case. Floer did this by taking
the Hamiltonian to be independent of $t$ and equal to
a small negative multiple $-\eta h$ of a fixed Morse
function $h : W \to \mathbb{R}$ on the symplectic manifold. If
the multiple $\eta \in \mathbb{R}$ is small enough, the only fixed
points of $\phi_1$ are the stationary points of the flow,
and these are exactly the critical points of $h$.
Furthermore the only index-1 solutions of eqn [6]
for small $\eta$ are the solutions $u(s, t)$ with no $t$
dependence; and these are the solutions of
$du/ds = -\eta \nabla h$, the downward gradient flow of $h$,
scaled by $\eta$. In this case therefore, the Floer complex
$CF_*$ is precisely the Morse complex $C_*(h)$ of the
Morse function $h$, and Theorem 1 yields:


Theorem 5 For a periodic, time-dependent Hamiltonian
$H_t$ on a closed symplectic manifold $(W, \omega)$
with $\pi_2(W) = 0$, the Floer homology $HF_*$ is isomorphic
to the ordinary homology of $W$ with $\mathbb{F}_2$
coefficients, $H_*(W; \mathbb{F}_2)$.


Because the generators of $CF_*$ correspond to fixed
points $p$ of $\phi_1$ such that the path $\phi_t(p)$ is null
homotopic, the number of these fixed points is not
less than the dimension of $HF_*$, and therefore not
less than $\sum_i \dim H_i(W; \mathbb{F}_2)$ because of the above
result. The sum of the mod 2 Betti numbers is at 
least as large as the sum of the ordinary Betti 
numbers (the dimensions of the rational homology 
groups); so one deduces, following Floer, 


Corollary 6 The Arnol'd conjecture (Conjecture 3)
holds for symplectic manifolds $(W, \omega)$ satisfying the
additional condition $\pi_2(W) = 0$.


Orientations can be introduced rather as in the 
case of finite-dimensional Morse theory, allowing 
one to define Floer groups with arbitrary 
coefficients. 

The Arnol'd conjecture is now known to hold in
complete generality, without the hypothesis on $\pi_2$.
The proof has been achieved by successive extensions
of the Floer homology technique. When $\pi_2(W)$
is nonzero, the space $B$ is not simply connected. The
first complication that arises is that the symplectic
action functional $f_0$, and therefore $f$ also, is multivalued.
This is not an obstacle initially, because $\nabla f$
is still well defined, and the spaces $M(p, q)$ of
gradient trajectories can still be assumed to satisfy
the Morse-Smale condition: this is the type of
Morse theory considered by Novikov, as mentioned
above. Because $\pi_1(B)$ is nontrivial, $M(p, q)$ is a union
of parts $M_z(p, q)$, one for each homotopy class $z$ of
paths from $p$ to $q$. For each homotopy class $z$, we
have the index $\mathrm{index}_z(p, q)$, which is the dimension
of $M_z(p, q)$.
The spaces $M_z(p, q)$ may now have additional
noncompactness, due to the presence of pseudoholomorphic
spheres $v : \mathbb{CP}^1 \to W$. The simplest
manifestation is when a sequence $u^i$ in $M_z(p, q)$
“bubbles off” a single such sphere at a point $(s_0, t_0)$,
and converges elsewhere to a smooth trajectory $u'$ in
$M_{z'}(p, q)$, belonging to a different homotopy class.
Let $\sigma$ be the homology class of the sphere $v$. Because
the sphere has positive area, the pairing of $\sigma$ with
the de Rham class $[\omega]$ is positive: $\langle [\omega], \sigma \rangle > 0$. The
indices are related by


\[ \mathrm{index}_{z'}(p, q) = \mathrm{index}_z(p, q) - 2 \langle c_1(W), \sigma \rangle \]


where $c_1(W) \in H^2(W; \mathbb{Z})$ is the first Chern class of a
compatible almost-complex structure. The symplectic
manifold is said to be “monotone” if, in real
cohomology, $c_1(W)$ is a positive multiple of $[\omega]$. In
the monotone case, we always have $\mathrm{index}_{z'}(p, q) <
\mathrm{index}_z(p, q)$, and no bubbling off can occur for
trajectory spaces $M_z(p, q)$ of index 2 or less: the
above formula either makes $M_{z'}(p, q)$ a space
of negative dimension (in which case it is empty)
or a zero-dimensional space (in which case one
has to exploit an additional transversality argument,
to show that the holomorphic spheres belonging
to classes $\sigma$ with $\langle c_1(W), \sigma \rangle = 1$ cannot intersect one
of the loops $u_p$ in $W$). Since the construction of
$HF_*$ involves only the trajectories of indices 1 and 2,
the construction goes through with minor changes.
Because $\mathrm{index}_z(p, q)$ depends on the path $z$,
the group $HF_*$ will no longer be $\mathbb{Z}$-graded: the
grading is defined only modulo $2d$, where $d$ is the
smallest nonzero value of $\langle c_1(W), \sigma \rangle$ for spherical
classes $\sigma$.

In the case that W is not monotone, additional 
techniques are needed to deal with the essential 
noncompactness of the trajectory spaces. These 
techniques involve (amongst other things) multi- 
valued perturbations on orbifolds — a strategy that 
requires the use of rational coefficients in order to 
perform the necessary averaging. For this reason, in
the non-monotone case, the Arnol'd conjecture is known
to hold only in its original form: with the ordinary 
(rational) Betti numbers. 

To address Question 4 for Lagrangian intersec- 
tions, a closely related Floer homology theory is 
used. Assume L is connected, and introduce the 
space of smooth paths joining $L$ to $L'$:


\[ \Omega(W; L, L') = \{\, u : [0,1] \to W \mid u(0) \in L,\ u(1) \in L' \,\} \]


Fix a point $x_0$ in $L$, and let $u_0$ be the path
$u_0(t) = \phi_t(x_0)$. Let $B$ be the connected component
of $\Omega(W; L, L')$ containing $u_0$. On $B$ we have a
symplectic action functional, defined as 


\[ f(u) = -\int_{[0,1] \times [0,1]} v^*(\omega) \]


where $v : [0,1] \times [0,1] \to W$ is a path in $B$ with
$v(0, t) = u_0(t)$ and $v(1, t) = u(t)$. The symplectic
action is single valued if $\pi_2(W, L)$ is trivial (even
though this condition does not guarantee that $B$ is
simply connected). The critical points of $f$ correspond
to constant paths whose image in $W$ is an
intersection point of $L$ and $L'$ (though not all such
constant paths belong to the connected component
$B$). If we fix a one-parameter family of compatible
metrics $g_t$ and almost-complex structures $J_t$ on $W$,
then we can consider the downward gradient 
trajectories of the functional. These are maps 


\[ u : \mathbb{R} \times [0,1] \to W \]


satisfying the Cauchy-Riemann equation


\[ \frac{\partial u}{\partial s} + J_t(u)\, \frac{\partial u}{\partial t} = 0 \]


with boundary conditions $u(s, 0) \in L$ and $u(s, 1) \in L'$.
With coefficients $\mathbb{F}_2$, a Morse complex can be
constructed much as in the case just considered. If
$\pi_2(W, L)$ is trivial, then the Floer homology group $HF_*$
obtained as the homology of this Morse complex is
isomorphic to $H_*(L; \mathbb{F}_2)$; and as a corollary, Question
4 has an affirmative answer in this case.

Without the hypothesis that $\pi_2(W, L)$ is trivial,
one does not expect an affirmative answer to
Question 4 in all cases. There is a “monotone”
case, in which $HF_*$ can always be defined; but it is
not always isomorphic to $H_*(L; \mathbb{F}_2)$: instead, there is
a spectral sequence relating the two. In the general
case, there is once again the need to use rational
coefficients in place of mod 2 coefficients, in order
to deal with the orbifold nature of the trajectory
spaces that appear. This raises the question of
orientability for the trajectory spaces. In contrast
to the Morse theory for Hamiltonian diffeomorphisms,
there is an obstruction to orientability,
involving spin structures on $L$ and $W$. Even when
the trajectory spaces are orientable, there are further
obstructions to the existence of a Morse differential
satisfying $\partial \circ \partial = 0$. The theory of these obstructions
is developed in Fukaya et al. (2000). There are still 
open questions in this area. 


Instanton Floer Homology 


A “Floer homology theory” for 3-manifolds should
assign to each 3-manifold $Y$ (satisfying perhaps some
additional topological requirements) a group, say
$HF(Y)$. Furthermore, given a four-dimensional
cobordism $W$ from $Y_1$ to $Y_2$, the theory should
provide a corresponding homomorphism of groups,
from $HF(Y_1)$ to $HF(Y_2)$. These homomorphisms
should satisfy the natural composition law for composite
cobordisms. One can formulate this by considering
the category in which an object is a closed, connected,
oriented 3-manifold $Y$, and in which the morphisms
from $Y_1$ to $Y_2$ are the oriented four-dimensional
cobordisms, considered up to diffeomorphism. A
Floer homology theory is then a functor from this
category (perhaps with some additional decorations or
restrictions) to the category of groups. Such a functor
was constructed by Floer (1988a), at least for the full
subcategory of homology 3-spheres (manifolds $Y$ with
$H_1(Y; \mathbb{Z}) = 0$). We outline the construction.

Let $P \to Y$ be a principal $SU(2)$ bundle (necessarily
trivial). Let $\mathcal{A}$ denote the space of $SU(2)$ connections
in the bundle $P$, and let $A_0$ be any chosen basepoint
in $\mathcal{A}$. Any other $A \in \mathcal{A}$ can be written as $A_0 + a$, for
some 1-form $a$ with values in the adjoint bundle
$\mathrm{ad}(P)$ whose fiber is the Lie algebra $\mathfrak{su}(2)$. So $\mathcal{A}$ is an
affine space,


\[ \mathcal{A} = A_0 + \Omega^1(Y; \mathrm{ad}(P)) \]


and we can identify the tangent space $T_A\mathcal{A}$ at any
$A$ with $\Omega^1(Y; \mathrm{ad}(P))$. The Chern-Simons functional
is a smooth function


\[ CS : \mathcal{A} \to \mathbb{R} \]


depending on our choice of a reference connection
$A_0$. It can be defined by stating that its derivative at
$A \in \mathcal{A}$ is the linear map $T_A\mathcal{A} \to \mathbb{R}$ given by


\[ a \mapsto -\int_Y \mathrm{tr}(a \wedge F_A) \]


where $F_A$ denotes the curvature of $A$, as an $\mathrm{ad}(P)$-valued
2-form on $Y$, and tr denotes the trace of a
matrix-valued 3-form. If we equip $Y$ with a
Riemannian metric, then we have the $L^2$ inner
product on $\Omega^1(Y; \mathrm{ad}(P))$, with respect to which we
can consider the gradient of $CS$. The formal downward
gradient-flow equation on $\mathcal{A}$ is then


\[ \frac{d}{ds} A = -{*}F_A \qquad [8] \]


where $*$ is the Hodge star on $Y$. If $A(s)$ is a solution
defined on an interval $[s_1, s_2]$, then we can form the
corresponding four-dimensional connection $\mathbf{A}$ on
$[s_1, s_2] \times Y$, and eqn [8] implies that $\mathbf{A}$ is a solution
of the anti-self-dual Yang-Mills equation, $F^{+}_{\mathbf{A}} = 0$.
Here $F^{+}_{\mathbf{A}}$ is the self-dual part of the curvature 2-form
on the cylinder. The critical points of $CS$ are the flat
connections on $Y$, with $F_A = 0$.


Let $\mathcal{G}$ denote the gauge group, by which we mean
the group of automorphisms of $P$. When a trivialization
of $P$ is chosen, $\mathcal{G}$ becomes the group of smooth
maps $g : Y \to SU(2)$. A connection $A \in \mathcal{A}$ is irreducible
if its stabilizer in $\mathcal{G}$ consists only of the constant gauge
transformations $\pm 1$. The functional $CS$ is invariant
only under the identity component of $\mathcal{G}$: it descends to
a function $CS : \mathcal{A}/\mathcal{G} \to \mathbb{R}/(4\pi^2\mathbb{Z})$. If we choose a
basepoint in $Y$, then the gauge-equivalence classes of
flat connections in $\mathcal{A}$ are in one-to-one correspondence
with conjugacy classes of representations,


\[ \rho : \pi_1(Y) \to SU(2) \]


Given representations $\rho$ and $\sigma$, we write $M(\rho, \sigma)$ for
the quotient by $\mathcal{G}$ of the space of trajectories $A(s)$
which satisfy the gradient-flow equation [8] and
which are asymptotic to flat connections belonging
to the classes $\rho$ and $\sigma$ as $s \to -\infty$ and $s \to +\infty$,
respectively. There is a purely
four-dimensional interpretation of $M(\rho, \sigma)$: it can be
identified with the moduli space of solutions $\mathbf{A}$ to
the anti-self-dual Yang-Mills equation, or “instantons,”
on $\mathbb{R} \times Y$, satisfying the same asymptotic
conditions.

One defines the “instanton Floer homology” of $Y$,
roughly speaking, as the Morse homology arising
from the functional $CS$. In the case that $Y$ is a
homology 3-sphere, Floer defined $I_*(Y)$ as the
homology $H_*(C, \partial)$ of a complex $C$ whose generators
correspond to the irreducible representations $\rho$, and
whose differential $\partial$ is defined in terms of the
one-dimensional components of the moduli spaces
$M(\rho, \sigma)$. To carry out the construction of $I_*(Y)$, it
is necessary to perturb the functional $CS$ to achieve
a Morse-Smale condition: this is done by adding a
function $f : \mathcal{A} \to \mathbb{R}$ defined in terms of the holonomy
of connections along families of loops in $Y$. The
group $\mathcal{G}$ is not connected, and for given $\rho$ and $\sigma$, the
moduli space $M(\rho, \sigma)$ has components differing in
dimension by multiples of 8. For this reason, $I_*(Y)$ is
a $\mathbb{Z}/8$-graded homology theory. It is a topological
invariant of $Y$, and is functorial for cobordisms, in
the manner outlined at the beginning of this section.

Various extensions have been made, to allow the
definition of $I_*(Y)$ for 3-manifolds with nontrivial $H_1$,
and to incorporate the reducible representations.
Although there have been some successes (Donaldson 
2002), a completely satisfactory general theory has not 
been constructed. The main difficulties stem from the 
noncompactness of the instanton moduli spaces (a 
bubbling phenomenon) and the interaction of this 
bubbling with the reducible solutions. 

The instanton Floer theory for 3-manifolds is 
closely tied up with Donaldson's polynomial invari- 
ants of closed 4-manifolds, which are also defined 
using the anti-self-dual Yang-Mills equations. 
Seiberg-Witten Floer Homology


Seiberg-Witten Floer homology can be defined in a
manner very similar to the instanton case. Again, we
start with a Riemannian 3-manifold $Y$, equipped
now with a spin$^c$ structure $\mathfrak{s}$: a rank-2 Hermitian
vector bundle $S \to Y$ together with a Clifford multiplication
$\rho : \Lambda^1(Y) \to \mathrm{End}(S)$. The configuration
space $\mathcal{C}$ is defined as the space of pairs $(A, \Phi)$,
where $A$ is a spin$^c$ connection and $\Phi$ is a section of $S$.
In place of the Chern-Simons functional considered
above, we have the Chern-Simons-Dirac functional


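$CSD : \mathcal{C} \to \mathbb{R}$ defined by

\[ CSD(A, \Phi) = \tfrac{1}{4}\, CS(\mathrm{tr}(A)) + \tfrac{1}{2} \int_Y \langle \Phi, D_A \Phi \rangle \, d\mu \]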


where $\mathrm{tr}(A)$ denotes the connection induced by $A$ on
the line bundle $\Lambda^2 S$ and $D_A$ is the Dirac operator for
the connection $A$. The functional is invariant again
under the identity component of the gauge group $\mathcal{G}$,
which this time is the group of maps $g : Y \to S^1$,
acting as automorphisms of $S$. The critical points are
the solutions $(A, \Phi)$ to the three-dimensional
“Seiberg-Witten equations,”


\[ \tfrac{1}{2}\, \rho(F_{\mathrm{tr}(A)}) - (\Phi \Phi^*)_0 = 0 \]
\[ D_A \Phi = 0 \]


in which the subscript 0 denotes the traceless part of
the endomorphism. If $\alpha$ and $\beta$ are gauge-equivalence
classes of critical points, then we write $M(\alpha, \beta)$ for
the quotient by $\mathcal{G}$ of the space of gradient trajectories
from $\alpha$ to $\beta$.

As in the instanton case, $M(\alpha, \beta)$ has a four-dimensional
interpretation: it is the quotient by the
four-dimensional gauge group of a space of solutions
$(\mathbf{A}, \Phi)$ on $\mathbb{R} \times Y$ to the four-dimensional
Seiberg-Witten equations:


\[ \tfrac{1}{2}\, \rho(F^{+}_{\mathrm{tr}(\mathbf{A})}) - (\Phi \Phi^*)_0 = 0 \]
\[ D^{+}_{\mathbf{A}} \Phi = 0 \]


Here $\Phi$ is a section of the summand $S^{+}$ of the four-dimensional
spin$^c$ bundle $S = S^{+} \oplus S^{-}$, and
$D^{+}_{\mathbf{A}} : \Gamma(S^{+}) \to \Gamma(S^{-})$ is the four-dimensional Dirac operator.

The action of the gauge group on $\mathcal{C}$ is free except
at configurations with $\Phi = 0$. These reducible configurations
have an $S^1$ stabilizer. Reducible critical
points of $CSD$ correspond to flat connections in the
line bundle $\Lambda^2 S$. We can now distinguish two cases,
according to whether $c_1(S)$ is a torsion class or not.

If $c_1(S)$ is not a torsion class, then there are no flat
connections in $\Lambda^2 S$, so all critical points are
irreducible. In this case, there is a straightforward
Floer-type Morse theory for the functional $CSD$ on
the space $\mathcal{C}/\mathcal{G}$: for generators of our complex we
take the gauge-equivalence classes of critical points,
and we use the one-dimensional trajectory spaces
$M(\alpha, \beta)$ to define the boundary map. The resulting
Morse homology group is denoted $HM_*(Y, \mathfrak{s})$. It has
a canonical $\mathbb{Z}/2$-grading, and is a topological
invariant of $Y$ and its spin$^c$ structure.

If c1(S) is torsion, the theory is more complex. 
There will be reducible critical points, and one 
cannot exclude these from the Morse complex and 
still obtain a topological invariant of Y. One may 
incorporate the reducible critical points in two 
different ways, that are in a sense dual to one 
another; and there is a third homology theory that 
one can define, using the reducibles alone. Thus, one 
can construct three Floer groups associated to Y 
with the spin$^c$ structure $\mathfrak{s}$. The resulting theory
closely resembles the Heegaard Floer homology that 
is described next. 


Heegaard Floer Homology and Other 
Floer Theories 


Heegaard Floer homology is a Floer homology
theory for 3-manifolds that is formally similar to
Seiberg-Witten Floer homology, and conjecturally
isomorphic to it. Unlike the instanton and Seiberg-Witten
theories, its construction, due to Ozsváth
and Szabó, does not use gauge theory. Instead, one
begins with a decomposition of the 3-manifold into
two handlebodies with common boundary $\Sigma$, and
one studies a symplectic manifold $s^g\Sigma$, the configuration
space of $g$-tuples of points on $\Sigma$, where
$g$ denotes the genus. The Heegaard Floer groups are
then defined by a variant of the construction used
for Lagrangian intersections (see the section “Morse
theory and the Arnol'd conjecture”), applied to a
particular pair of Lagrangian tori in $s^g\Sigma$.

As in the case of Seiberg-Witten theory, Heegaard
Floer homology assigns to each oriented 3-manifold
$Y$ three different Floer groups, $HF^{+}(Y)$, $HF^{-}(Y)$, and
$HF^{\infty}(Y)$, related by a long exact sequence:


\[ \cdots \to HF^{+}(Y) \to HF^{-}(Y) \to HF^{\infty}(Y) \to HF^{+}(Y) \to \cdots \]


The first two groups are dual, in that there is
a nondegenerate pairing between $HF^{+}(Y)$ and
$HF^{-}(-Y)$, where $-Y$ denotes the same 3-manifold
with opposite orientation. If $W$ is an oriented four-dimensional
cobordism from $Y_1$ to $Y_2$, then there
are associated functorial maps


\[ F^{+}(W) : HF^{+}(Y_1) \to HF^{+}(Y_2) \]
\[ F^{-}(W) : HF^{-}(Y_1) \to HF^{-}(Y_2) \]
\[ F^{\infty}(W) : HF^{\infty}(Y_1) \to HF^{\infty}(Y_2) \]


In addition, if the intersection form of W is not 
negative semidefinite, there is a map 


\[ F(W) : HF^{-}(Y_1) \to HF^{+}(Y_2) \]


As a special case, one can start with a closed
4-manifold $X$, and consider the cobordism $W$ from
$S^3$ to $S^3$ obtained from $X$ by removing two 4-balls.
In this case, the map


\[ F(W) : HF^{-}(S^3) \to HF^{+}(S^3) \]


encodes a diffeomorphism invariant of the original 
4-manifold X. This invariant is conjectured to be 
equivalent to the Seiberg-Witten invariants of X. 

Heegaard Floer homology, and its cousin Seiberg- 
Witten Floer homology, have been applied success- 
fully to settle long-standing problems in topology, 
particularly questions related to surgery on knots. 
An example of such an application is the theorem of 
Kronheimer et al. that one cannot obtain the
projective space $\mathbb{RP}^3$ by surgery on a nontrivial
knot in the 3-sphere. 

In these and other applications of both Heegaard 
and Seiberg-Witten Floer homology, two key proper- 
ties of the homology groups play an important part. 
The first is a nonvanishing theorem, which shows, for 
example, that these Floer groups can distinguish $S^1 \times S^2$
from any other manifold with the same homology.
The second is a long exact sequence, which relates the 
Floer groups of the manifolds obtained by three 
different surgeries on a knot. The latter property is 
shared by the instanton Floer groups, as was shown by 
Floer (Braam and Donaldson 1995). 

Other Floer-type theories have been considered, 
not all of which arise from a gradient flow, but in 
which the boundary map of the complex is obtained 
by counting solutions to a geometric differential 
equation. At the time of writing, Floer homology is 
an area of very active development. 


See also: Four-Manifold Invariants and Physics; Gauge 
The objective of this article is to give an overview
of some advanced numerical methods commonly
used in fluid mechanics. The focus is set primarily
on finite-element methods and finite-volume
methods.


Fluid Mechanics Models 


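Let $\Omega$ be a domain in $\mathbb{R}^d$ ($d = 2, 3$) with boundary $\partial\Omega$
and outer unit normal $n$. $\Omega$ is assumed to be
occupied by a fluid. The basic equations governing
fluid flows are derived from three conservation
principles: conservation of mass, momentum, and
energy. Denoting the density by $\rho$, the velocity by $u$,
and the mass specific internal energy by $e_i$, these
equations are


\[ \partial_t \rho + \nabla \cdot (\rho u) = 0 \qquad [1] \]


\[ \partial_t(\rho u) + \nabla \cdot (\rho u \otimes u) = \nabla \cdot \sigma + \rho f \qquad [2] \]


\[ \partial_t(\rho e_i) + \nabla \cdot (\rho u e_i) = \sigma : \varepsilon + q_T - \nabla \cdot j_T \qquad [3] \]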


where $\sigma$ is the stress tensor, $\varepsilon = (1/2)(\nabla u + (\nabla u)^T)$ is
the strain tensor, $f$ is a body force per unit mass
(gravity is a typical example), $q_T$ is a volume source
(it may model chemical reactions, Joule effects,
radioactive decay, etc.), and $j_T$ is the heat flux. In
addition to the above three fundamental conservation
equations, one may also have to add $L$
equations that account for the conservation of


other quantities, say $\phi_\ell$, $1 \le \ell \le L$. These quantities
may, for example, be the concentration of constituents
in an alloy, the turbulent kinetic energy, the
mass fractions of various chemical species by unit
volume, etc. All these conservation equations take
the following form:


\[ \partial_t(\rho \phi_\ell) + \nabla \cdot (\rho u \phi_\ell) = q_{\phi_\ell} - \nabla \cdot j_{\phi_\ell}, \quad 1 \le \ell \le L \qquad [4] \]


Henceforth, the index $\ell$ is dropped to alleviate the
notation.

The above set of equations must be supplemented
with initial and boundary conditions. Typical initial
conditions are $\rho|_{t=0} = \rho_0$, $u|_{t=0} = u_0$, and $\phi|_{t=0} = \phi_0$.
Boundary conditions are usually classified into
two types: the essential boundary conditions and
the natural boundary conditions. Natural conditions
impose fluxes at the boundary. Typical examples are


\[ (\sigma \cdot n + R u)|_{\partial\Omega} = a_u, \quad (j_T \cdot n + r_T e_i)|_{\partial\Omega} = a_T, \quad \text{and} \quad (j_\phi \cdot n + r_\phi \phi)|_{\partial\Omega} = a_\phi \]


The quantities $R$, $r_T$, $r_\phi$, $a_u$, $a_T$, $a_\phi$ are given. Essential
boundary conditions consist of enforcing boundary
values on the dependent variables. One typical
example is the so-called no-slip boundary condition:
$u|_{\partial\Omega} = 0$.

The above system of conservation laws is closed
by adding three constitutive equations whose purpose
is to relate each field $\sigma$, $j_T$, and $j_\phi$ to the fields
$\rho$, $u$, and $\phi$. They account for microscopic properties
of the fluid and thus must be frame-independent.
Depending on the constitutive equations and
adequate hypotheses on time and space scales,
various models are obtained. An important class of
fluid model is one for which the stress tensor is a
linear function of the strain tensor, yielding the so-called
Newtonian fluid model:


\[ \sigma = (-p + \lambda \nabla \cdot u) I + 2\mu \varepsilon \qquad [5] \]


Here $p$ is the pressure, $I$ is the identity matrix, and $\lambda$
and $\mu$ are viscosity coefficients. Still assuming
linearity, common models for heat and solute fluxes
consist of assuming


\[ j_T = -\kappa \nabla T, \qquad j_\phi = -D \nabla \phi \qquad [6] \]


where T is the temperature. These are the so-called 
Fourier’s law and Fick’s law, respectively. 
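
As a small illustration of the Newtonian law [5], the sketch below evaluates the stress tensor at a single point from a given velocity gradient. It assumes the gradient is supplied as a $d \times d$ array whose trace is $\nabla \cdot u$; the function name `newtonian_stress` and its argument names are hypothetical and serve only this example.

```python
import numpy as np


def newtonian_stress(grad_u, p, lam, mu):
    """Stress sigma = (-p + lam * div u) I + 2 mu eps at one point.

    grad_u is the d x d velocity-gradient matrix; its trace is div u.
    """
    grad_u = np.asarray(grad_u, dtype=float)
    eps = 0.5 * (grad_u + grad_u.T)      # strain tensor (symmetric part)
    div_u = np.trace(grad_u)             # divergence of the velocity
    identity = np.eye(grad_u.shape[0])
    return (-p + lam * div_u) * identity + 2.0 * mu * eps


# Simple shear: only one off-diagonal entry of the gradient is nonzero,
# so div u = 0 and the viscous part is a pure symmetric shear stress.
sigma_shear = newtonian_stress([[0.0, 1.0], [0.0, 0.0]], p=2.0, lam=0.5, mu=1.0)
```

For the simple shear above the divergence vanishes, so the coefficient $\lambda$ plays no role, which matches the incompressible simplification made later in the text.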

Having introduced two new quantities, namely 
the pressure p and the temperature T, two new 
scalar relations are needed to close the system. These 
are the state equations. One admissible assumption
consists of setting $\rho = \rho(p, T)$. Another usual additional
hypothesis consists of assuming that the
variations in the internal energy are proportional
to those in the temperature, that is, $\partial e_i = c_P\, \partial T$.

Let us now simplify the above models by
assuming that $\rho$ is constant. Then, mass conservation
implies that the flow is incompressible, that is,
$\nabla \cdot u = 0$. Let us further assume that neither $\lambda$, $\mu$,
nor $p$ depend on $e_i$. Then, upon abusing the
notation and still denoting by $p$ the ratio $p/\rho$, the
above set of assumptions yields the so-called
incompressible Navier-Stokes equations:


\[ \nabla \cdot u = 0 \qquad [7] \]
\[ \partial_t u + u \cdot \nabla u - \nu \Delta u + \nabla p = f \qquad [8] \]


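As a result, the mass and momentum conservation
equations are independent of that of the energy and
those of the solutes:


\[ \rho c_P (\partial_t T + u \cdot \nabla T) - \nabla \cdot (\kappa \nabla T) = 2\mu\, \varepsilon : \varepsilon + q_T \qquad [9] \]


\[ \partial_t \phi + u \cdot \nabla \phi - \frac{1}{\rho} \nabla \cdot (D \nabla \phi) = \frac{1}{\rho}\, q_\phi \qquad [10] \]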


Another model allowing for a weak dependency of
$\rho$ on the temperature, while still enforcing incompressibility,
consists of setting $\rho = \rho_0(1 - \beta(T - T_0))$.
If buoyancy effects induced by gravity are important,
it is then possible to account for them by setting
$f = \rho_0 g (1 - \beta(T - T_0))$, where $g$ is the gravitational
acceleration, yielding the so-called Boussinesq model.

Variations on these themes are numerous and a 
wide range of fluids can be modeled by using 
nonlinear constitutive laws and nonlinear state 
laws. For the purpose of numerical simulations, 


however, it is important to focus on simplified 
models. 


The Building Blocks 


From the above considerations we now extract a 
small set of elementary problems which constitute 
the building blocks of most numerical methods in 
fluid mechanics. 


Elliptic Equations 


By taking the divergence of the momentum equation
[8] and assuming $u$ to be known and renaming $p$ to
$\phi$, one obtains the Poisson equation


\[ -\Delta \phi = f \qquad [11] \]


where f is a given source term. This equation plays a 
key role in the computation of the pressure when 
solving the Navier-Stokes equations; see [54b]. 
Assuming that adequate boundary conditions are 
enforced, this model equation is the prototype for 
the class of the so-called elliptic equations. A simple 
generalization of the Poisson equation consists of the 
advection-diffusion equation 


\[ u \cdot \nabla \phi - \nabla \cdot (\kappa \nabla \phi) = f \qquad [12] \]


where $\kappa > 0$. Admissible boundary conditions are
$(\kappa \partial_n \phi + r \phi)|_{\partial\Omega} = a$, $r \ge 0$, or $\phi|_{\partial\Omega} = a$. This type of
equation is obtained by neglecting the time derivative
in the heat equation [9] or in the solute
conservation equation [10]. Mathematically speaking,
[12] is also elliptic since its properties (in
particular, the way the boundary conditions must
be enforced) are controlled by the second-order
derivatives. For the sake of simplicity, assume that
$u = 0$ in the above equation and that the boundary
condition is $\phi|_{\partial\Omega} = 0$, then it is possible to show that
$\phi$ solves [12] if and only if $\phi$ minimizes the
functional


\[ J(\psi) = \int_\Omega \left( \tfrac{1}{2} \kappa |\nabla \psi|^2 - f \psi \right) dx \]

where $|\cdot|$ is the Euclidean norm and $\psi$ spans

\[ H = \left\{ \psi;\ \int_\Omega |\nabla \psi|^2\, dx < \infty;\ \psi|_{\partial\Omega} = 0 \right\} \qquad [13] \]


Writing the first-order optimality condition for this
optimization problem yields


\[ \int_\Omega \kappa \nabla \phi \cdot \nabla \psi = \int_\Omega f \psi \]


for all $\psi \in H$. This is the so-called variational
formulation of [12]. When $u$ is not zero, no
variational principle holds but a similar way to
reformulate [12] consists of multiplying the equation
by arbitrary functions in $H$ and integrating by parts
the second-order term to give


\[ \int_\Omega \left( (u \cdot \nabla \phi) \psi + \kappa \nabla \phi \cdot \nabla \psi \right) = \int_\Omega f \psi, \quad \forall \psi \in H \qquad [14] \]


This is the so-called weak formulation of [12]. Weak 
and variational formulations are the starting point 
for finite-element approximations. 
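
To make the step from the variational formulation to a discrete problem concrete, here is a minimal one-dimensional sketch. It assumes $\Omega = (0, 1)$, $u = 0$, $\kappa = 1$, homogeneous Dirichlet conditions, a uniform mesh, and continuous piecewise-linear test and trial functions; the load integral is approximated by a lumped (nodal) quadrature. The function name `solve_poisson_p1` is a hypothetical helper, not something taken from the article.

```python
import numpy as np


def solve_poisson_p1(n_cells, f=lambda x: np.ones_like(x)):
    """Piecewise-linear Galerkin solution of -phi'' = f on (0, 1), phi(0)=phi(1)=0.

    The weak form  int phi' psi' = int f psi  is tested against the hat
    functions attached to the interior nodes of a uniform mesh.
    """
    h = 1.0 / n_cells
    x = np.linspace(0.0, 1.0, n_cells + 1)
    n = n_cells - 1                       # number of interior nodes
    # Stiffness matrix: int (hat_i)' (hat_j)' dx is tridiagonal (-1, 2, -1) / h.
    stiffness = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h
    # Load vector with nodal quadrature: int f hat_i dx ~ h * f(x_i).
    load = h * f(x[1:-1])
    phi = np.zeros(n_cells + 1)           # boundary values stay zero
    phi[1:-1] = np.linalg.solve(stiffness, load)
    return x, phi


x_nodes, phi_h = solve_poisson_p1(16)
exact = 0.5 * x_nodes * (1.0 - x_nodes)   # solution of -phi'' = 1
```

For the constant source used here the nodal values coincide with the exact solution $x(1 - x)/2$ up to rounding, a special feature of the one-dimensional problem rather than a general property.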


Stokes Equations 


Another elementary building block is deduced from 
[8] by assuming that the time derivative and the 
nonlinear term are both small. The corresponding 
model is the so-called Stokes equations, 


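\[ -\nu \Delta u + \nabla p = f \qquad [15] \]


\[ \nabla \cdot u = 0 \qquad [16] \]


Assume for the sake of simplicity that the no-slip
boundary condition is enforced: $u|_{\partial\Omega} = 0$. Introduce
the Lagrangian functional


\[ \mathcal{L}(v, q) = \int_\Omega \left( \tfrac{1}{2} \nu \nabla v : \nabla v - q \nabla \cdot v - f \cdot v \right) dx \]


Set


\[ X = \left\{ v;\ \int_\Omega |\nabla v|^2\, dx < \infty;\ v|_{\partial\Omega} = 0 \right\} \]


\[ M = \left\{ q;\ \int_\Omega q^2\, dx < \infty \right\} \]


Then, the pair $(u, p) \in X \times M$ solves the Stokes
equations if and only if it is a saddle point of $\mathcal{L}$, that is,


\[ \mathcal{L}(u, q) \le \mathcal{L}(u, p) \le \mathcal{L}(v, p), \quad \forall (v, q) \in X \times M \qquad [17] \]


In other words, the pressure $p$ is the Lagrange multiplier
of the incompressibility constraint $\nabla \cdot u = 0$.
Realizing this fact helps to understand the nature of
the Stokes equations, especially when it comes to
constructing discrete approximations. A variational
formulation of the Stokes equations is obtained by
writing the first-order optimality condition, namely:


\[ \int_\Omega \left( \nu \nabla u : \nabla v - p \nabla \cdot v - f \cdot v \right) dx = 0, \quad \forall v \in X \]


\[ \int_\Omega q\, \nabla \cdot u\, dx = 0, \quad \forall q \in M \]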


When the nonlinear term is not zero in the
momentum equation, or when this term is linearized,
there is no saddle point, but a weak formulation
is obtained by multiplying the momentum
equation by arbitrary functions $v$ in $X$ and integrating
by parts the Laplacian, and by multiplying the
mass equation by arbitrary functions $q$ in $M$:


\[ \int_\Omega \left( (u \cdot \nabla u) \cdot v + \nu \nabla u : \nabla v - p \nabla \cdot v \right) dx = \int_\Omega f \cdot v \qquad [18] \]


\[ \int_\Omega q\, \nabla \cdot u = 0 \qquad [19] \]


Parabolic Equations 


The class of elliptic equations generalizes to that of 
the parabolic equations when time is accounted for: 


\[ \partial_t \phi + u \cdot \nabla \phi - \nabla \cdot (\kappa \nabla \phi) = f, \qquad \phi|_{t=0} = \phi_0 \qquad [20] \]


Fundamentally, this equation has many similarities 
with the elliptic equation 


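\[ \alpha \phi + u \cdot \nabla \phi - \nabla \cdot (\kappa \nabla \phi) = f \qquad [21] \]


where $\alpha > 0$. In particular, the set of boundary
conditions that are admissible for [20] and [21] are
identical, that is, it is legitimate to enforce
$(\kappa \partial_n \phi + r \phi)|_{\partial\Omega} = a$, $r \ge 0$, or $\phi|_{\partial\Omega} = a$. Moreover, solving [21]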
is always a building block of any algorithm solving 
[20]. The important fact to remember here is that if 
a good approximation technique for solving [21] is 
at hand, then extending it to solve [20] is usually 
straightforward. 
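
One common way to see why [21] is a building block for [20] is the implicit Euler scheme: discretizing the time derivative as $(\phi^{n+1} - \phi^n)/\Delta t$ turns each time step into a problem of type [21] with $\alpha = 1/\Delta t$ and source $f + \phi^n/\Delta t$. The sketch below assumes a one-dimensional interval, $u = 0$, $f = 0$, constant $\kappa$, homogeneous Dirichlet conditions, and a centered finite-difference discretization in space; the names `elliptic_solve` and `heat_implicit_euler` are hypothetical.

```python
import numpy as np


def elliptic_solve(alpha, kappa, rhs, h):
    """Solve alpha*phi - kappa*phi'' = rhs at interior nodes (phi = 0 at both ends)."""
    n = rhs.size
    off_diagonal = np.eye(n, k=1) + np.eye(n, k=-1)
    matrix = (alpha + 2.0 * kappa / h**2) * np.eye(n) - (kappa / h**2) * off_diagonal
    return np.linalg.solve(matrix, rhs)


def heat_implicit_euler(phi0, kappa, dt, n_steps, h):
    """Advance d_t phi - kappa*phi'' = 0 by repeated elliptic solves."""
    phi = phi0.copy()
    for _ in range(n_steps):
        # One time step is a problem of type [21] with alpha = 1/dt.
        phi = elliptic_solve(1.0 / dt, kappa, phi / dt, h)
    return phi


n_cells = 50
h = 1.0 / n_cells
x_in = np.linspace(0.0, 1.0, n_cells + 1)[1:-1]      # interior nodes
phi_T = heat_implicit_euler(np.sin(np.pi * x_in), 1.0, 1e-3, 100, h)
phi_exact = np.exp(-np.pi**2 * 0.1) * np.sin(np.pi * x_in)
```

With these parameters the computed profile at $t = 0.1$ stays close to the exact decaying sine mode; the remaining discrepancy is dominated by the first-order time discretization.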


Hyperbolic Equations 


When $\kappa/(UL) \to 0$, where $U$ is the reference velocity
scale and $L$ is the reference length scale, [20]
degenerates into the so-called transport equation


\[ \partial_t \phi + u \cdot \nabla \phi = f \qquad [22] \]


This is the prototypical example for the class of
hyperbolic equations. For this equation to be well-posed,
it is necessary to enforce an initial condition
$\phi|_{t=0} = \phi_0$ and an inflow boundary condition, that
is, $\phi|_{\partial\Omega^-} = a$, where $\partial\Omega^- = \{x \in \partial\Omega;\ (u \cdot n)(x) < 0\}$ is
the so-called inflow boundary of the domain. To
better understand the nature of this equation,
introduce the characteristic lines $X(x, s; t)$ of $u(x, t)$
defined as follows:


\[ d_t X(x, s; t) = u(X(x, s; t), t), \qquad X(x, s; s) = x \qquad [23] \]


If u is continuous with respect to t and Lipschitz 
with respect to x, this ordinary differential equation 
has a unique solution. Furthermore, [22] becomes 


\[ d_t [\phi(X(x, s; t), t)] = f(X(x, s; t), t) \qquad [24] \]

Then


\[ \phi(x, t) = \phi_0(X(x, t; 0)) + \int_0^t f(X(x, t; \tau), \tau)\, d\tau \]


provided $X(x, t; \tau) \in \Omega$ for all $\tau \in [0, t]$. This shows
that the concept of characteristic curves is important 
to construct an approximation to [22]. 
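
The representation formula along characteristics can be turned into a small numerical procedure. The sketch below works in one space dimension and assumes the characteristic through $(x, t)$ never leaves the domain, so no inflow data are needed. It traces the characteristic backward from time $t$ to time 0 with a classical fourth-order Runge-Kutta scheme and accumulates the source with the trapezoidal rule. The function name `solve_by_characteristics` and the choice of integrators are assumptions made for this illustration.

```python
import math


def solve_by_characteristics(x, t, u, f, phi0, n_steps=200):
    """Evaluate phi(x,t) = phi0(X(x,t;0)) + int_0^t f(X(x,t;tau), tau) dtau.

    u(X, tau) and f(X, tau) are scalar callables; the characteristic
    dX/dtau = u(X, tau) is integrated backward from tau = t to tau = 0.
    """
    dtau = t / n_steps
    X, tau, integral = x, t, 0.0
    for _ in range(n_steps):
        # One RK4 step with negative step size -dtau.
        k1 = u(X, tau)
        k2 = u(X - 0.5 * dtau * k1, tau - 0.5 * dtau)
        k3 = u(X - 0.5 * dtau * k2, tau - 0.5 * dtau)
        k4 = u(X - dtau * k3, tau - dtau)
        X_new = X - dtau * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        # Trapezoidal rule for the source integral along the characteristic.
        integral += 0.5 * dtau * (f(X, tau) + f(X_new, tau - dtau))
        X, tau = X_new, tau - dtau
    return phi0(X) + integral


# Constant velocity 2 and no source: the initial profile is simply shifted.
shifted = solve_by_characteristics(1.0, 0.25, lambda X, s: 2.0, lambda X, s: 0.0, math.sin)
# Velocity u(x) = x with unit source: X(x, t; 0) = x * exp(-t).
stretched = solve_by_characteristics(1.0, 0.5, lambda X, s: X, lambda X, s: 1.0, math.sin)
```

In the first example the foot of the characteristic is $x - 2t$, and in the second it is $x e^{-t}$, so both results can be compared with closed-form values.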


Meshes 


The starting point of every approximation technique 
for solving any of the above model problems consists 
of defining a mesh of $\Omega$ on which the approximate
solution is defined. To avoid having to account for
curved boundaries, let us assume that the domain $\Omega$ is
a two-dimensional polygon (resp. three-dimensional
polyhedron). A mesh of $\Omega$, say $\mathcal{T}_h$, is a partition of $\Omega$
into small cells, hereafter assumed to be simple 
convex polygons in two dimensions (resp. polyhe- 
drons in three dimensions), say triangles or quad- 
rangles (resp. tetrahedrons or cuboids). Moreover, 
this partition is usually assumed to be such that if 
two different cells have a nonempty intersection, then 
the intersection is a vertex, or an entire edge, or an 
entire face. The left panel of Figure 1 shows a mesh 
satisfying the above requirement. The mesh in the 
right panel is not admissible. 


Finite Elements: Interpolation 


The finite-element method is foremost an interpola- 
tion technique. The goal of this section is to 
illustrate this idea by giving examples. 

Let $\mathcal{T}_h = \{K_m\}_{1 \le m \le N_{el}}$ be a mesh composed of $N_{el}$
simplices, that is, triangles in two dimensions or
tetrahedrons in three dimensions. Consider the
following vector spaces of functions:


\[ V_h = \{ v_h \in C^0(\overline{\Omega});\ v_h|_{K_m} \in P_k,\ 1 \le m \le N_{el} \} \qquad [25] \]


where $P_k$ denotes the space of polynomials of global
degree at most $k$. $V_h$ is called a finite-element
approximation space. We now construct a basis for $V_h$.

Given a simplex $K_m$ in $\mathbb{R}^d$, let $v_n$ be a vertex of
$K_m$, let $F_n$ be the face of $K_m$ opposite to $v_n$, and
define $n_n$ to be the outward normal to $F_n$, $1 \le n \le d+1$.
Define the barycentric coordinates

\[ \lambda_n(x) = 1 - \frac{(x - v_n)\cdot n_n}{(v_l - v_n)\cdot n_n}, \qquad 1 \le n \le d+1 \] [26]

where $v_l$ is an arbitrary vertex in $F_n$ (the definition
of $\lambda_n$ is clearly independent of $v_l$ provided $v_l$ belongs
to $F_n$). The barycentric coordinate $\lambda_n$ is an affine
function; it is equal to 1 at $v_n$ and vanishes on $F_n$; its
level sets are hyperplanes parallel to $F_n$. The
barycenter of $K_m$ has barycentric coordinates

\[ \left( \frac{1}{d+1}, \ldots, \frac{1}{d+1} \right) \]

Figure 1 Admissible (left) and nonadmissible (right) meshes.


The barycentric coordinates satisfy the following
properties: for all $x \in K_m$, $0 \le \lambda_n(x) \le 1$, and for all
$x \in \mathbb{R}^d$,

\[ \sum_{n=1}^{d+1} \lambda_n(x) = 1 \quad \text{and} \quad \sum_{n=1}^{d+1} \lambda_n(x)\,(x - v_n) = 0 \]
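
As a quick numerical check of [26] and of the two identities above, the following sketch evaluates the barycentric coordinates of a point in a triangle. It is restricted to two dimensions for brevity, and the function name `barycentric_2d` and the sample data are illustrative assumptions rather than part of the original text.

```python
def barycentric_2d(tri, x):
    """Barycentric coordinates of x in the triangle tri = [v_1, v_2, v_3], using [26]."""
    lams = []
    for n in range(3):
        vn = tri[n]
        # a and b are the two vertices of the face F_n opposite to v_n
        a, b = tri[(n + 1) % 3], tri[(n + 2) % 3]
        # A normal to F_n; its sign and length cancel in the ratio of [26]
        nx, ny = b[1] - a[1], -(b[0] - a[0])
        num = (x[0] - vn[0]) * nx + (x[1] - vn[1]) * ny
        den = (a[0] - vn[0]) * nx + (a[1] - vn[1]) * ny  # v_l = a, any vertex of F_n
        lams.append(1.0 - num / den)
    return lams


triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
point = (0.2, 0.3)
lam = barycentric_2d(triangle, point)
```

For this reference triangle the coordinates of the point (0.2, 0.3) are (0.5, 0.2, 0.3); they sum to one and the weighted sum of the vectors $x - v_n$ vanishes.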


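Consider the set of nodes $\{a_{n,m}\}_{1 \le n \le n_{sh}}$ of $K_m$ with
barycentric coordinates

\[ \left( \frac{i_1}{k}, \ldots, \frac{i_{d+1}}{k} \right), \qquad 0 \le i_1, \ldots, i_{d+1} \le k, \quad i_1 + \cdots + i_{d+1} = k \]

These points are called the Lagrange nodes of $K_m$. It
is clear that there are $n_{sh} = \tfrac{1}{2}(k+1)(k+2)$
of these points in two dimensions and $n_{sh} = \tfrac{1}{6}(k+1)(k+2)(k+3)$
in three dimensions. It is remarkable that $n_{sh} = \dim P_k$.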
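
The node count can be checked by direct enumeration. The sketch below lists the barycentric coordinates of the Lagrange nodes for given $k$ and $d$ and compares their number with $\dim P_k = \binom{k+d}{d}$; the helper names `lagrange_nodes` and `dim_Pk` are hypothetical.

```python
from itertools import product
from math import comb


def lagrange_nodes(k, d):
    """Barycentric coordinates (i_1/k, ..., i_{d+1}/k) with i_1 + ... + i_{d+1} = k."""
    return [tuple(i / k for i in idx)
            for idx in product(range(k + 1), repeat=d + 1)
            if sum(idx) == k]


def dim_Pk(k, d):
    """Dimension of the polynomials of global degree at most k in d variables."""
    return comb(k + d, d)


nodes_p2_triangle = lagrange_nodes(2, 2)  # three vertices and three midedge nodes
```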

Let $\{b_1, \ldots, b_N\} = \bigcup_{K_m \in \mathcal{T}_h} \{a_{1,m}, \ldots, a_{n_{sh},m}\}$ be the
set of all the Lagrange nodes in the mesh. For $K_m \in \mathcal{T}_h$
and $n \in \{1, \ldots, n_{sh}\}$, let $j(n,m) \in \{1, \ldots, N\}$ be the
integer such that $a_{n,m} = b_{j(n,m)}$; $j(n,m)$ is the global
index of the Lagrange node $a_{n,m}$. Let $\{\varphi_1, \ldots, \varphi_N\}$ be
the set of functions in $V_h$ defined by $\varphi_i(b_j) = \delta_{ij}$; then it
can be shown that

\[ \{\varphi_1, \ldots, \varphi_N\} \text{ is a basis for } V_h \] [27]


The functions $\varphi_i$ are called global shape functions.
An important property of global shape functions is
that their supports are small sets of cells. More
precisely, let $i \in \{1, \ldots, N\}$ and let
$\mathcal{V}_i = \{m;\ \exists n;\ i = j(n,m)\}$ be the set of cell indices to which the
node $b_i$ belongs; then the support of $\varphi_i$ is $\bigcup_{m \in \mathcal{V}_i} K_m$.
For $k = 1$, it is clear that $\varphi_i|_{K_m} = \lambda_n$ for all $m \in \mathcal{V}_i$
and all $n$ such that $i = j(n,m)$, and $\varphi_i|_{K_m} = 0$
otherwise. The graph of such a shape function in
two dimensions is shown in the left panel of Figure 2.
For $k = 2$, enumerate from 1 to $d+1$ the vertices of
$K_m$, and enumerate from $d+2$ to $n_{sh}$ the Lagrange
nodes located at the midedges. For a midedge node
of index $d+2 \le n \le n_{sh}$, let $b(n), e(n) \in \{1, \ldots, d+1\}$
be the two indices of the two Lagrange
nodes at the extremities of the edge in question. Then,
the restriction to $K_m$ of a $P_2$ shape function $\varphi_i$ is

\[ \varphi_i|_{K_m} = \begin{cases} \lambda_n(2\lambda_n - 1), & \text{if } 1 \le n \le d+1 \\ 4\lambda_{b(n)}\lambda_{e(n)}, & \text{if } d+2 \le n \le n_{sh} \end{cases} \] [28]

Figure 2 shows the graph of two $P_2$ shape functions
in two dimensions.

Figure 2 Two-dimensional Lagrange shape functions: piecewise $P_1$ (left) and piecewise $P_2$ (center and right).

Once the space $V_h$ is introduced, it is natural to
define the interpolation operator

\[ \Pi_h : C^0(\overline{\Omega}) \ni v \longmapsto \sum_{i=1}^{N} v(b_i)\,\varphi_i \in V_h \] [29]

This operator is such that for all continuous
functions $v$, the restriction of $\Pi_h(v)$ to each mesh
cell is a polynomial in $P_k$ and $\Pi_h(v)$ takes the same
values as $v$ at the Lagrange nodes. Moreover, setting
$h = \max_{K_m \in \mathcal{T}_h} \mathrm{diam}(K_m)$, and defining

\[ \|v\|_{L^p} = \left( \int_\Omega |v|^p \, dx \right)^{1/p} \quad \text{for } 1 \le p < \infty \]


the following approximation holds:


\[ \|v - \Pi_h(v)\|_{L^p} + h\,\|\nabla(v - \Pi_h(v))\|_{L^p} \le c\,h^{k+1}\,\|v\|_{C^{k+1}(\overline{\Omega})} \] [30]


where $c$ is a constant that depends on the quality of
the mesh. More precisely, for $K_m \in \mathcal{T}_h$, let $\rho_{K_m}$ be
the diameter of the largest ball that can be inscribed
into $K_m$ and let $h_{K_m}$ be the diameter of $K_m$. Then, $c$
depends on $\sigma = \max_{K_m \in \mathcal{T}_h} h_{K_m}/\rho_{K_m}$. Hence, for the
mesh to have good interpolation properties, it is
recommended that the cells be not too flat. Families
of meshes for which $\sigma$ is bounded uniformly with
respect to $h$ as $h \to 0$ are said to be shape-regular
families.
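
The rate $h^{k+1}$ can be observed in the simplest setting, $k = 1$ on a uniform mesh of $[0,1]$. The sketch below is only an illustration under these assumptions: it measures the error in the maximum norm by sampling (the estimate in the text is stated in $L^p$), and the name `p1_interp_error` is hypothetical.

```python
import math


def p1_interp_error(v, n_cells, samples_per_cell=20):
    """Sampled max-norm error between v and its piecewise-linear Lagrange
    interpolant on a uniform mesh of [0, 1] with n_cells cells."""
    h = 1.0 / n_cells
    err = 0.0
    for m in range(n_cells):
        x0 = m * h
        v0, v1 = v(x0), v(x0 + h)
        for s in range(samples_per_cell + 1):
            lam = s / samples_per_cell  # barycentric coordinate of the right node
            x = x0 + lam * h
            err = max(err, abs(v(x) - ((1.0 - lam) * v0 + lam * v1)))
    return err


def smooth(x):
    return math.sin(math.pi * x)


# Halving h should divide the error by about 2**(k+1) = 4 for k = 1.
ratio = p1_interp_error(smooth, 16) / p1_interp_error(smooth, 32)
```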

The above example of finite-element approximation
space generalizes easily to meshes composed of
quadrangles or cuboids. In this case, the shape
functions are piecewise polynomials of partial
degree at most $k$. These spaces are usually referred
to as $Q_k$ approximation spaces.


Finite Elements: Approximation 


We show in this section how finite-element approx- 
imation spaces can be used to approximate some 
model problems exhibited in the section “Building 
blocks." 


Advection-Diffusion 


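Consider the model problem [21] supplemented
with the boundary condition $(\kappa\,\partial_n\phi + r\phi)|_{\partial\Omega} = g$.
Assume $\kappa > 0$, $\alpha + \tfrac{1}{2}\nabla\cdot\beta > 0$, and $r > 0$. Define

\[ a(\phi,\psi) = \int_\Omega \big( (\alpha\phi + \beta\cdot\nabla\phi)\,\psi + \kappa\,\nabla\phi\cdot\nabla\psi \big)\,dx + \int_{\partial\Omega} r\,\phi\,\psi\,ds \]

Then, the weak formulation of [21] is: seek $\phi \in H$
($H$ defined in [13]) such that for all $\psi \in H$

\[ a(\phi,\psi) = \int_\Omega f\,\psi\,dx + \int_{\partial\Omega} g\,\psi\,ds \] [31]

Using the approximation space $V_h$ defined in [25]
together with the basis defined in [27], we seek an
approximate solution to the above problem in the
form $\phi_h = \sum_{j=1}^{N} U_j\varphi_j \in V_h$. Then, a simple way of
approximating [31] consists of seeking
$U = (U_1, \ldots, U_N)^T \in \mathbb{R}^N$ such that for all $1 \le i \le N$

\[ a(\phi_h,\varphi_i) = \int_\Omega f\,\varphi_i\,dx + \int_{\partial\Omega} g\,\varphi_i\,ds \] [32]

This problem finally amounts to solving the following
linear system:

\[ AU = F \] [33]

where $A_{ij} = a(\varphi_j,\varphi_i)$ and

\[ F_i = \int_\Omega f\,\varphi_i\,dx + \int_{\partial\Omega} g\,\varphi_i\,ds \]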


The above approximation technique is usually 
referred to as the Galerkin method. The following 
error estimate can be proved: 


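\[ \|\phi - \phi_h\|_{L^2} + h\,\|\nabla(\phi - \phi_h)\|_{L^2} \le c\,h^{k+1}\,\|\phi\|_{C^{k+1}(\overline{\Omega})} \] [34]


where, in addition to depending on the shape
regularity of the mesh, the constant $c$ also depends
on $\kappa$, $\alpha$, and $\beta$.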
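
The assembly of [33] can be sketched on a one-dimensional instance of the model problem. The example below assumes $\Omega = (0,1)$, $\kappa = \alpha = 1$, $\beta = 0$ and $r = g = 0$ (a simplification chosen here so that the exact solution $\cos(\pi x)$ is available), uses $P_1$ elements on a uniform mesh, and approximates the load vector with Simpson's rule. All function names are hypothetical.

```python
import math


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[i-1] = A[i][i-1], upper[i] = A[i][i+1]."""
    b, d = diag[:], rhs[:]
    n = len(b)
    for i in range(1, n):
        w = lower[i - 1] / b[i - 1]
        b[i] -= w * upper[i - 1]
        d[i] -= w * d[i - 1]
    x = [0.0] * n
    x[-1] = d[-1] / b[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (d[i] - upper[i] * x[i + 1]) / b[i]
    return x


def galerkin_p1(f, n_cells):
    """Assemble and solve A U = F for a(phi, psi) = int(phi psi + phi' psi') on (0, 1)."""
    h = 1.0 / n_cells
    n = n_cells + 1
    diag, off, load = [0.0] * n, [0.0] * (n - 1), [0.0] * n
    for m in range(n_cells):
        x0, x1 = m * h, (m + 1) * h
        xm = 0.5 * (x0 + x1)
        # Element stiffness (1/h)[[1,-1],[-1,1]] plus element mass (h/6)[[2,1],[1,2]]
        diag[m] += 1.0 / h + h / 3.0
        diag[m + 1] += 1.0 / h + h / 3.0
        off[m] += -1.0 / h + h / 6.0
        # Simpson's rule for the integral of f times each local shape function
        load[m] += h / 6.0 * (f(x0) + 2.0 * f(xm))
        load[m + 1] += h / 6.0 * (2.0 * f(xm) + f(x1))
    return solve_tridiagonal(off, diag, off, load)


def max_nodal_error(n_cells):
    """Largest nodal error against the exact solution cos(pi x)."""
    def exact(x):
        return math.cos(math.pi * x)

    def f(x):
        return (math.pi ** 2 + 1.0) * exact(x)

    U = galerkin_p1(f, n_cells)
    h = 1.0 / n_cells
    return max(abs(U[i] - exact(i * h)) for i in range(n_cells + 1))
```

Comparing `max_nodal_error(16)` with `max_nodal_error(32)` shows the error dropping by a factor close to four, which is the second-order behaviour expected for $k = 1$.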


Stokes Equations 


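The line of thought developed above can be used to
approximate the Navier-Stokes problem [15]-[16].
Let us assume that the nonlinear term $u\cdot\nabla u$ is
linearized in the form $v\cdot\nabla u$, where $v$ is known. Let
$\mathcal{T}_h$ be a mesh of $\Omega$, and assume that finite-element
approximation spaces have been constructed to
approximate the velocity and the pressure, say $X_h$
and $M_h$. Assume for the sake of simplicity that $X_h \subset X$
and $M_h \subset M$. Assume that bases for $X_h$ and $M_h$
are at hand, say $\{\varphi_1, \ldots, \varphi_{N_u}\}$ and $\{\psi_1, \ldots, \psi_{N_p}\}$,
respectively. Set

\[ a(u,\varphi) = \int_\Omega \big( (v\cdot\nabla u)\cdot\varphi + \nu\,\nabla u : \nabla\varphi \big)\,dx \]

and

\[ b(v,\psi) = -\int_\Omega \psi\,\nabla\cdot v\,dx \]

Then, we seek an approximate velocity $u_h = \sum_{j=1}^{N_u} U_j\varphi_j$
and an approximate pressure $p_h = \sum_{k=1}^{N_p} P_k\psi_k$
such that for all $i \in \{1, \ldots, N_u\}$ and all
$k \in \{1, \ldots, N_p\}$ the following holds:

\[ a(u_h,\varphi_i) + b(\varphi_i,p_h) = \int_\Omega f\cdot\varphi_i\,dx \] [35]

\[ b(u_h,\psi_k) = 0 \] [36]

Define the matrix $A \in \mathbb{R}^{N_u \times N_u}$ such that
$A_{ij} = a(\varphi_j,\varphi_i)$. Define the matrix $B \in \mathbb{R}^{N_p \times N_u}$ such
that $B_{ki} = b(\varphi_i,\psi_k)$. Then, the above problem can be
recast into the following partitioned linear system:

\[ \begin{bmatrix} A & B^T \\ B & 0 \end{bmatrix} \begin{bmatrix} U \\ P \end{bmatrix} = \begin{bmatrix} F \\ 0 \end{bmatrix} \] [37]

where the vector $F \in \mathbb{R}^{N_u}$ is such that $F_i = \int_\Omega f\cdot\varphi_i\,dx$.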


An important aspect of the above approximation
technique is that, for the linear system to be
invertible, the matrix $B^T$ must have full column rank
(i.e., $B$ has full row rank). This amounts to

\[ \exists\,\beta_h > 0, \qquad \inf_{q_h \in M_h}\ \sup_{v_h \in X_h} \frac{\int_\Omega q_h\,\nabla\cdot v_h\,dx}{\|v_h\|_X\,\|q_h\|_M} \ge \beta_h \] [38]

where

\[ \|v_h\|_X^2 = \int_\Omega |\nabla v_h|^2\,dx, \qquad \|q_h\|_M^2 = \int_\Omega q_h^2\,dx \]

Figure 3 The $P_1/P_1$ finite element: the mesh (left); one
pressure spurious mode (right).

This nontrivial condition is called the Ladyženskaja-Babuška-Brezzi
condition (LBB) in the literature.
For instance, if $P_1$ finite elements are used to approximate
both the velocity and the pressure, the above
condition does not hold, since there are nonzero
pressure fields $q_h$ in $M_h$ such that $\int_\Omega q_h\,\nabla\cdot v_h\,dx = 0$
for all $v_h$ in $X_h$. Such fields are called spurious
pressure modes. An example is shown in Figure 3.
The spurious function alternately takes the values
$-1$, $0$, and $+1$ at the vertices of the mesh so that its
mean value on each cell is zero.
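
The algebraic content of the rank condition can be illustrated on a toy system with the block structure of [37]. In the sketch below the matrices are small hand-picked examples, not finite-element matrices, and the helpers `rank` and `saddle_point_matrix` are hypothetical; the point is only that a vector $q \ne 0$ with $B^T q = 0$ (the algebraic counterpart of a spurious pressure mode) makes the partitioned matrix singular.

```python
def rank(M, tol=1e-12):
    """Rank by Gaussian elimination with partial pivoting."""
    A = [row[:] for row in M]
    rows, cols = len(A), len(A[0])
    r = 0
    for c in range(cols):
        pivot = max(range(r, rows), key=lambda i: abs(A[i][c]))
        if abs(A[pivot][c]) < tol:
            continue
        A[r], A[pivot] = A[pivot], A[r]
        for i in range(r + 1, rows):
            factor = A[i][c] / A[r][c]
            for j in range(c, cols):
                A[i][j] -= factor * A[r][j]
        r += 1
        if r == rows:
            break
    return r


def saddle_point_matrix(A, B):
    """Assemble [[A, B^T], [B, 0]] from A (nu x nu) and B (np x nu)."""
    nu, n_p = len(A), len(B)
    top = [A[i] + [B[k][i] for k in range(n_p)] for i in range(nu)]
    bottom = [B[k] + [0.0] * n_p for k in range(n_p)]
    return top + bottom


A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
B_full_row_rank = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
B_deficient = [[1.0, 0.0, 0.0], [2.0, 0.0, 0.0]]  # B^T q = 0 for q = (2, -1)

rank_ok = rank(saddle_point_matrix(A, B_full_row_rank))
rank_bad = rank(saddle_point_matrix(A, B_deficient))
```

With the first choice of $B$ the 5-by-5 system has full rank; with the second it loses one rank, so the pressure is determined only up to the spurious vector.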

Couples of finite-element spaces satisfying the
LBB condition are numerous. For instance, assuming
$k \ge 2$, using $P_k$ finite elements to approximate the
velocity and $P_{k-1}$ finite elements to approximate the
pressure is acceptable. Likewise, using $Q_k$ elements
for the velocity and $Q_{k-1}$ elements for the pressure
on meshes composed of quadrangles or cuboids is
admissible.

Approximation techniques for which the pressure
and the velocity degrees of freedom are not
associated with the same nodes are usually called
staggered approximations. Staggering pressure and
velocity unknowns is common in solution methods
for the incompressible Stokes and Navier-Stokes
equations; see also the subsection “Stokes
equations.”


Finite Volumes: Principles 


The finite-volume method is an approximation
technique whose primary goal is to approximate
conservation equations, whether time dependent or
not. Given a mesh, say $\mathcal{T}_h = \{K_m\}_{1 \le m \le N_{el}}$, and a
conservation equation

\[ \alpha\,\partial_t\phi + \nabla\cdot F(\phi,\nabla\phi,x,t) = f \] [39]

($\alpha = 0$ if the problem is time independent and $\alpha = 1$
otherwise), the main idea underlying every finite-volume
method is to represent the approximate
solution by its mean values over the mesh cells
$(\phi_{K_1}, \ldots, \phi_{K_{N_{el}}})^T \in \mathbb{R}^{N_{el}}$ and to test the conservation
equation by the characteristic functions of the mesh
cells $\{1_{K_1}, \ldots, 1_{K_{N_{el}}}\}$. For each cell $K_m \in \mathcal{T}_h$, denote by
$n_{K_m}$ the outward unit normal vector and denote by $\mathcal{F}_m$
the set of the faces of $K_m$. The finite-volume approximation
to [39] consists of seeking $(\phi_{K_1}, \ldots, \phi_{K_{N_{el}}})^T \in \mathbb{R}^{N_{el}}$
such that the function $\phi_h = \sum_{m=1}^{N_{el}} \phi_{K_m} 1_{K_m}$
satisfies the following: for all $1 \le m \le N_{el}$

\[ |K_m|\,\alpha\,d_t\phi_{K_m}(t) + \sum_{\sigma \in \mathcal{F}_m} F_h^{m,\sigma}(\phi_h,\nabla_h\phi_h,t) = \int_{K_m} f\,dx \] [40]

where

\[ |K_m| = \int_{K_m} dx \]

$\nabla_h\phi_h$ is an approximation of $\nabla\phi$, and $F_h^{m,\sigma}$ is an
approximation of

\[ \int_\sigma F(\phi,\nabla\phi,x,t)\cdot n_{K_m}\,d\sigma \]


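The precise definition of the so-called approximate
flux $F_h^{m,\sigma}$ depends on the nature of the problem
(e.g., elliptic, parabolic, hyperbolic, saddle point)
and the desired accuracy. In general, the approximate
fluxes are required to satisfy the following
two important properties:


1. Conservativity: for $K_m, K_l \in \mathcal{T}_h$ such that
$\sigma = K_m \cap K_l$, $F_h^{m,\sigma} = -F_h^{l,\sigma}$.


2. Consistency: let $\psi$ be the solution to [39], and set

\[ \psi_h = \frac{1}{|K_1|}\int_{K_1}\psi\,dx\;1_{K_1} + \cdots + \frac{1}{|K_{N_{el}}|}\int_{K_{N_{el}}}\psi\,dx\;1_{K_{N_{el}}} \]

then

\[ F_h^{m,\sigma}(\psi_h,\nabla_h\psi_h,t) \to \int_\sigma F(\psi,\nabla\psi,x,t)\cdot n_{K_m}\,d\sigma \quad \text{as } h \to 0 \]

The quantity

\[ F_h^{m,\sigma}(\psi_h,\nabla_h\psi_h,t) - \int_\sigma F(\psi,\nabla\psi,x,t)\cdot n_{K_m}\,d\sigma \]

is called the consistency error.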


Note that [40] is a system of ordinary differential 
equations. This system is usually discretized in time 
by using standard time-marching techniques such as 
explicit Euler, Runge-Kutta, etc. 


The discretization technique described above is 
sometimes referred to as cell-centered finite-volume 
method. Another method, called vertex-centered 
finite volume method, consists of using the char- 
acteristic functions associated with the vertices of 
the mesh instead of those associated with the cells. 


Finite Volumes: Examples 


In this section we illustrate the ideas introduced
above. Three examples are developed: the Poisson 
equation, the transport equation, and the Stokes 
equations. 


Poisson Problem 


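Consider the Poisson equation [11] equipped with
the boundary condition $\partial_n\phi|_{\partial\Omega} = a$. To avoid technical
details, assume that $\Omega = [0,1]^d$. Let $\mathcal{T}_h$ be a mesh
of $\Omega$ composed of rectangles (or cuboids in three
dimensions).

The flux function is $F(\phi,\nabla\phi,x) = -\nabla\phi$; hence,
$F_h^{m,\sigma}$ must be a consistent conservative approximation
of $-\int_\sigma n_{K_m}\cdot\nabla\phi\,d\sigma$. Let $\sigma$ be an interior face of
the mesh and let $K_m$, $K_l$ be the two cells such that
$\sigma = K_m \cap K_l$. Let $x_{K_m}$, $x_{K_l}$ be the barycenters of $K_m$
and $K_l$, respectively. Then, an admissible formula
for the approximate flux is

\[ F_h^{m,\sigma} = -\frac{|\sigma|}{|x_{K_m} - x_{K_l}|}\,(\phi_{K_l} - \phi_{K_m}) \] [41]

where $|\sigma| = \int_\sigma d\sigma$. The consistency error is $O(h)$ in
general, and is $O(h^2)$ if the mesh is composed of
identical cuboids. The conservativity is evident. If $\sigma$
is part of $\partial\Omega$, an admissible formula for the
approximate flux is $F_h^{m,\sigma} = -\int_\sigma a\,d\sigma$. Then, upon
defining $\mathcal{F}^i_{K_m} = \mathcal{F}_m \setminus \partial\Omega$ and $\mathcal{F}^\partial_{K_m} = \mathcal{F}_m \cap \partial\Omega$, the
finite-volume approximation of the Poisson problem
is: seek $\phi_h \in \mathbb{R}^{N_{el}}$ such that for all $1 \le m \le N_{el}$

\[ \sum_{\sigma \in \mathcal{F}^i_{K_m}} F_h^{m,\sigma} = \int_{K_m} f\,dx + \sum_{\sigma \in \mathcal{F}^\partial_{K_m}} \int_\sigma a\,d\sigma \] [42]

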
Transport Equation 


Consider the transport equation

\[ \partial_t\phi + \nabla\cdot(u\phi) = f \] [43]

\[ \phi|_{t=0} = \phi_0, \qquad \phi|_{\partial\Omega^-} = a \] [44]

where $u(x,t)$ is a given field in $C^1(\overline{\Omega}\times[0,T])$. Let $\mathcal{T}_h$
be a mesh of $\Omega$. For the sake of simplicity, let us use
the explicit Euler time-stepping to approximate [40].
Let $N$ be a positive integer, set $\Delta t = T/N$, set $t^n = n\Delta t$
for $0 \le n \le N$, and partition $[0,T]$ as follows:

\[ [0,T] = \bigcup_{n=0}^{N-1} [t^n, t^{n+1}] \]

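Denote by $\phi_h^n \in \mathbb{R}^{N_{el}}$ the finite-volume approximation
of $\phi_h(t^n)$. Then, [40] is approximated as
follows:

\[ \frac{|K_m|}{\Delta t}\,(\phi_{K_m}^{n+1} - \phi_{K_m}^{n}) + \sum_{\sigma \in \mathcal{F}_m} F_h^{m,\sigma}(\phi_h^n,\nabla_h\phi_h^n,t^n) = \int_{K_m} f(x,t^n)\,dx \] [45]

where $\phi_{K_m}^0 = \frac{1}{|K_m|}\int_{K_m}\phi_0\,dx$. The approximate flux $F_h^{m,\sigma}$
must be a consistent conservative approximation of
$\int_\sigma (u\cdot n_{K_m})\,\phi\,d\sigma$. Let $\sigma$ be a face of the mesh and let
$K_m$, $K_l$ be the two cells such that $\sigma = K_m \cap K_l$ (note
that if $\sigma$ is on $\partial\Omega$, $\sigma$ belongs to one cell only and we
set $K_m = K_l$). If $\sigma$ is on $\partial\Omega^-$, set

\[ F_h^{m,\sigma} = \int_\sigma (u\cdot n_{K_m})\,a\,d\sigma \] [46]

If $\sigma$ is not on $\partial\Omega^-$, set $u^n_{m,\sigma} = \int_\sigma (u\cdot n_{K_m})\,d\sigma$ and
define

\[ F_h^{m,\sigma} = \begin{cases} \phi^n_{K_m}\,u^n_{m,\sigma} & \text{if } u^n_{m,\sigma} \ge 0 \\ \phi^n_{K_l}\,u^n_{m,\sigma} & \text{if } u^n_{m,\sigma} < 0 \end{cases} \] [47]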

The above choice for the approximate flux is usually
called the upwind flux. It is consistent with the analysis
that has been done for [22], that is, information flows
along the characteristic lines of the field $u$; see [24]. In
other words, the updating of $\phi_{K_m}^{n+1}$ must be done by
using the approximate values $\phi_{K_l}^n$ coming from the cells
that are upstream of the flow field.

An important feature of the above approximation
technique is that it is $L^\infty$-stable, in the sense that

\[ \max_{0 \le n \le N,\ 1 \le m \le N_{el}} |\phi_{K_m}^n| \le c(\phi_0, f) \]

if the two mesh parameters $\Delta t$ and $h$ satisfy
the so-called Courant-Friedrichs-Lewy (CFL)
condition $\|u\|_{L^\infty}\Delta t/h \le c(\sigma)$, where $c(\sigma)$ is a constant
that depends on the mesh regularity parameter
$\sigma = \max_{K_m \in \mathcal{T}_h} h_{K_m}/\rho_{K_m}$. In one dimension, $c(\sigma) = 1$.
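
The role of the CFL condition is easy to observe in one dimension. The sketch below applies the explicit Euler update [45] with the upwind flux [47] on a uniform mesh; it assumes a constant velocity, $f = 0$, and periodic boundary conditions (a simplification that replaces the inflow condition of [44]), and the function names are hypothetical.

```python
def upwind_step(phi, u, dt, h):
    """One explicit Euler step with the upwind flux on a periodic 1D mesh."""
    n = len(phi)
    new = [0.0] * n
    for m in range(n):
        right = (m + 1) % n
        left = m - 1  # index -1 wraps to the last cell (periodicity)
        # Right face: normal +1, so u_{m,sigma} = u; left face: normal -1, so u_{m,sigma} = -u
        flux_right = u * (phi[m] if u >= 0 else phi[right])
        flux_left = -u * (phi[left] if u >= 0 else phi[m])
        new[m] = phi[m] - dt / h * (flux_right + flux_left)
    return new


def advect(phi0, u, dt, h, n_steps):
    """Apply n_steps upwind steps starting from the cell averages phi0."""
    phi = phi0[:]
    for _ in range(n_steps):
        phi = upwind_step(phi, u, dt, h)
    return phi


n_cells = 40
h = 1.0 / n_cells
phi0 = [1.0 if 10 <= m < 20 else 0.0 for m in range(n_cells)]
stable = advect(phi0, u=1.0, dt=0.8 * h, h=h, n_steps=50)    # |u| dt / h = 0.8
unstable = advect(phi0, u=1.0, dt=1.5 * h, h=h, n_steps=1)   # |u| dt / h = 1.5
```

For $|u|\Delta t/h \le 1$ each new cell value is a convex combination of old values, so the initial bounds are preserved and the total mass is conserved; for $|u|\Delta t/h = 1.5$ an overshoot appears after a single step.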


Stokes Equations 


To finish this short review of finite-volume methods,
we turn our attention to the Stokes problem [15]-[16]
equipped with the homogeneous Dirichlet boundary
condition $u|_{\partial\Omega} = 0$.


Let $\mathcal{T}_h$ be a mesh of $\Omega$ composed of triangles (or
tetrahedrons). All the angles in the triangulation are
assumed to be acute so that, for all $K \in \mathcal{T}_h$, the
intersection of the orthogonal bisectors of the sides
of $K$, say $x_K$, is in $K$. We propose a finite-volume
approximation for the velocity and a finite-element
approximation for the pressure. Let $\{e_1, \ldots, e_d\}$ be a
Cartesian basis for $\mathbb{R}^d$. Set $1^k_{K_m} = 1_{K_m} e_k$ for all
$1 \le m \le N_{el}$ and $1 \le k \le d$; then define

\[ X_h = \mathrm{span}\{1^1_{K_1}, \ldots, 1^d_{K_{N_{el}}}\} \]

Let $\{b_1, \ldots, b_{N_v}\}$ be the vertices of the mesh, and let
$\{\varphi_1, \ldots, \varphi_{N_v}\}$ be the associated piecewise linear
global shape functions. Then, set (see the section
“Finite elements: interpolation”)

\[ N_h = \mathrm{span}\{\varphi_1, \ldots, \varphi_{N_v}\}, \qquad M_h = \Big\{ q \in N_h;\ \int_\Omega q\,dx = 0 \Big\} \]


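The approximate problem consists of seeking
$(u_{K_1}, \ldots, u_{K_{N_{el}}}) \in \mathbb{R}^{dN_{el}}$ and $p_h \in M_h$ such that for
all $1 \le m \le N_{el}$, $1 \le k \le d$, and all $1 \le i \le N_v$

\[ \sum_{\sigma \in \mathcal{F}_m} 1^k_{K_m}\cdot F_h^{m,\sigma} + c(1^k_{K_m},p_h) = \int_{K_m} 1^k_{K_m}\cdot f\,dx \] [48]

\[ c(u_h,\varphi_i) = 0 \] [49]

where

\[ c(v_h,p_h) = \int_\Omega v_h\cdot\nabla p_h\,dx \]

Moreover,

\[ F_h^{m,\sigma} = \begin{cases} -\nu\,|\sigma|\,\dfrac{u_{K_l} - u_{K_m}}{|x_{K_m} - x_{K_l}|} & \text{if } \sigma = K_m \cap K_l \\[2ex] \nu\,|\sigma|\,\dfrac{u_{K_m}}{d(x_{K_m},\sigma)} & \text{if } \sigma = K_m \cap \partial\Omega \end{cases} \]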


where $d(x_{K_m},\sigma)$ is the Euclidean distance between
$x_{K_m}$ and $\sigma$. This formulation yields a linear system
with the same structure as in [37]. Note in particular
that

\[ \sup_{v_h \in X_h} \frac{c(v_h,p_h)}{\|v_h\|_{L^\infty}} = \|\nabla p_h\|_{L^1} \] [50]

Since the mean value of $p_h$ is zero, $\|\nabla p_h\|_{L^1}$ is a norm
on $M_h$. As a result, an inequality similar to [38] holds.
This inequality is a key step to proving that the linear
system is well posed and the approximate solution
converges to the exact solution of [15]-[16].


Projection Methods for Navier-Stokes 


In this section we focus on the time approximation 
of the Navier-Stokes problem: 


\[ \partial_t u - \nu\Delta u + u\cdot\nabla u + \nabla p = f \] [51a]
\[ \nabla\cdot u = 0 \] [51b]
\[ u|_{\partial\Omega} = 0 \] [51c]
\[ u|_{t=0} = u_0 \] [51d]


where $f$ is a body force and $u_0$ is a solenoidal
velocity field. There are numerous ways to discretize 
this problem in time, but, undoubtedly, one of the 
most popular strategies is to use projection methods, 
sometimes also referred to as Chorin-Temam 
methods. 

A projection method is a fractional-step time- 
marching technique. It is a predictor-corrector 
strategy aiming at uncoupling viscous diffusion and 
incompressibility effects. One time step is composed 
of three substeps: in the first substep, the pressure is 
made explicit and a provisional velocity field is 
computed using the momentum equation; in the 
second substep, the provisional velocity field is 
projected onto the space of incompressible (solenoi- 
dal) vector fields; in the third substep, the pressure is 
updated. 

Let $q > 0$ be an integer and approximate the time
derivative of $u$ using a backward difference formula of
order $q$. To this end, introduce a positive integer $N$, set
$\Delta t = T/N$, set $t^n = n\Delta t$ for $0 \le n \le N$, and consider a
partitioning of the time interval in the form

\[ [0,T] = \bigcup_{n=0}^{N-1} [t^n, t^{n+1}] \]

For all sequences $v_{\Delta t} = (v^0, v^1, \ldots, v^N)$, set

\[ D^{(q)}v^{n+1} = \beta_q v^{n+1} - \sum_{j=0}^{q-1} \beta_j v^{n-j} \] [52]

where $q - 1 \le n \le N - 1$. The coefficients $\beta_j$ are
such that

\[ \frac{1}{\Delta t}\Big( \beta_q u(t^{n+1}) - \sum_{j=0}^{q-1} \beta_j u(t^{n-j}) \Big) \]

is a $q$th-order backward difference formula approximating
$\partial_t u(t^{n+1})$. For instance,

\[ D^{(1)}v^{n+1} = v^{n+1} - v^n, \qquad D^{(2)}v^{n+1} = \tfrac{3}{2}v^{n+1} - 2v^n + \tfrac{1}{2}v^{n-1} \]


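Furthermore, for all sequences $\phi_{\Delta t} = (\phi^0, \phi^1, \ldots, \phi^N)$,
define

\[ \phi^{\star,n+1} = \sum_{j=0}^{q-1} \gamma_j \phi^{n-j} \] [53]

so that $\sum_{j=0}^{q-1} \gamma_j \phi(t^{n-j})$ is a $(q-1)$th-order extrapolation
of $\phi(t^{n+1})$. For instance, $p^{\star,n+1} = 0$ for
$q = 1$, $p^{\star,n+1} = p^n$ for $q = 2$, and $p^{\star,n+1} = 2p^n - p^{n-1}$
for $q = 3$. Finally, denote by $(u\cdot\nabla u)^{\star,n+1}$ a $q$th-order
extrapolation of $(u\cdot\nabla u)(t^{n+1})$. For instance,

\[ (u\cdot\nabla u)^{\star,n+1} = \begin{cases} u^n\cdot\nabla u^n & \text{for } q = 1 \\ 2u^n\cdot\nabla u^n - u^{n-1}\cdot\nabla u^{n-1} & \text{for } q = 2 \end{cases} \]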
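
The orders of accuracy quoted for the backward difference formulas and for the extrapolations can be checked numerically on a smooth scalar function. The sketch below covers only the cases written out explicitly in the text ($q = 1, 2$ for the difference formula; one-term and two-term extrapolations); the function names and the test function $\sin t$ are illustrative assumptions.

```python
import math


def bdf_derivative(v, t, dt, q):
    """(1/dt) D^(q) v evaluated at time t, for q = 1 or q = 2."""
    if q == 1:
        return (v(t) - v(t - dt)) / dt
    if q == 2:
        return (1.5 * v(t) - 2.0 * v(t - dt) + 0.5 * v(t - 2.0 * dt)) / dt
    raise ValueError("only q = 1 and q = 2 are sketched here")


def extrapolate(v, t, dt, order):
    """Extrapolation of v(t) from earlier values, of order 1 or 2."""
    if order == 1:
        return v(t - dt)
    if order == 2:
        return 2.0 * v(t - dt) - v(t - 2.0 * dt)
    raise ValueError("only orders 1 and 2 are sketched here")


def observed_order(error, dt):
    """Estimate the convergence order by halving the time step."""
    return math.log(error(dt) / error(dt / 2.0), 2)


t = 1.0
bdf_orders = {q: observed_order(lambda dt, q=q: abs(bdf_derivative(math.sin, t, dt, q) - math.cos(t)), 0.1)
              for q in (1, 2)}
extrap_orders = {k: observed_order(lambda dt, k=k: abs(extrapolate(math.sin, t, dt, k) - math.sin(t)), 0.1)
                 for k in (1, 2)}
```

The observed orders are close to 1 and 2 in both dictionaries, consistent with the statement that $p^{\star,n+1} = p^n$ is a first-order extrapolation while $D^{(2)}$ gives a second-order difference formula.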


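A general projection algorithm is as follows. Set
$\tilde u^0 = u_0$ and $\phi^l = 0$ for $0 \le l \le q-1$. If $q > 1$,
assume that $\tilde u^1, \ldots, \tilde u^{q-1}$, $p^{\star,q}$ and $(u\cdot\nabla u)^{\star,q}$ have
been initialized properly. For $n \ge q-1$, seek $\tilde u^{n+1}$
such that $\tilde u^{n+1}|_{\partial\Omega} = 0$ and

\[ \frac{D^{(q)}}{\Delta t}\tilde u^{n+1} - \nu\Delta\tilde u^{n+1} + \nabla\Big( p^{\star,n+1} + \sum_{j=0}^{q-1} \frac{\beta_j}{\Delta t}\phi^{n-j} \Big) = S^{n+1} \] [54a]

where $S^{n+1} = f(t^{n+1}) - (u\cdot\nabla u)^{\star,n+1}$. Then solve

\[ \Delta\phi^{n+1} = \nabla\cdot\tilde u^{n+1}, \qquad \partial_n\phi^{n+1}|_{\partial\Omega} = 0 \] [54b]

Finally, update the pressure as follows:

\[ p^{n+1} = \frac{\beta_q}{\Delta t}\phi^{n+1} + p^{\star,n+1} - \nu\,\nabla\cdot\tilde u^{n+1} \] [54c]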

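The algorithm [54a-c] is known in the literature as
the rotational form of the pressure-correction
method. Upon denoting $u_{\Delta t} = (u(t^0), \ldots, u(t^N))$
and $p_{\Delta t} = (p(t^0), \ldots, p(t^N))$, the above algorithm
has been proved to yield the following error
estimates:

\[ \|u_{\Delta t} - \tilde u_{\Delta t}\|_{\ell^2(L^2)} \le c\,\Delta t^2 \]

\[ \|\nabla(u_{\Delta t} - \tilde u_{\Delta t})\|_{\ell^2(L^2)} + \|p_{\Delta t} - (p^n)_{0 \le n \le N}\|_{\ell^2(L^2)} \le c\,\Delta t^{3/2} \]

where $\|\phi_{\Delta t}\|^2_{\ell^2(L^2)} = \Delta t \sum_{n=0}^{N} \int_\Omega |\phi^n|^2\,dx$.

A simple strategy to initialize the algorithm
consists of using $D^{(1)}\tilde u^1$ at the first step in [54a];
then using $D^{(2)}\tilde u^2$ at the second step, and proceeding
likewise until $\tilde u^1, \ldots, \tilde u^{q-1}$ have all been
computed.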

At the present time, projection methods count among
the few methods that are capable of solving the time-dependent
incompressible Navier-Stokes equations in
three dimensions on fine meshes within reasonable
computation times. The reason for this success is that
the unsplit strategy, which consists of solving

\[ \frac{D^{(q)}}{\Delta t}u^{n+1} - \nu\Delta u^{n+1} + \nabla p^{n+1} = S^{n+1} \] [55a]

\[ \nabla\cdot u^{n+1} = 0, \qquad u^{n+1}|_{\partial\Omega} = 0 \] [55b]

yields a linear system similar to [37], which usually takes
far more time to solve than sequentially solving [54a]
and [54b]. It is commonly reported in the literature that
the ratio of the CPU time for solving [55a]-[55b] to that
for solving [54a-c] ranges from 10 to 30.


See also: Compressible Flows: Mathematical Theory; 
Computational Methods in General Relativity: The Theory; 
Geophysical Dynamics; Image Processing: Mathematics; 
Incompressible Euler Equations: Mathematical Theory; 
Interfaces and Multicomponent Fluids; 
Magnetohydrodynamics; Newtonian Fluids and 
Thermohydraulics; Non-Newtonian Fluids; Partial 
Differential Equations: Some Examples; Variational 
Methods in Turbulence.


Introduction


In the famous 1822 treatise by Jean Baptiste Joseph 
Fourier, Théorie analytique de la chaleur, the Discours 
préliminaire opens with: “Primary causes are 
unknown to us; but are subject to simple and constant 
laws, which may be discovered by observation, the 
study of them being the subject of natural philosophy. 
Heat, like gravity, penetrates every substance of the 
universe, its rays occupy all parts of space. The object 
of our work is to set forth the mathematical laws 
which this element obeys. The theory of heat will 
hereafter form one of the most important branches of 
general physics.” After a brief discussion of rational 
mechanics, he continues with the sentence: “But 
whatever may be the range of mechanical theories, 
they do not apply to the effects of heat. These make up 
a special order of phenomena, which cannot be 
explained by the principles of motion and equilibria.” 
Fourier goes on with a thorough description of the
phenomenology of heat transport and the derivation of
the partial differential equation describing heat transport:
the heat equation. A large part of the treatise is
then devoted to solving the heat equation for various
geometries and boundary conditions. Fourier's treatise
marks the birth of Fourier analysis. After Boltzmann,
Gibbs, and Maxwell and the invention of statistical
mechanics in the decades after Fourier's work, we
believe that Fourier was wrong and that, in principle,
heat transport can and should be explained “by the
principles of motion and equilibria,” that is, within the
formalism of statistical mechanics. But well over a
century after the foundations of statistical mechanics
were laid down, we still lack a mathematically
reasonable derivation of Fourier's law from first
principles. Fourier's law describes the macroscopic
transport properties of heat, that is, energy, in
nonequilibrium systems. Similar laws are valid for the
transport of other locally conserved quantities, for
example, charge, particle density, momentum, etc. We
will not discuss these laws here, except to point out
that in none of these cases have macroscopic transport laws
been derived from microscopic dynamics. As
Peierls once put it: “It seems there is no problem in
modern physics for which there are on record as many
false starts, and as many theories which overlook some
essential feature, as in the problem of the thermal
conductivity of [electrically] non-conducting crystals.”


Macroscopic Law 


Consider a macroscopic system characterized at
some initial time, say $t = 0$, by a nonuniform
temperature profile $T_0(r)$. This temperature profile
will generate a heat, that is, energy current $J(r)$.
Due to energy conservation and basic
thermodynamics:

\[ c_v(T)\,\frac{\partial}{\partial t}T(r,t) = -\nabla\cdot J \] [1]

where $c_v(T)$ is the specific heat per unit volume. On the
other hand, we know that if the temperature profile is
uniform, that is, if $T_0(r) = T_0$, there is no current in
the system. It is then natural to assume that, for small
temperature gradients, the current is given by

\[ J(r) = -\kappa(T(r))\,\nabla T(r) \] [2]

where $\kappa(T)$ is the conductivity. Here we have
assumed that there is no mass flow or other mode
of energy transport besides heat conduction (we
also ignore, for simplicity, any variations in density
or pressure). Equation [2] is normally called
Fourier's law. Putting together eqns [1] and [2], we
get the heat equation:

\[ c_v(T)\,\frac{\partial}{\partial t}T(r,t) = \nabla\cdot[\kappa(T)\nabla T] \] [3]


This equation must be completed with suitable 
boundary conditions. Let us consider two distinct 
situations in which the heat equation is observed to 
hold experimentally with high precision: 


1. An isolated macroscopic system, for example,
a fluid or solid in a domain $\Lambda$ surrounded
by effectively adiabatic walls. In this case,
eqn [3] is to be solved subject to the initial
condition $T(r,0) = T_0(r)$ and no heat flux
across the boundary of $\Lambda$ (denoted by $\partial\Lambda$), that
is, $n(r)\cdot\nabla T(r) = 0$ if $r \in \partial\Lambda$ with $n$ the normal
vector to $\partial\Lambda$ at $r$. As $t \to \infty$, the system reaches a
stationary state characterized by a uniform
temperature $\bar T$ determined by the constancy of
the total energy.

2. A system in contact with heat reservoirs. Each
reservoir $\alpha$ fixes the temperature of some portion
$(\partial\Lambda)_\alpha$ of the boundary $\partial\Lambda$. The rest of the
boundary is insulated. When the system reaches
a stationary state (again assuming no matter
flow), its temperature will be given by the
solution of eqn [3] with the left-hand side set
equal to zero,

\[ \nabla\cdot J(r) = -\nabla\cdot(\kappa\nabla T(r)) = 0 \] [4]

subject to the boundary condition $T(r) = T_\alpha$ for
$r \in (\partial\Lambda)_\alpha$ and no flux across the rest of the
boundary.


The simplest geometry for a conducting system is
that of a cylindrical slab of height $h$ and cross-sectional
area $A$. It can be either a cylindrical
container filled with a fluid or a piece of crystalline
solid. In both cases, one keeps the lateral surface of
the cylinder insulated. If the top and the bottom of
the cylinder are also insulated we are in case (1). If
one keeps the top and the bottom in contact with
thermostats at temperatures $T_t$ and $T_b$, respectively,
this is (for a fluid) the usual setup for a Bénard
experiment. To avoid convection, one has to make
$T_t > T_b$ or keep $|T_t - T_b|$ small. Assuming uniformity
in the direction perpendicular to the vertical
$x$-axis one has, in the stationary state, a temperature
profile $T(x)$ with $T(0) = T_b$, $T(h) = T_t$, and
$\kappa(T)\,dT/dx = \text{const.}$ for $x \in (0,h)$.
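
To make the stationary condition concrete, the sketch below works out the profile for one specific conductivity, $\kappa(T) = \sqrt{T}$ (the kinetic-theory scaling for gases discussed later in this article, with the prefactor set to one as an assumption). In that case $T^{3/2}$ is an affine function of $x$, and the code checks numerically that $\kappa(T)\,dT/dx$ is the same at every height. The function names are hypothetical.

```python
def stationary_profile(x, h, T_b, T_t):
    """Stationary temperature for kappa(T) = sqrt(T): T**1.5 is affine in x,
    with T(0) = T_b at the bottom and T(h) = T_t at the top."""
    a, b = T_b ** 1.5, T_t ** 1.5
    return (a + (b - a) * x / h) ** (2.0 / 3.0)


def heat_flux_factor(x, h, T_b, T_t, dx=1e-5):
    """kappa(T) dT/dx at height x, with dT/dx from a centred finite difference."""
    T = stationary_profile(x, h, T_b, T_t)
    dTdx = (stationary_profile(x + dx, h, T_b, T_t)
            - stationary_profile(x - dx, h, T_b, T_t)) / (2.0 * dx)
    return T ** 0.5 * dTdx


# Bottom at T_b = 1, top at T_t = 4, slab of height h = 1 (arbitrary units)
values = [heat_flux_factor(x, 1.0, 1.0, 4.0) for x in (0.1, 0.5, 0.9)]
```

All three values agree with $(2/3)(T_t^{3/2} - T_b^{3/2})/h = 14/3$, and the profile is visibly nonlinear in $x$ even though the flux is uniform.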

In deriving the heat equation, we have implicitly
assumed that the system is described fully by specifying
its temperature $T(r,t)$ everywhere in $\Lambda$. What this
means on the microscopic level is that we imagine the
system to be in local thermal equilibrium (LTE).
Heuristically, we might think of the system as being
divided up (mentally) into many little cubes, each large
enough to contain very many atoms yet small enough
on the macroscopic scale to be accurately described, at
a specified time $t$, as a system in equilibrium at
temperature $T(r_i,t)$, where $r_i$ is the center of the $i$th
cube. For slow variation in space and time, we can
then use a continuous description $T(r,t)$. The theory
of the heat equation is highly developed and, together
with its generalizations, plays a central role in modern
analysis. In particular, one can consider more general
boundary conditions. Here we are interested in the
derivation of eqn [2] from first principles. This clearly
presupposes, as a first fundamental step, a precise
definition of the concept of LTE and its justification
within the laws of mechanics.


Empirical Argument 


A theory of heat conduction has as a goal the
computation of the conductivity $\kappa(T)$ for realistic
models, or, at the very least, the derivation of the
behavior of $\kappa(T)$ as a function of $T$. The early
analysis was based on “kinetic theory.” Its application
to heat conduction goes back to the works of
Clausius, Maxwell, and Boltzmann, who obtained a
theoretical expression for the heat conductivity of
gases, $\kappa \sim \sqrt{T}$, independent of the gas density. This
agrees with experiment (when the density is not too
high) and was a major early achievement of the
atomic theory of matter.


Heat Conduction in Gases 


Clausius and Maxwell used the concept of a “mean
free path” $\lambda$: the average distance a particle (atom or
molecule) travels between collisions in a gas with
particle density $\rho$. Straightforward analysis gives
$\lambda \sim 1/(\rho\pi\sigma^2)$, where $\sigma$ is an “effective” hard-core diameter
of a particle. They considered a gas with temperature
gradient in the $x$-direction and assumed that the gas is
(approximately) in local equilibrium with density $\rho$
and temperature $T(x)$. Between collisions, a particle
moves a distance $\lambda$ carrying a kinetic energy proportional
to $T(x)$ from $x$ to $x + \lambda/\sqrt{3}$, while in the
opposite direction the amount carried is proportional
to $T(x + \lambda/\sqrt{3})$. Taking into account the fact that the
speed is proportional to $\sqrt{T}$, the amount of energy $J$
transported per unit area and time across a plane
perpendicular to the $x$-axis is approximately


\[ J \sim \rho\sqrt{T}\,\big[T(x) - T(x + \lambda/\sqrt{3})\big] \sim -\frac{1}{\sigma^2}\sqrt{T}\,\frac{dT}{dx} \] [5]


and so $\kappa \sim \sqrt{T}$ independent of $\rho$, in agreement with
experiment. It was clear to the founding fathers that
starting with a local equilibrium situation the process
described above will produce, as time goes on, a
deviation from LTE. They reasoned, however, that this
deviation from local equilibrium will be small when
$(\lambda/T)\,dT/dx \ll 1$, the regime in which Fourier's law is
expected to hold, and the above calculation should
yield, up to some factor of order unity, the right heat
conductivity. To have a more precise theory, one can
describe the state of the gas through the probability
distribution $f(r,p,t)$ of finding a particle in the
volume element $dr\,dp$ around the phase space point
$(r,p)$. Here LTE means that

\[ f(r,p,t) = \exp\left( -\frac{p^2}{2mkT(r)} \right) \]

where $m$ is the mass of the particles. If one computes
the heat flux at a point $r$ by averaging the microscopic
energy current at $r$, $j = \rho v(\tfrac{1}{2}mv^2)$, over $f(r,p,t)$ then
it is only the deviation from local equilibrium which
makes a contribution. The result however is essentially
the same as eqn [5]. This was shown by Boltzmann,
who derived an accurate formula for $\kappa$ in gases by
using the Boltzmann equation. If one takes $\lambda$ from
experiment, the above analysis yields a value for $\sigma$, the
effective size of an atom or molecule, which turns out
to be close to other determinations of the characteristic
size of an atom. This gave evidence for the reality of
atoms and the molecular theory of heat.
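
The density independence of the kinetic estimate follows from a one-line cancellation, which the sketch below spells out. It keeps only the scalings used in the text ($\lambda \sim 1/(\rho\pi\sigma^2)$ and $\kappa \sim \rho\sqrt{T}\lambda$), drops every factor of order unity, and uses arbitrary units; the function names are hypothetical.

```python
import math


def mean_free_path(rho, sigma):
    """lambda ~ 1 / (rho * pi * sigma**2) for hard cores of diameter sigma."""
    return 1.0 / (rho * math.pi * sigma ** 2)


def kinetic_conductivity(rho, sigma, T):
    """Order-of-magnitude estimate kappa ~ rho * sqrt(T) * lambda."""
    return rho * math.sqrt(T) * mean_free_path(rho, sigma)


dilute = kinetic_conductivity(1.0, 0.5, 4.0)
dense = kinetic_conductivity(25.0, 0.5, 4.0)   # same value: rho cancels
cold = kinetic_conductivity(1.0, 0.5, 1.0)     # half of dilute: kappa ~ sqrt(T)
```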


Heat Conduction in Insulating Crystals 


In (electrically) conducting solids, heat is mainly
transported by the conduction electrons. In this case,
one can adapt the theory discussed in the previous
section. In (electrically) insulating solids, on the other
hand, heat is transmitted through the vibrations of the
lattice. In order to use the concepts of kinetic theory, it
is useful to picture a solid as a gas of phonons which
can store and transmit heat. A perfectly harmonic
crystal, due to the fact that phonons do not interact,
has an infinite thermal conductivity: in the language of
kinetic theory, the mean free path $\lambda$ is infinite. In a real
crystal, the anharmonic forces produce interactions
between the phonons and therefore a finite mean free
path. Another source of finite thermal conductivity
may be the lattice imperfections and impurities which
scatter the phonons. Debye devised a kind of kinetic
theory for phonons in order to describe thermal
conductivity. One assumes that a small gradient of
temperature is imposed and that the collisions between
phonons maintain local equilibrium. An elementary
argument gives a thermal conductivity analogous to
eqn [5] obtained in the last subsection for gases
(remembering, however, that the density of phonons
is itself a function of $T$)

\[ \kappa \sim c_v c^2 \tau \] [6]

where, with respect to eqn [5], $\rho$ has been replaced by
$c_v$, the specific heat of phonons, $\sqrt{T}$ by $c$, the (mean)
velocity of the phonons, and $\lambda$ by $c\tau$, where $\tau$ is the
effective mean free time between phonon collisions.
The thermal conductivity depends on the temperature
via $\tau$, and a more refined theory is needed to account
for this dependence. This was done by Peierls via a
Boltzmann equation for the phonons. In collisions
among phonons, the momentum of phonons is
conserved only modulo a vector of the reciprocal
lattice. One calls “normal processes” those where the
phonon momentum is conserved and “Umklapp processes”
those where the initial and final momenta
differ by a nonzero reciprocal lattice vector. Peierls'
theory may be summarized (very roughly) as follows:
in the absence of Umklapp processes, the mean free
path, and thus the thermal conductivity of an insulating
solid, is infinite. A success of Peierls' theory is to
describe correctly the temperature dependence of the
thermal conductivity. Furthermore, on the basis of this
theory, one does not expect a finite thermal conductivity
in one-dimensional monoatomic lattices with
pair interactions. This seems so far to be a correct
prediction, at least in the numerous numerical simulations
performed on various models.


Statistical Mechanics Paradigm: 
Rigorous Analysis 


In a rigorous approach to the above arguments, we
have to first formulate precisely the problem on a
mathematical level. It is natural to adapt the standard
formalism of statistical mechanics to our situation. To
this end, we assume that our system is described by
the positions $Q$ and momenta $P$ of a (very large)
number of particles, $N$, with $Q = (q_1, \ldots, q_N) \in \Lambda^N$,
$\Lambda \subset \mathbb{R}^d$, and $P = (p_1, \ldots, p_N) \in \mathbb{R}^{dN}$. The
dynamics (in the bulk) is given by a Hamiltonian
function $H(Q,P)$. A state of the system is a
probability measure $\mu(P,Q)$ on phase space. As
usual in statistical mechanics, the value of an
observable $f(P,Q)$ will be given by the expected
value of $f$ with respect to the measure $\mu$. In the case of
a fluid contained in a region $\Lambda$, we can assume that
the Hamiltonian has the form


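\[ H(P,Q) = \sum_{i=1}^{N} \left[ \frac{p_i^2}{2m} + \sum_{j \neq i} \phi(q_j - q_i) + u(q_i) \right]
          = \sum_{i=1}^{N} \frac{p_i^2}{2m} + V(Q) \qquad [7] \]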


where $\phi(q)$ is some short-range interparticle potential
and $u(q_i)$ an external potential (e.g., the interaction of
the particle with fixed obstacles such as a conduction
electron interacting with the fixed crystalline ions). If
we want to describe the case in which the temperature
at the boundary is kept different in different regions
$\partial\Lambda_\alpha$, we have to properly define the dynamics at the
boundary of the system. A possibility is to use
“Maxwell boundary conditions”: when a particle hits
the wall in $\partial\Lambda_\alpha$, it gets reflected and re-emerges with a
distribution of velocities

\[ f_\alpha(dv) = \frac{m^2}{2\pi (kT_\alpha)^2}\, |v_x| \exp\left[ -\frac{m v^2}{2kT_\alpha} \right] dv \qquad [8] \]

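
As a quick consistency check (a sketch only, assuming $d = 3$, that $v_x$ denotes the velocity component normal to the wall, and that re-emerging particles have $v_x > 0$), the Maxwell distribution $f_\alpha(dv) = \frac{m^2}{2\pi (kT_\alpha)^2} |v_x| \exp[-m v^2/(2kT_\alpha)]\, dv$ is normalized over the half-space of outgoing velocities:

```latex
\begin{align*}
\int_{v_x > 0} f_\alpha(dv)
 &= \frac{m^2}{2\pi (kT_\alpha)^2}
    \left( \int_0^\infty v_x\, e^{-m v_x^2 / 2kT_\alpha}\, dv_x \right)
    \left( \int_{-\infty}^{\infty} e^{-m v_y^2 / 2kT_\alpha}\, dv_y \right)^{2} \\
 &= \frac{m^2}{2\pi (kT_\alpha)^2} \cdot \frac{kT_\alpha}{m} \cdot \frac{2\pi k T_\alpha}{m} = 1 .
\end{align*}
```
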
Several other ways to impose boundary conditions
have been considered in the literature. The notion of
LTE can be made precise here in the so-called
hydrodynamic scaling limit (HSL), where the ratio
of microscopic to macroscopic scales goes to zero.
The macroscopic coordinates $r$ and $t$ are related to
the microscopic ones $q$ and $\tau$ by $r = \epsilon q$ and $t = \epsilon^{\alpha}\tau$,
that is, if $\Lambda$ is a cube of macroscopic sides $l$, then its
sides, now measured in microscopic length units, are
of length $L = \epsilon^{-1} l$. We then suppose that at $t = 0$ our
system of $N = \rho L^d$ particles is described by an
equilibrium Gibbs measure with a temperature
$T(r) = T(\epsilon q)$: roughly speaking, the phase-space
ensemble density has the form


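\[ \mu_0(P,Q) \sim \exp\left\{ -\sum_{i=1}^{N} \beta_0(\epsilon q_i)
   \left[ \frac{p_i^2}{2m} + \sum_{j \neq i} \phi(q_j - q_i) + u(q_i) \right] \right\} \qquad [9] \]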




where $\beta_0^{-1}(r) = T_0(r)$. In the limit $\epsilon \to 0$, $\rho$ fixed, the
system at $t = 0$ will be macroscopically in LTE with
a local temperature $T_0(r)$ (as already noted, here we
suppress the variation in the particle density $n(r)$).
We are interested in the behavior of a macroscopic
system, for which $\epsilon \ll 1$, at macroscopic times
$t > 0$, corresponding to microscopic times
$\tau = \epsilon^{-\alpha} t$, $\alpha = 2$ for heat conduction or other diffusive
behavior. The implicit assumption then made
in the macroscopic description given earlier is that,
since the variations in $T_0(r)$ are of order $\epsilon$ on a
microscopic scale, then for $\epsilon \ll 1$, the system will,
also at time $t$, be in a state very close to LTE, with
a temperature $T(r,t)$ that evolves in time according
to Fourier's law, eqn [1]. From a mathematical
point of view, the difficult problem is to prove that
the system stays in LTE for $t > 0$ when the
dynamics are given by a Hamiltonian time evolu- 
tion. This requires proving that the macroscopic 
system has some very strong ergodic properties, for 
example, that the only time-invariant measures 
locally absolutely continuous with respect to the 
Lebesgue measure are, for infinitely extended 
spatially uniform systems, of the Gibbs type. This 
has only been proved so far for systems evolving 
via stochastic dynamics (e.g., interacting Brownian 
particles or lattice gases). For such stochastic
systems, one can sometimes prove the hydrodynamical
limit and derive macroscopic transport
equations for the particle or energy density and
thus verify the validity of Fourier's law. Another
possibility, as we already saw, is to use the 
Boltzmann equation. Using ideas of hydrodynami- 
cal space and time scaling described earlier, it is 
possible to derive a controlled expansion for the 
solution of the stationary Boltzmann equation 
describing the steady state of a gas coupled to 
temperature reservoirs at the top and bottom. One 
then shows that for $\epsilon \ll 1$, $\epsilon$ being now the ratio
$\lambda/L$, the Boltzmann equation for $f$ in the slab has a
time-independent solution which is close to a local
Maxwellian, corresponding to LTE (apart from
boundary layer terms) with a local temperature and
density given by the solution of the Navier-Stokes
equations which incorporates Fourier's law as
expressed in eqn [2]. The main mathematical
problem is in controlling the remainder in an
asymptotic expansion of $f$ in powers of $\epsilon$. This
requires that the macroscopic temperature gradient,
that is, $|T_1 - T_2|/h$, where $h = \epsilon L$ is the thickness of
the slab on the macroscopic scale, be small. Even if
this apparently technical problem could be over- 
come, we would still be left with the question of 
justifying the Boltzmann equation for such steady 
states and, of course, it would not tell us anything 
about dense fluids or crystals. In fact, the Boltzmann
equation itself is really closer to a macroscopic
than to a microscopic description. It is
obtained in a well-defined kinetic scaling limit in 
which, in addition to rescaling space and time, the 
particle density goes to zero, that is, 和 > c. 

A simplified model of a crystal is characterized by
the fact that all atoms oscillate around given
equilibrium positions. The equilibrium positions
can be thought of as the points of a regular lattice
in $\mathbb{R}^d$, say $\mathbb{Z}^d$. Although $d = 3$ is the physical
situation, one can also be interested in the case
$d = 1, 2$. In this situation, $\Lambda \subset \mathbb{Z}^d$ with cardinality
$N$, and each atom is identified by its position
$x_i = i + q_i$, where $i \in \Lambda$ and $q_i \in \mathbb{R}^d$ is the displacement
of the particle at lattice site $i$ from this
equilibrium position. Since interatomic forces in
real solids have short range, it is reasonable to
assume that the atoms interact only with their
nearest neighbors via a potential that depends only
on the relative distance with respect to the equilibrium
distance. Accordingly, the Hamiltonians that
we consider have the general form


\[ H(P,Q) = \sum_{i \in \Lambda} \frac{p_i^2}{2m} + \sum_{|i-j|=1} V(q_i - q_j) + \sum_{i} U_i(q_i)
          = \sum_{i \in \Lambda} \frac{p_i^2}{2m} + \mathcal{V}(Q) \qquad [10] \]

where $P = (p_i)_{i \in \Lambda}$ and analogously for $Q$. We shall
further assume that as $|q| \to \infty$ so do $U_i(q)$ and
$V(q)$. The addition of $U_i(q)$ pins down the crystal
and ensures that $\exp[-\beta H(P,Q)]$ is integrable with
respect to $dP\,dQ$, and thus the corresponding Gibbs
measure is well defined. In this case, in order to fix the
temperature at the boundary, one can add a Langevin
term to the equation of particles on the boundaries,
that is, if $i \in \partial\Lambda_\alpha$, the equation for the particle is


\[ \dot{p}_i = -\partial_{q_i} H(P,Q) - \lambda p_i + \sqrt{\lambda T_\alpha}\, \dot{w}_i \qquad [11] \]


where $\dot{w}_i$ is a standard white noise. Other thermostatting
mechanisms can be considered. In this case
we can also define LTE using eqn [9] but we run
into the same difficulties described above, although
the problem is somehow simpler due to the presence
of the lattice structure and the fact that the particles
oscillate close to their equilibrium points. We can
obtain Fourier's law only by adding stochastic
terms, for example, terms like eqn [11], to the
equation of motion of every particle and assuming
that $U(q)$ and $V(q)$ are harmonic. These added
noises can be thought of as an effective description
of the chaotic motion generated by the anharmonic
terms in $U(q)$ and $V(q)$.
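
The thermostatting role of a Langevin term can be illustrated numerically. The following is a minimal sketch, not the construction used in the rigorous results: it integrates a single pinned harmonic oscillator (unit mass, $k = 1$, $U(q) = q^2/2$) with the Euler-Maruyama scheme, and it assumes the common normalization in which the noise amplitude is $\sqrt{2\lambda T}$ for a white noise of unit variance per unit time, which may differ from the convention implicit in eqn [11]. The function name, step size and run length are illustrative choices. Under these assumptions the time average of $p^2$ should approach $T$.

```python
import math
import random


def langevin_kinetic_temperature(T, lam=1.0, dt=0.01, steps=200_000, seed=0):
    """Time-averaged p^2 for dq = p dt, dp = (-q - lam*p) dt + sqrt(2*lam*T) dW.

    Assumptions (not taken from the article): unit mass, k = 1, harmonic
    pinning U(q) = q^2 / 2, and noise amplitude sqrt(2 * lam * T).
    """
    rng = random.Random(seed)
    q, p = 0.0, 0.0
    noise_scale = math.sqrt(2.0 * lam * T * dt)  # std of the noise increment per step
    burn_in = steps // 10                        # discard the initial transient
    total, count = 0.0, 0
    for n in range(steps):
        force = -q                               # -dU/dq for U(q) = q^2 / 2
        q_new = q + p * dt                       # explicit Euler step for the position
        p = p + (force - lam * p) * dt + noise_scale * rng.gauss(0.0, 1.0)
        q = q_new
        if n >= burn_in:
            total += p * p
            count += 1
    return total / count


if __name__ == "__main__":
    for T in (0.5, 2.0):
        print(T, langevin_kinetic_temperature(T))
```

The estimate carries both a statistical error and a discretization bias of order `dt`, so it should only be compared with $T$ up to a few percent.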


Just how far we are from establishing rigorously
the Fourier law is clear from our very limited
mathematical understanding of the stationary
nonequilibrium state (SNS) of mechanical systems
whose ends are, as in the example of the Bénard
problem, kept at fixed temperatures $T_1$ and $T_2$.
Various models have been considered, for example,
models with Hamiltonian [10] coupled at the
boundaries with heat reservoirs described by eqn
[11]. The best mathematical results one can prove
are: the existence and uniqueness of the SNS; the
existence of a stationary nontrivial heat flow; and
properties of the fluctuations of the heat flow in
the SNS, namely central-limit-theorem type fluctuations
(related to the Kubo formula and Onsager
relations) and large-deviation type fluctuations
(related to the Gallavotti-Cohen fluctuation theorem).
What is missing is information on how the
relevant quantities depend on the size of the
system, $N$. In this context, the heat conductivity
can be defined precisely without invoking LTE. To
do this, we let $J$ be the expectation value in the SNS
of the energy or heat current flowing from reservoir
1 to reservoir 2. We then define the conductivity
$\kappa_L$ as $J/(A\,\delta T/L)$, where $\delta T/L = (T_1 - T_2)/L$ is the
effective temperature gradient for a cylinder of
microscopic length $L$ and uniform cross section $A$,
and $\kappa(T)$ is the limit of $\kappa_L$ when
$\delta T \to 0$ ($T_1 = T_2 = T$) and $L \to \infty$. The existence of
such a limit with $\kappa$ positive and finite is what one
would like to prove.
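
The finite-size conductivity defined above translates directly into a small helper. This is only an illustrative sketch: the function name and the sample numbers are hypothetical, and the current $J$ is assumed to be supplied by a separate simulation or calculation of the SNS.

```python
def finite_size_conductivity(J, T1, T2, L, A=1.0):
    """Return kappa_L = J / (A * (T1 - T2) / L) for length L and cross section A."""
    delta_T = T1 - T2
    if delta_T == 0:
        raise ValueError("T1 and T2 must differ to define a temperature gradient")
    return J / (A * delta_T / L)


if __name__ == "__main__":
    # Hypothetical currents scaling like 1/L at fixed T1 - T2: kappa_L then does
    # not depend on L, which is the behavior Fourier's law requires.
    for L in (100, 200, 400):
        J = 0.5 * (1.2 - 0.8) / L
        print(L, finite_size_conductivity(J, 1.2, 0.8, L))
```

A current decaying more slowly than $1/L$ would instead give a $\kappa_L$ that grows with $L$, that is, no finite limit.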


See also: Dynamical Systems and Thermodynamics; 
Ergodic Theory; Interacting Particle Systems and 
Hydrodynamic Equations; Kinetic Equations; 
Nonequilibrium Statistical Mechanics: Dynamical 
Systems Approach; Nonequilibrium Statistical 
Mechanics: Interaction Between Theory and Numerical 
Simulations. 
Introduction


The Fourier-Mukai transform was introduced
in the study of abelian varieties by Mukai and can
be thought of as a nontrivial algebro-geometric
analog of the Fourier transform. Since its original
introduction, the Fourier-Mukai transform has turned
out to be a useful tool for studying various aspects
of sheaves on varieties and their moduli spaces, and 
as a natural consequence, to learn about the 
varieties themselves. Various links between geome- 
try and derived categories have been uncovered; for 
instance, Bondal and Orlov proved that Fano 
varieties, and certain varieties of general type, can 
be reconstructed from their derived categories. 
Moreover, Orlov proved a derived version of the 
Torelli theorem for K3 surfaces and also a structure 
theorem for derived categories of abelian varieties. 
Later, Kawamata gave evidence for the conjecture
that two birational smooth projective varieties with 
trivial canonical sheaves have equivalent derived 
categories, which has been proved by Bridgeland in 
dimension 3. 

The Fourier-Mukai transform also enters into 
string theory. The most prominent example is 
Kontsevich's homological mirror-symmetry conjec- 
ture. The conjecture predicts (for mirror dual pairs 
of Calabi-Yau manifolds) an equivalence between 
the bounded derived category of coherent sheaves 
and the Fukaya category. The conjecture implies a 
correspondence between certain self-equivalences 
(given by Fourier-Mukai transforms) of the derived 
category and symplectic self-equivalences of the 
mirror manifold. 

Besides their importance for geometrical aspects 
of mirror symmetry, the Fourier-Mukai transforms 
have also been important for heterotic string 
compactifications. The motivation for this came 
from the conjectured correspondence between the 
heterotic string and F-theory, which both rely on
elliptically fibered Calabi-Yau manifolds. To give 
evidence for this correspondence, an explicit descrip- 
tion of stable holomorphic vector bundles was 
necessary and inspired a series of publications by 
Friedman, Morgan, and Witten. Their bundle con- 
struction relies on two geometrical objects: a 
hypersurface in the Calabi-Yau manifold together 
with a line bundle on it; more precisely, they 
construct vector bundles using a relative Fourier- 
Mukai transform. 

Various aspects and refinements of this construc- 
tion have been studied by now. For instance, a 
physical way to understand the bundle construction 
can be given using the fact that holomorphic vector 
bundles can be viewed as D-branes and that 
D-branes can be mapped under T-duality to new 
D-branes (of different dimensions). 

We survey aspects of the Fourier-Mukai trans- 
form, its relative version and outline the bundle 
construction of Friedman, Morgan, and Witten. The 
construction has led to many new insights, for 
instance, the presence of 5-branes in heterotic string 
vacua has been understood. The construction also 
inspired a tremendous amount of work towards a 
heterotic string phenomenology on elliptic Calabi- 
Yau manifolds. For the many topics omitted the 
reader should consult the “Further reading” section. 


The Fourier-Mukai Transforms 


Every object $E$ of the derived category on the
product $X \times Y$ of two smooth algebraic varieties $X$
and $Y$ gives rise to a functor $\Phi^E$ from the bounded
derived category $D(X)$ of coherent sheaves on $X$ to
the similar category on $Y$:


\[ \Phi^E : D(X) \to D(Y), \qquad F \mapsto \Phi^E(F) = R\hat{\pi}_*(\pi^* F \otimes E) \]

where $\pi, \hat{\pi}$ are the projections from $X \times Y$ to $X$
and $Y$, respectively, and $\otimes$ denotes the derived
tensor product. $\Phi^E(F)$ is called the Fourier-Mukai
transform with kernel $E \in D(X \times Y)$ (in analogy
with the definition of an integral transform with
kernel). Note that given a Fourier-Mukai functor
$\Phi^E$, $\Phi^E(F)$ is in general a complex having homology
in several degrees even if $F$ is a sheaf.
Furthermore, a result by Orlov states that if $X$
and $Y$ are smooth projective varieties then any
fully faithful functor $D(X) \to D(Y)$ is a Fourier-Mukai
functor.

In analogy with the Fourier transform, there is a
kind of “convolution product” giving the composition
of two such functors. More precisely, given
smooth algebraic varieties $X, Y, Z$, and elements $E \in
D(X \times Y)$ and $G \in D(Y \times Z)$, we can define $G \circ E \in
D(X \times Z)$ by


\[ G \circ E = R\pi_{XZ,*}(\pi_{XY}^* E \otimes \pi_{YZ}^* G) \]


where $\pi_{XY}, \pi_{YZ}, \pi_{XZ}$ are the projections from $X \times
Y \times Z$ to the pairwise products, giving a natural
isomorphism of functors


\[ \Phi^G \circ \Phi^E \simeq \Phi^{G \circ E} \]


Another analogy with the Fourier transform can
be drawn. For this, assume that we have sheaves $F$
and $G$ which only have one nonvanishing Fourier-Mukai
transform, the $i$th one $\Phi^i(F)$ (where $\Phi^i :
D(X) \to \mathrm{Coh}(Y)$, $F \mapsto H^i(\Phi^E(F))$; cf. remarks below)
in the case of $F$, and the $j$th one $\Phi^j(G)$ in the case
of $G$. Given such sheaves, there is the Parseval
formula


\[ \mathrm{Ext}^h_X(F, G) = \mathrm{Ext}^{h+i-j}_Y(\Phi^i(F), \Phi^j(G)) \]


which gives a correspondence between the extensions
of $F, G$ and the extensions of their Fourier-Mukai
transforms. This formula can be considered
as the analog of the Parseval formula for the
ordinary Fourier transform for functions on a torus.

The Parseval formula can be proved using two
facts. First, for arbitrary coherent sheaves $E, G$ the
Ext groups can be computed in terms of the derived
category, namely


\[ \mathrm{Ext}^i(E, G) = \mathrm{Hom}_{D(X)}(E, G[i]) \]


Second, the Fourier-Mukai transforms of $F$ and $G$ in
the derived category $D(Y)$ are given by $\Phi(F) =
\Phi^i(F)[-i]$ and $\Phi(G) = \Phi^j(G)[-j]$. Since the Fourier-Mukai
transform is an equivalence of categories, we
have


\[ \mathrm{Hom}_{D(X)}(F, G[h]) = \mathrm{Hom}_{D(Y)}(\Phi^i(F), \Phi^j(G)[i - j + h]) \]


implying the Parseval formula.

A first simple example of a Fourier-Mukai functor
can be given: let $F$ be the complex in $D(X \times X)$
defined by the structure sheaf $O_\Delta$ of the diagonal
$\Delta \subset X \times X$. Then it is easy to check that $\Phi^F :
D(X) \to D(X)$ is isomorphic to the identity functor
on $D(X)$. Moreover, if we shift degrees by $n$ taking
$F = O_\Delta[n]$ (a complex with only the sheaf $O_\Delta$ placed
in degree $n$), then $\Phi^F : D(X) \to D(X)$ is the degree
shifting functor $G \mapsto G[n]$.

As we will be interested in relative Fourier-Mukai
transforms for elliptic fibrations, let us consider the
case of a Fourier-Mukai transform on an elliptic
curve: consider an elliptic curve $E$ with a fixed
origin $p_0$ and identify $E$ with $\hat{E} = \mathrm{Pic}^0(E)$ via
$\varphi : E \to \hat{E}$, $x \mapsto O_E(x - p_0)$. As kernel we take the
normalized Poincaré line bundle $P := O_{E \times E}(\Delta -
\{p_0\} \times E - E \times \{p_0\})$. The restriction of $P$ to $p_0 \times E$
or $E \times p_0$ is isomorphic to the trivial line bundle
$O_E$. $P$ has the universal property which can be
expressed by $\Phi^P(k(x)) = \varphi(x)$, where $k(x)$ is the
skyscraper sheaf supported at a point $x \in E$; in particular,
$\Phi^P(k(p_0)) = O_E$ and $\Phi^P(O_E) = k(p_0)[-1]$, where $O_E$
is the structure sheaf of $E$.
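
At the level of numerical invariants, the transforms of the skyscraper sheaf and of the structure sheaf can be summarized by the action on the pair (rank, degree). The sketch below assumes the standard fact that, for the normalized Poincaré bundle on an elliptic curve, the induced map is $(r, d) \mapsto (d, -r)$, with a shift by $[-1]$ changing the sign of the class. The function name is a hypothetical choice for illustration.

```python
def fm_rank_degree(rank, degree):
    """Action of the Fourier-Mukai transform with Poincare kernel on (rank, degree)."""
    return (degree, -rank)


if __name__ == "__main__":
    skyscraper = (0, 1)       # k(x): rank 0, degree 1
    structure_sheaf = (1, 0)  # O_E: rank 1, degree 0
    print(fm_rank_degree(*skyscraper))       # class of a degree-0 line bundle
    print(fm_rank_degree(*structure_sheaf))  # class of k(p0)[-1], minus the class of k(p0)
```

Applying the map twice gives $(-r, -d)$, consistent with the composite of the transform with itself being a shift by $[-1]$ up to an automorphism of the curve.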


Relative Fourier-Mukai Transforms 
for Elliptic Fibrations 


It is often convenient to study problems for families 
rather than for single varieties. The main advantage 
of the relative setting is that base-change properties 
(or parameter dependencies) are better encoded into 
the problem. We can do that for Fourier-Mukai 
functors as well. To this end, we consider two
morphisms $p : X \to B$, $\hat{p} : \hat{X} \to B$ of algebraic varieties.
We will assume that the morphisms are flat
and so give nice families of algebraic varieties. We
shall define relative Fourier-Mukai functors in this
setting by means of a “kernel” $E$ in the derived
category $D(X \times_B \hat{X})$.

Let us make the relative setting explicit for elliptic
fibrations: an elliptic fibration is a proper flat
morphism $p : X \to B$ of schemes whose fibers are
Gorenstein curves of arithmetic genus 1. We also
assume that $p$ has a section $\sigma : B \to X$ taking values
in the smooth locus $X' \to B$ of $p$. The generic fibers
are then smooth elliptic curves, whereas some
singular fibers are allowed. If the base $B$ is a smooth
curve, elliptic fibrations were studied and classified
by Kodaira, who described all the types of singular
fibers that may occur, the so-called Kodaira curves.
When the base is a smooth surface, more complicated
configurations of singular curves can occur and
have indeed been studied by Miranda.

First let us fix notation and setup. We denote by
$\sigma = \sigma(B)$ the image of the section, by $X_t$ the fiber of
$p$ over $t \in B$ (we assume, in what follows, that $B$ is either
a smooth curve or surface) and by $i_t : X_t \to X$ the
inclusion. Furthermore, $\omega_{X/B}$ is the relative dualizing
sheaf and $\omega = R^1 p_* O_X \simeq (p_* \omega_{X/B})^*$, where the
isomorphism is Grothendieck-Serre duality for $p$. The
sheaf $L = p_* \omega_{X/B}$ is a line bundle whose first Chern
class we denote by $K = c_1(L)$. The adjunction
formula for $\sigma \hookrightarrow X$ gives that $\sigma^2 = -\sigma \cdot p^* K$ as cycles
on $X$. Moreover, we will consider elliptic fibrations
with a section whose fibers are all geometrically
integral. This means that the fibration is isomorphic
with its Weierstrass model.

From Kodaira's classification of possible singular
fibers one finds that the components of reducible
fibers of $p$ which do not meet $\sigma$ form rational double
point configurations disjoint from $\sigma$. Let $X \to \bar{X}$ be
the result of contracting these configurations and let
$\bar{p} : \bar{X} \to B$ be the induced map. Then all fibers of $\bar{p}$
are irreducible with at worst nodes or cusps as
singularities. In this case, one refers to $\bar{X}$ as the
Weierstrass model of $X$.

The Weierstrass model can be constructed as
follows: the divisor $3\sigma$ is relatively ample and, if
$E = p_* O_X(3\sigma) \simeq O_B \oplus \omega^{\otimes 2} \oplus \omega^{\otimes 3}$ and $\bar{p} : P = \mathbb{P}(E^*) \to B$
is the associated projective bundle, there is a
projective morphism $j : X \to P$ such that $j(X) = \bar{X}$.

Now special fibers of $X \to B$ can have at most
one singular point, either a cusp or a simple node.
Thus, in this case $3\sigma$ is relatively very ample
and gives rise to a closed immersion $j : X \hookrightarrow P$
such that $j^* O_P(1) = O_X(3\sigma)$, where $j$ is locally a
complete intersection whose normal sheaf is
$N(X/P) \simeq p^* \omega^{-\otimes 6} \otimes O_X(9\sigma)$. This follows by relative
duality since $\omega_{P/B} = \bigwedge^2 \Omega_{P/B} \simeq \bar{p}^* \omega^{\otimes 5}(-3)$, due
to the Euler exact sequence


\[ 0 \to \Omega_{P/B} \to \bar{p}^* E(-1) \to O_P \to 0 \]
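
For the reader's convenience, the two isomorphisms can be checked as follows. This is a sketch which assumes, in addition to the statements above, the standard identification $\omega_{X/B} \simeq p^*\omega^{-1}$ for a Weierstrass fibration and the adjunction formula $\omega_{X/B} \simeq \omega_{P/B}|_X \otimes N(X/P)$ for the closed immersion $j$:

```latex
\begin{align*}
\omega_{P/B} &= \textstyle\bigwedge^2 \Omega_{P/B}
   \simeq \det\big(\bar{p}^* E(-1)\big)
   \simeq \bar{p}^*(\det E) \otimes O_P(-3)
   \simeq \bar{p}^* \omega^{\otimes 5}(-3), \\
N(X/P) &\simeq \omega_{X/B} \otimes \big(\omega_{P/B}|_X\big)^{-1}
   \simeq p^*\omega^{-1} \otimes p^*\omega^{-\otimes 5} \otimes O_X(9\sigma)
   = p^*\omega^{-\otimes 6} \otimes O_X(9\sigma).
\end{align*}
```

Here $\det E \simeq \omega^{\otimes 5}$ follows from $E \simeq O_B \oplus \omega^{\otimes 2} \oplus \omega^{\otimes 3}$, and $O_P(3)|_X = O_X(9\sigma)$ follows from $j^* O_P(1) = O_X(3\sigma)$.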


The morphism $p : X \to B$ is then a local complete
intersection morphism (cf. Fulton (1984)) and has a
virtual relative tangent bundle $T_{X/B} = [j^* T_{P/B}] -
[N_{X/P}]$ in the K-group $K^\bullet(X)$. The Todd class of
$T_{X/B}$ is given by


\[ \mathrm{Td}(T_{X/B}) = 1 - \tfrac{1}{2}\, p^* K + \tfrac{1}{12}\left( 12\, \sigma \cdot p^* K + 13\, p^* K^2 \right)
   - \tfrac{1}{2}\, \sigma \cdot p^* K^2 + \text{terms of higher degree} \]


Now if $\hat{p} : \hat{X} \to B$ denotes the dual elliptic fibration,
defined as the relative moduli space of torsion-free
rank-1 sheaves of relative degree 0, it is known that
for $t \in B$ there is an isomorphism $\hat{X}_t \simeq X_t$ between
the fibers of both fibrations. Since we assume that
the original fibration $p : X \to B$ has a section $\sigma$, then
$p$ and $\hat{p}$ are globally isomorphic; hereafter we
identify $X \simeq \hat{X}$, where $\hat{X}$ denotes the compactified
relative Jacobian of $X$.

Note that $\hat{X}$ is the scheme representing the
functor which, to any scheme morphism $\phi : S \to B$,
associates the space of equivalence classes of $S$-flat
sheaves on $p_S : X \times_B S \to S$, whose restrictions to the
fibers of $\phi$ are torsion-free, of rank 1 and degree 0;
two such sheaves $F, F'$ are considered to be equivalent if
$F' \simeq F \otimes p_S^* L$ for a line bundle $L$ on $S$ (cf. Altman
and Kleiman (1980); note that the Altman-Kleiman
compactification of the relative Jacobian applies to
our situation since we consider elliptic fibrations
with integral fibers). Here the usual definition of
“torsion free” is only for integral varieties, i.e.,
varieties whose local rings have no zero-divisors. In
this case, a sheaf $M$ is torsion free if for any open
subset $U$, any nonzero section $m$ of $M$ on $U$ and
any nonzero section $a$ of the relevant functions
sheaf, one has $a \cdot m \neq 0$. When the variety is not
integral (it is reducible, or nonreduced) this definition
has no real meaning; what then substitutes the
notion of “torsion free” is the Simpson definition
of “pure of maximal dimension”: a sheaf $M$ is
“torsion free” in this sense if the support of any of
its nonzero subsheaves is the whole variety (cf. Huybrechts
and Lehn (1997)). Moreover, the natural morphism
$X \to \hat{X}$, $x \mapsto I_x \otimes O_{X_t}(\sigma(t))$ is an isomorphism
(of $B$-schemes); here $I_x$ is the ideal sheaf of the
point $x$ in $X_t$.

Note also that if $\nu : Y \to X_t$ is the normalization
of one of our fibers $X_t$ and $z$ is the exceptional
divisor (the pre-image of the singular point $x$) then
$\nu_*(O_Y(-z))$ is the maximal ideal of $x$.

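The variety $\hat{X}$ is a fine moduli space. This means
that there exists a coherent sheaf $P$ on $X \times_B \hat{X}$ flat
over $\hat{X}$, whose restrictions to the fibers of $\hat{\pi}$ are
torsion free, and of rank 1 and degree 0. The sheaf
$P$ is defined up to tensor product by the pullback
of a line bundle on $\hat{X}$, and is called the universal
Poincaré sheaf, which we will normalize by letting
$P|_{\sigma \times_B \hat{X}} \simeq O_{\hat{X}}$. We shall henceforth assume that $P$ is
normalized in this way, so that


\[ P = I_\Delta \otimes \pi^* O_X(\sigma) \otimes \hat{\pi}^* O_X(\sigma) \otimes q^* \omega^{-1} \]


where $\pi$, $\hat{\pi}$ and $q = p \circ \pi = \hat{p} \circ \hat{\pi}$ refer to the diagram


\[ \begin{array}{ccc}
X \times_B \hat{X} & \xrightarrow{\ \hat{\pi}\ } & \hat{X} \\
\downarrow{\scriptstyle \pi} & & \downarrow{\scriptstyle \hat{p}} \\
X & \xrightarrow{\ p\ } & B
\end{array} \]


and $I_\Delta$ is the ideal sheaf of the diagonal immersion
$X \hookrightarrow X \times_B X$.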

Starting with the diagram and with the kernel
given by the normalized relative universal Poincaré
sheaf $P$ on the fibered product $X \times_B \hat{X}$, we define
the relative Fourier-Mukai transform as


\[ \Phi = \Phi^P : D(X) \to D(\hat{X}), \qquad F \mapsto \Phi(F) = R\hat{\pi}_*(\pi^* F \otimes P) \]

Note that $\Phi(F)$ can be generalized if we allow
changes in the base space $B$, that is, we consider
base-change morphisms $g : S \to B$.

We close this section with some remarks: 


• An important feature of Fourier-Mukai functors
is that they are exact as functors of triangulated
categories. In more familiar terms, we can say
that for any exact sequence $0 \to N \to F \to G \to 0$
of coherent sheaves in $X$, we obtain an exact
sequence


\[ \cdots \to \Phi^{i-1}(G) \to \Phi^i(N) \to \Phi^i(F) \to \Phi^i(G) \to \Phi^{i+1}(N) \to \cdots \]


where we have written $\Phi = \Phi^E$ and $\Phi^i(F) =
H^i(\Phi(F))$ denotes the $i$th cohomology sheaf of
the complex $\Phi(F)$.

Given a Fourier-Mukai functor $\Phi^E$, a complex
$F$ in $D(X)$ satisfies the WIT$_i$ condition (or is WIT$_i$)
if there is a coherent sheaf $G$ on $\hat{X}$ such that
$\Phi^E(F) \simeq G[-i]$ in $D(\hat{X})$, where $G[-i]$ is the associated
complex concentrated in degree $i$. Furthermore,
we say that $F$ satisfies the IT$_i$ condition if, in
addition, $G$ is locally free.

When the kernel $E$ is simply a sheaf $Q$ on $X \times
\hat{X}$ flat over $\hat{X}$, the cohomology and base-change
theorem (cf. Hartshorne (1977)) allows one to
show that a coherent sheaf $F$ on $X$ is IT$_i$ if and
only if $H^j(X, F \otimes Q_\xi) = 0$ for all $\xi \in \hat{X}$ and for all
$j \neq i$, where $Q_\xi$ denotes the restriction of $Q$ to
$X \times \{\xi\}$, and $F$ is WIT$_0$ if and only if it is IT$_0$.

The acronym “IT” stands for “index theorem,”
while “W” stands for “weak.” This terminology
comes from Nahm transforms for connections on
tori in complex differential geometry.

• The Parseval formula for the relative Fourier-Mukai
transform was proved by Mukai in the setting of
his original Fourier-Mukai transform for abelian
varieties and can be extended to any situation
in which a Fourier-Mukai transform is fully
faithful.

• For physical applications, it is often convenient to
work in cohomology $H^\bullet(X, \mathbb{Q})$. The passage from
$D(X)$ to $H^\bullet(X, \mathbb{Q})$ can be described as follows. We
first send a complex $Z \in D(X)$ to its natural class
in the K-group; we then make use of the fact that
the Chern character ch maps $K(X) \to CH^\bullet(X) \otimes \mathbb{Q}$
and finally we apply the cycle map to $H^\bullet(X, \mathbb{Q})$.
This passage (by abuse of notation) is often denoted
by $\mathrm{ch} : D(X) \to H^{\mathrm{even}}(X, \mathbb{Q})$; it commutes with
pullbacks and transforms tensor products into
dot products. Moreover, if we substitute the
Mukai vector $v(Z) = \mathrm{ch}(Z)\sqrt{\mathrm{Td}(X)}$ for the Chern
character $\mathrm{ch}(Z)$ then we find the commutative
diagram
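\[ \begin{array}{ccc}
D(X) & \xrightarrow{\ \Phi^E\ } & D(Y) \\
\downarrow{\scriptstyle v} & & \downarrow{\scriptstyle v} \\
H^\bullet(X, \mathbb{Q}) & \xrightarrow{\ \phi^E\ } & H^\bullet(Y, \mathbb{Q})
\end{array} \]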


This can be shown using the Grothendieck- 
Riemann-Roch theorem and the fact that the 
power series defining the Todd class starts with 
constant term 1 and thus is invertible. 


Vector Bundles for Heterotic Strings 


A compactification of the ten-dimensional heterotic 
string is given by a holomorphic, stable G-bundle V 
(with G some Lie group specified below) over a 
Calabi-Yau manifold X. The Calabi-Yau condition, 
the holomorphy and stability of V are a direct 
consequence of the required supersymmetry in the 
uncompactified spacetime. We assume that the
underlying ten-dimensional space $M_{10}$ is decomposed
as $M_{10} = M_4 \times X$, where $M_4$ (the uncompactified
spacetime) denotes the four-dimensional
Minkowski space and $X$ a six-dimensional compact
space given by a Calabi-Yau 3-fold. To be more
precise: supersymmetry requires that the connection
$A$ on $V$ satisfies


\[ F_A^{2,0} = F_A^{0,2} = 0, \qquad F_A^{1,1} \wedge J^2 = 0 \]


where $J$ denotes a Kähler form of $X$. It follows that
the connection has to be a holomorphic connection
on a holomorphic vector bundle and, in addition,
satisfies the Donaldson-Uhlenbeck-Yau equation,
which has a unique solution if and only if the vector
bundle is polystable.

In addition to $X$ and $V$, we have to specify a
$B$-field on $X$ of field strength $H$. In order to get an
anomaly-free theory, the Lie group $G$ is fixed to be
either $E_8 \times E_8$ or $\mathrm{Spin}(32)/\mathbb{Z}_2$ or one of their
subgroups and $H$ must satisfy the identity


\[ dH = \mathrm{tr}\, R \wedge R - \mathrm{Tr}\, F \wedge F \]


where $R$ and $F$ are, respectively, the associated
curvature forms of the spin connection on $X$ and the
gauge connection on $V$. Also tr refers to the trace of
the composite endomorphism of the tangent bundle
to $X$ and Tr denotes the trace in the adjoint
representation of $G$. For any closed four-dimensional
submanifold $X_4$ of the ten-dimensional spacetime
$M_{10}$, the 4-form $\mathrm{tr}\, R \wedge R - \mathrm{Tr}\, F \wedge F$ must have
trivial cohomology. Thus, a necessary topological
condition $V$ has to satisfy is $\mathrm{ch}_2(TX) = \mathrm{ch}_2(V)$,
which simplifies to $c_2(TX) = c_2(V)$ for Calabi-Yau
manifolds, $V$ being an $SU(n)$ vector bundle.


A physical interpretation of the third Chern class
can be given as a result of the decomposition of the
ten-dimensional spacetime into a four-dimensional
flat Minkowski space and $X$. The decomposition of
the corresponding ten-dimensional Dirac operator
with values in $V$ shows that massless four-dimensional
fermions are in one-to-one correspondence
with zero modes of the Dirac operator $D_V$ on
$X$. The index of $D_V$ can be effectively computed
using the Hirzebruch-Riemann-Roch theorem and
is given by


\[ \mathrm{index}(D_V) = \int_X \mathrm{Td}(X)\, \mathrm{ch}(V) = \frac{1}{2} \int_X c_3(V) \]


equivalently, we can write the index as $\mathrm{index}(D_V) =
\sum_{i=0}^{3} (-1)^i \dim H^i(X, V)$. For stable vector bundles,
we have $H^0(X, V) = H^3(X, V) = 0$ and so the index
computes the net number of fermion generations $N_{\mathrm{gen}}$
in the respective model.
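
The expression of the index in terms of $c_3(V)$ can be verified directly. As a sketch, assume $c_1(X) = 0$ (the Calabi-Yau condition) and $c_1(V) = 0$ (an $SU(n)$ bundle); then only the degree-3 part of the integrand contributes on the 3-fold $X$:

```latex
\begin{align*}
\mathrm{Td}(X) &= 1 + \tfrac{1}{12}\, c_2(X), \qquad
\mathrm{ch}(V) = n - c_2(V) + \tfrac{1}{2}\, c_3(V), \\
\int_X \mathrm{Td}(X)\, \mathrm{ch}(V)
  &= \int_X \Big( \mathrm{ch}_3(V) + \tfrac{1}{12}\, c_2(X)\, \mathrm{ch}_1(V) \Big)
   = \tfrac{1}{2} \int_X c_3(V).
\end{align*}
```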

Now it has been observed that the inclusion of
background 5-branes changes the anomaly constraint.
Various 5-brane solutions of the heterotic
string equations of motion have been discussed:
the gauge 5-brane, the symmetric 5-brane, and the
neutral 5-brane. It has been shown that the gauge
and symmetric 5-brane solutions involve finite-size
instantons of an unbroken nonabelian gauge group.
In contrast, the neutral 5-branes can be interpreted
as zero-size instantons of the SO(32) heterotic
string. The magnetic 5-brane contributes a source
term to the Bianchi identity for the 3-form $H$,


\[ dH = \mathrm{tr}\, R \wedge R - \mathrm{Tr}\, F \wedge F + n_5 \sum_{\text{five-branes}} \delta_5^{(4)} \]


and integration over a 4-cycle in $X$ gives the anomaly
constraint


\[ c_2(TX) = c_2(V) + [W] \]


The new term $\delta_5^{(4)}$ is a current that integrates to 1 in
the direction transverse to a single 5-brane whose
class is denoted by $[W]$. The class $[W]$ is the
Poincaré dual of an integer sum of all these sources
and thus $[W]$ should be an integral class, representing
a class in $H_2(X, \mathbb{Z})$. $[W]$ can be further specified
by taking into account that supersymmetry requires
that 5-branes are wrapped on holomorphic curves
and thus $[W]$ must correspond to the homology class
of holomorphic curves. This fact constrains $[W]$ to
be an algebraic class. Further, algebraic classes
include negative classes; however, these lead to
negative magnetic charges, which are unphysical,
and so they have to be excluded. This constrains $[W]$
to be an effective class. Thus, for a given Calabi-Yau
3-fold $X$ the effectivity of $[W]$ constrains the
choice of vector bundles $V$.
The study of the correspondence between the
heterotic string (on an elliptic Calabi-Yau 3-fold) 
and F-theory (on an elliptic Calabi-Yau fourfold) 
has led Friedman, Morgan, and Witten to introduce 
a new class of vector bundles which satisfy the 
anomaly constraint with [W] nonzero. As a result, 
they prove that the number obtained by integration 
of [W] over the elliptic fibers of the Calabi-Yau 
3-fold agrees with the number of 3-branes given by 
the Euler characteristic of the Calabi-Yau fourfold 
divided by 24. 


Fourier-Mukai Transforms and Spectral Covers 


Let us now describe how the construction of vector 
bundles out of spectral data (first considered in 
Hitchin and Beauville, Narasimhan, and Ramanan) 
can be easily described in the case of elliptic 
fibrations by means of the relative Fourier-Mukai 
transform. This construction was widely exploited 
by Friedman, Morgan, and Witten to construct 
stable vector bundles on elliptic Calabi-Yau three- 
folds X, which we will summarize now. 

If $V \to X$ is a vector bundle of rank $n$ which is
semistable and of degree 0 on each fiber $f$ of $X \to B$,
then its Fourier-Mukai transform $\Phi^1(V)$ is a torsion
sheaf of pure dimension 2 on $X$. The support of
$\Phi^1(V)$ is a surface $i : C \hookrightarrow X$, which is finite of degree
$n$ over $B$. Moreover, $\Phi^1(V)$ is of rank 1 on $C$ and, if
$C$ is smooth, then $\Phi^1(V) = i_* L$ is just the extension
by zero of some line bundle $L \in \mathrm{Pic}(C)$. Conversely,
given a sheaf $G \to X$ of pure dimension 2 which is
flat over $B$, then $\Phi(G)$ is a vector bundle on $X$ of
rank equal to the degree of $\mathrm{supp}(G)$ over $B$.

This correspondence between vector bundles on X 
and sheaves on X supported on finite covers of B is 
known as the spectral cover construction. The 
torsion sheaf G is called the spectral sheaf (or line 
bundle) and the surface C=supp(G) is called the 
spectral cover. 

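For the description of vector bundles on elliptic
Calabi-Yau 3-folds $X$ it is appropriate to take $i_* L$
with Chern characters given by ($\eta_E, \eta \in H^2(B, \mathbb{Q})$
and $a_E, s_E \in \mathbb{Z}$)


\[ \mathrm{ch}_0(i_* L) = 0, \qquad \mathrm{ch}_1(i_* L) = n\sigma + \pi^* \eta \]


\[ \mathrm{ch}_2(i_* L) = \sigma \pi^* \eta_E + a_E f, \qquad \mathrm{ch}_3(i_* L) = s_E \]


The characteristic classes of the rank-$n$ vector bundle
$V$ can be obtained if we apply the Grothendieck-Riemann-Roch
theorem to the projection $\pi$:


\[ \mathrm{ch}(V) = \pi_* \left[ \hat{\pi}^*(\mathrm{ch}(i_* L))\, \mathrm{ch}(P)\, \mathrm{Td}(T_{X/B}) \right] \]


where $\mathrm{Td}(T_{X/B})$ is as given above.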


384 Fourier-Mukai Transform in String Theory 


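To make sure that the construction leads to SU($n$)
vector bundles we set $\eta_E = \tfrac12 n\,c_1(B)$, giving $c_1(V) = 0$,
and the remaining Chern classes are given by

\[ c_2(V) = \pi^*(\eta)\,\sigma + \pi^*(\omega), \qquad c_3(V) = -2\gamma|_S \]

where

\[ \omega = -\tfrac{1}{24}\,c_1(B)^2\,(n^3 - n) + \tfrac12\bigl(\lambda^2 - \tfrac14\bigr)\,n\,\eta\,\bigl(\eta - n\,c_1(B)\bigr) \]

and $\gamma \in H^{1,1}(C,\mathbb{Z})$ is some cohomology class
satisfying $\pi_{C*}\gamma = 0 \in H^{1,1}(B,\mathbb{Z})$. The general solution
for $\gamma$ has been derived by Friedman, Morgan,
and Witten and is given by
$\gamma = \lambda\bigl(n\sigma|_C - \pi_C^*\eta + n\,\pi_C^*c_1(B)\bigr)$ and
$\gamma|_S = -\lambda\,\pi^*\eta\,\bigl(\pi^*\eta - n\,\pi^*c_1(B)\bigr)\,\sigma$ with
$S = C \cap \sigma$. The parameter $\lambda$ has to be determined
such that $c_1(L)$ is an integer class. If $n$ is even,
$\lambda = m$ ($m \in \mathbb{Z}$) and in addition we must impose
$\eta = c_1(B)$ modulo 2. If $n$ is odd, $\lambda = m + 1/2$.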
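As a hedged illustration of the formulas above, the following sketch checks the integrality rule for $\lambda$ and evaluates the integral of $c_3(V) = -2\gamma|_S = 2\lambda\,\eta(\eta - n c_1(B))\sigma$ over $X$. The choice of base $B = \mathbb{P}^2$ (so that $c_1(B) = 3H$ and $H^2 = 1$), the parametrization $\eta = aH$, and the function names are assumptions made for the example only; they are not taken from the text.

```python
from fractions import Fraction


def is_admissible(n, lam, a):
    """Integrality rule quoted above, for the assumed base B = P^2, eta = a*H.

    n even: lam must be an integer and eta = c_1(B) mod 2, i.e. a = 3 mod 2.
    n odd:  lam must be a half-integer m + 1/2.
    """
    lam = Fraction(lam)
    if n % 2 == 0:
        return lam.denominator == 1 and (a - 3) % 2 == 0
    return lam.denominator == 2


def c3_integral(n, lam, a):
    """Integral over X of c_3(V) = 2*lam*eta*(eta - n*c_1(B))*sigma.

    The section sigma meets each fibre once, so the integral reduces to
    2*lam * eta.(eta - n*c_1(B)) on B; with eta = a*H, c_1(B) = 3H and
    H^2 = 1 this is 2*lam*a*(a - 3n).
    """
    return 2 * Fraction(lam) * a * (a - 3 * n)


print(is_admissible(5, Fraction(1, 2), 16), c3_integral(5, Fraction(1, 2), 16))
print(is_admissible(4, 1, 13), c3_integral(4, 1, 13))
```

In heterotic model building, half of this number is usually read as a net generation number, but that interpretation goes beyond what is stated here.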

It remains to discuss the stability of V. The 
stability depends on the properties of the defining 
data C and L. If C is irreducible and L a line bundle 
over C then V will be a vector bundle stable with 
respect to the polarization 


\[ J = \epsilon J_0 + \pi^*H_B, \qquad \epsilon > 0 \]


if $\epsilon$ is sufficiently small. This has been proved by
Friedman, Morgan, and Witten under the additional
assumption that the restriction of $V$ to the generic
fiber is regular and semistable. Here $J_0$ refers to
some arbitrary Kähler class on $X$ and $H_B$ to a Kähler
class on the base $B$. It implies that the bundle $V$ can
be taken to be stable with respect to $J$ while keeping
the volume of the fiber $f$ of $X$ arbitrarily small
compared to the volumes of effective curves associated
with the base. That $J$ is actually a good
polarization can be seen by assuming $\epsilon = 0$. Now we
observe that $\pi^*H_B$ is not a Kähler class on $X$:
its integral is non-negative on each effective curve $C$
in $X$, but there is one curve, the fiber $f$, where
the integral vanishes. This means that $\pi^*H_B$ is on the
boundary of the Kähler cone and, to make $V$ stable,
we have to move slightly into the interior of the
Kähler cone, that is, into the chamber which is
closest to the boundary point $\pi^*H_B$. Also we note
that although $\pi^*H_B$ is in the boundary of the Kähler
cone, we can still define the slope $\mu_{\pi^*H_B}(V)$ with
respect to it. Since $(\pi^*H_B)^2$ is some positive multiple
of the class of the fiber $f$, semistability with respect
to $\pi^*H_B$ is implied by the semistability of the
restrictions $V|_f$ to the fibers. Assume that $V$ is not
stable with respect to $J$; then there is a destabilizing
sub-bundle $V' \subset V$ with $\mu_J(V') \geq \mu_J(V)$. But
semistability along the fibers says that
$\mu_{\pi^*H_B}(V') \leq \mu_{\pi^*H_B}(V)$. If we had equality, it would follow that
$V'$ arises by the spectral construction from a proper
subvariety of the spectral cover of $V$, contradicting
the assumption that this cover is irreducible. So we
must have a strict inequality $\mu_{\pi^*H_B}(V') < \mu_{\pi^*H_B}(V)$.
Now taking $\epsilon$ small enough, we can ensure that
$\mu_J(V') < \mu_J(V)$; thus $V'$ cannot destabilize $V$.
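
The role of small $\epsilon$ in the last step can be made a little more explicit. The following is only a sketch: it assumes the usual threefold normalization of the slope, $\mu_J(W) = \frac{1}{\mathrm{rk}\,W}\int_X c_1(W)\wedge J^2$, which is not spelled out in the text, and it ignores the boundedness argument needed to choose $\epsilon$ uniformly over all candidate sub-bundles.

```latex
% Assumed normalization: \mu_J(W) = \frac{1}{\mathrm{rk}\,W}\int_X c_1(W)\wedge J^2,
% with J = \epsilon J_0 + \pi^*H_B.
\mu_J(W) = \mu_{\pi^*H_B}(W)
  + \frac{2\epsilon}{\mathrm{rk}\,W}\int_X c_1(W)\wedge J_0\wedge\pi^*H_B
  + \frac{\epsilon^2}{\mathrm{rk}\,W}\int_X c_1(W)\wedge J_0^2 ,
\qquad\text{so}\qquad
\mu_J(V') - \mu_J(V)
  = \underbrace{\mu_{\pi^*H_B}(V') - \mu_{\pi^*H_B}(V)}_{<\,0}
  + O(\epsilon).
```

The leading term is strictly negative and independent of $\epsilon$, so the $O(\epsilon)$ corrections cannot change the sign once $\epsilon$ is small enough.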


D-Branes and Homological Mirror 
Symmetry 


Kontsevich proposed a homological mirror symme- 
try for a pair (X, Y) of mirror dual Calabi-Yau 
manifolds; it is conjectured that there exists a 
categorical equivalence between the bounded 
derived category D(X) and Fukaya's $A_\infty$ category
F(Y), which is defined by using the symplectic 
structure on Y. A Lagrangian submanifold with a 
flat bundle gives an object of F(Y). If we consider a 
locally trivial family of symplectic manifolds Y (i.e., 
the symplectic form is locally constant as we vary Y 
in the family) the object of F(Y) undergoes mono- 
dromy transformations going round a loop in the 
base. On the other hand, the object of D(X) is a 
complex of coherent sheaves on X and under the 
categorical equivalence between D(X) and F(Y) the 
monodromy (of 3-cycles) is mapped to certain self- 
equivalences in D(X). 

Since all elements in $D(X)$ can be represented by
suitable complexes of vector bundles on $X$, we can
consider the topological K-group and the image
$K_{\mathrm{hol}}(X)$ of $D(X)$. The Fourier-Mukai transform
$\Phi^{\mathcal{E}}\colon D(X) \to D(X)$ then induces a corresponding
automorphism $K_{\mathrm{hol}}(X) \to K_{\mathrm{hol}}(X)$ and also an
automorphism on $H^{\mathrm{even}}(X,\mathbb{Q})$ if we use the Chern
character ring homomorphism $\mathrm{ch}\colon K(X) \to H^{\mathrm{even}}(X,\mathbb{Q})$,
as described above. With this in mind, we
can introduce various kernels and their associated
monodromy transformations.

For instance, let $D$ be the associated divisor defining
the large-radius limit in the Kähler moduli space and
consider the kernel $\mathcal{O}_\Delta(D)$, with $\Delta$ being the diagonal
in $X \times X$. The corresponding Fourier-Mukai transform
acts on an object $G \in D(X)$ as twisting by a line
bundle, that is, $G \mapsto G \otimes \mathcal{O}(D)$. This automorphism is
then identified with the monodromy about the large
complex structure limit point (LCSL point) in the
complex structure moduli space.

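Furthermore, if we consider the kernel given by
the ideal sheaf $\mathcal{I}_\Delta$ of $\Delta$, we find that the action of
$\Phi^{\mathcal{I}_\Delta}$ on $H^{\mathrm{even}}(X)$ can be expressed by taking the
Chern character ring homomorphism:


\[ \mathrm{ch}\bigl(\Phi^{\mathcal{I}_\Delta}(G)\bigr) = \mathrm{ch}_0\bigl(\Phi^{\mathcal{O}_{X\times X}}(G)\bigr) - \mathrm{ch}(G) = \Bigl(\int_X \mathrm{ch}(G)\cdot\mathrm{Td}(X)\Bigr) - \mathrm{ch}(G) \]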


Kontsevich proposed that this automorphism 
should reproduce the monodromy about the princi- 
pal component of the discriminant of the mirror 
family Y. At the principal component we have 
vanishing $S^3$ cycles (and the conifold singularity),
thus the action of this monodromy on cohomology 
may be identified with the Picard—Lefschetz formula. 

Now for a given pair of mirror dual Calabi-Yau 
3-folds, it is generally assumed that A-type and 
B-type D-branes exchange under mirror symmetry. 
For such a pair, Kontsevich's correspondence 
between automorphisms of D(X) and monodromies 
of 3-cycles can then be tested. More specifically, a 
comparison relies on the identification of two 
central charges associated to D-brane configurations 
on both sides of the mirror pair. 

For this, we first have to specify a basis for the
3-cycles $\Sigma_i \in H_3(Y,\mathbb{Z})$ such that the intersection form
takes the canonical form $\Sigma_i\cdot\Sigma_j = \delta_{j,\,i+b^{2,1}+1} = \eta_{i,j}$ for
$i = 0, \ldots, b^{2,1}$. It follows that a 3-brane wrapped about
the cycle $\Sigma = \sum_i n_i\Sigma_i$ has an (electric, magnetic)
charge vector $n = (n_i)$. The periods of the holomorphic
3-form $\Omega$ are then given by


\[ \Pi_i = \int_{\Sigma_i} \Omega \]


and can be used to provide projective coordinates on
the complex structure moduli space. If we choose a
symplectic basis $(A_i, B_j)$ of $H_3(Y,\mathbb{Z})$ then the $A_i$
periods serve as projective coordinates and the $B_j$
periods satisfy the relations $\Pi^j = \eta_{i,j}\,\partial\mathcal{F}/\partial\Pi^i$, where
$\mathcal{F}$ is the prepotential which has, near the large-radius
limit, the asymptotic form (as analyzed by
Candelas, Klemm, Theisen, Yau, and Hosono, cf.
"Further reading"):


\[ \mathcal{F} = \frac16\sum_{abc} k_{abc}\,t_a t_b t_c + \frac12\sum_{ab} c_{ab}\,t_a t_b - \sum_a \frac{c_2(X)J_a}{24}\,t_a + \frac{\zeta(3)}{2(2\pi i)^3}\,\chi(X) + \text{const.} \]


where $\chi(X)$ is the Euler characteristic of $X$, $c_{ab}$ are
rational constants (with $c_{ab} = c_{ba}$) reflecting an
$Sp(2b^{1,1}+2)$ ambiguity, and $k_{abc}$ is the classical
triple intersection number given by


\[ k_{abc} = \int_X J_a \wedge J_b \wedge J_c \]


The periods determine the central charge $Z(n)$ of a
3-brane wrapped about the cycle $\Sigma = \sum_i n_i[\Sigma_i]$:


\[ Z(n) = \int_\Sigma \Omega = \sum_i n_i\Pi_i \]

On the other hand, the central charge associated
with an object $E$ of $D(X)$ is given by


\[ Z(E) = -\int_X e^{-t_a J_a}\,\mathrm{ch}(E)\Bigl(1 + \frac{c_2(X)}{24}\Bigr) \]


Now, physically it is assumed that the two central
charges are to be identified under mirror symmetry.
If we compare the two central charges $Z(n)$ and
$Z(E)$, then we obtain a map relating the Chern
characters $\mathrm{ch}(E)$ of $E$ to the D-brane charges $n$. If we
insert the expressions for $\mathrm{ch}(E)$ in $\mathrm{ch}(\Phi^{\mathcal{I}_\Delta}(E))$, it
yields a linear transformation acting on $n$, such that
$n_6 \to n_6 + n_0$, which agrees with the monodromy
transformation about the conifold locus.

Similarly, the monodromy transformation about
the LCSL point corresponding to automorphisms
$[E] \to [E \otimes \mathcal{O}_X(D)]$ can be made explicit.

Using the central charge identification, the auto- 
morphism/monodromy correspondence has been 
made explicit for various dual pairs of mirror 
Calabi-Yau 3-folds (given as hypersurfaces in 
weighted projective spaces). This identification pro- 
vides evidence for Kontsevich's proposal of homo- 
logical mirror symmetry. 


See also: Derived Categories; Mirror Symmetry: A
Geometric Survey.

Introduction


Manifolds of dimension 4 play a distinguished role 
in physics and have done so ever since special and 
general relativity ushered in the celebrated four- 
dimensional spacetime. It is also the case that 
manifolds of dimension 4 play a distinguished role 
in mathematics: many generalities about manifolds 
of a general dimension do not apply in dimension 4; 
there are also phenomena in dimension 4 with no 
counterpart in other dimensions. 

This article describes some of the more important 
physical and mathematical properties of dimension 4. 
We begin with an account of some topological and 
geometric properties for manifolds in general, but 
avoiding dimension 4, and then embark on the 
dimension 4 discussion. The references at the end 
will serve to take the reader further into the subject. 


Topological, Piecewise-Linear, and 
Differentiable Structures for Manifolds 


In dealing with topological spaces which are mani- 
folds, one distinguishes three types of manifolds M: 
topological, piecewise-linear, and differentiable (also 
called smooth). It is possible to describe the more 
important differences between these three types 
using topological techniques. 

Consider then a manifold $M$ of dimension $n$; $M$ will
always be assumed to be compact, connected and
closed unless we indicate the contrary. The type of $M$ is
determined by examining whether the transition
functions $g_{\alpha\beta}$ are homeomorphisms, (invertible) piecewise-linear
maps, or diffeomorphisms. Now, since the
transition functions are maps from one subset of $\mathbb{R}^n$ to
another, we introduce the groups $TOP_n$, $PL_n$, and
$DIFF_n$, which are all the homeomorphisms, piecewise-linear
maps, and diffeomorphisms of $\mathbb{R}^n$, respectively.
We are naturally led to the three sets of inclusions:


\[
\begin{aligned}
TOP_1 &\subset TOP_2 \subset \cdots \subset TOP_n \subset \cdots \\
PL_1 &\subset PL_2 \subset \cdots \subset PL_n \subset \cdots \\
DIFF_1 &\subset DIFF_2 \subset \cdots \subset DIFF_n \subset \cdots
\end{aligned}
\qquad [1]
\]

For each of the three sets of inclusions we pass to 
the direct limit and construct the three limiting 
groups 


\[ TOP, \quad PL, \quad DIFF \qquad [2] \]


With these three groups are associated the classifying
spaces $BTOP$, $BPL$, and $BDIFF$. The transition
functions $g_{\alpha\beta}$ are those of the tangent bundle to $M$,
and there are three possible tangent bundles depending
on the type of $M$; we denote these tangent bundles
by $TM_{TOP}$, $TM_{PL}$, and $TM_{DIFF}$ in an obvious notation.
Then to determine the tangent bundles $TM_{TOP}$,
$TM_{PL}$, and $TM_{DIFF}$ one simply selects an element of the
homotopy classes


\[ [M, BTOP], \quad [M, BPL], \quad \text{and} \quad [M, BDIFF] \qquad [3] \]


respectively.

Given this threefold hierarchy of manifold struc- 
tures one wishes to know when one can straighten 
out a topological manifold to make it piecewise 
linear; and also, when can one smooth a piece- 
wise-linear manifold to make it differentiable? 


If $\dim M \geq 5$, these two questions can be
formulated as lifting problems.


TOP versus PL for dim M ≠ 4


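Taking the first of them, so that we are comparing
piecewise-linear and topological structures on $M$,
one can check that $BPL$ fibers over $BTOP$ with fiber
$TOP/PL$, yielding


\[
\begin{array}{ccc}
TOP/PL & \longrightarrow & BPL \\
 & & \big\downarrow{\scriptstyle\pi} \\
 & & BTOP
\end{array}
\qquad [4]
\]

A method for straightening out a topological manifold is now
apparent: a topological manifold is a choice of
map $\alpha\colon M \to BTOP$, and a factorization of $\alpha$ through
$BPL$ will give $M$ a PL structure. We show this below:


\[
\begin{array}{ccc}
 & & BPL \\
 & {\scriptstyle\beta}\nearrow & \big\downarrow{\scriptstyle\pi} \\
M & \xrightarrow{\ \alpha\ } & BTOP
\end{array}
\qquad \alpha = \pi\circ\beta \qquad [5]
\]


The existence of the map $\beta\colon M \to BPL$ satisfying
$\alpha = \pi\circ\beta$ provides $M$ with a PL structure and is a
lifting of the map $\alpha$ from the base $BTOP$ to the total
space $BPL$.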

This lifting method, for passing from TOP
structures to PL structures, does work, provided
$\dim M \geq 5$, since we have the stability result that


\[ \frac{TOP_n}{PL_n} \simeq \frac{TOP}{PL}, \qquad n \geq 5 \qquad [6] \]


For the map $\beta$ to exist, the obstructions to the lifting,
which are cohomology classes of the form


\[ H^{k+1}\bigl(M; \pi_k(TOP/PL)\bigr) \qquad [7] \]


must vanish. However, Kirby and Siebenmann have
shown that


\[ TOP/PL \simeq K(\mathbb{Z}_2, 3) \qquad [8] \]


where $K(\mathbb{Z}_2,3)$ is an Eilenberg-Mac Lane space, so that
its sole nonvanishing homotopy group is in dimension 3,
giving us


\[ \pi_n(TOP/PL) = \begin{cases} \mathbb{Z}_2 & \text{if } n = 3 \\ 0 & \text{otherwise} \end{cases} \qquad [9] \]

Any obstruction to $\beta$'s existence is therefore a class $e(M)$,
say, in


\[ H^4(M; \mathbb{Z}_2), \qquad \dim M \geq 5 \qquad [10] \]


When $e(M)$ vanishes, the map $\beta$ exists and furnishes
$M$ with a PL structure; it is then natural to go
on to ask how many (homotopy classes of) such $\beta$'s
exist. Standard obstruction theory says the relevant
homotopy classes are just the whole cohomology
group


\[ H^k\bigl(M; \pi_k(TOP/PL)\bigr) \qquad [11] \]

which, since $k = 3$, is just

\[ H^3(M; \mathbb{Z}_2) \qquad [12] \]


So, for $\dim M \geq 5$, we see that when a closed
topological manifold $M$ acquires a PL structure by
the lifting process just described, then the possible
distinct PL structures are in one-to-one correspondence
with the elements of


\[ H^3(M; \mathbb{Z}_2) \qquad [13] \]


which is not zero in general.
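
As a small worked instance of this count, consider the $n$-torus $T^n$ with $n \geq 5$; this choice of manifold is an assumption made for illustration and does not appear in the text. The cohomology of $T^n$ is an exterior algebra on $n$ degree-1 generators, so $H^3(T^n;\mathbb{Z}_2) \cong (\mathbb{Z}_2)^{\binom{n}{3}}$. The sketch below only counts elements of this group, that is, liftings up to homotopy; whether the resulting PL manifolds are distinct up to PL homeomorphism is a separate question not addressed here.

```python
from math import comb


def order_of_h3_z2_torus(n):
    """Number of elements of H^3(T^n; Z_2) = (Z_2)^C(n, 3)."""
    return 2 ** comb(n, 3)


for n in (5, 6):
    print(n, order_of_h3_z2_torus(n))
```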

Finally, if $\dim M \leq 3$, then the notions PL and
TOP coincide, so we are left with the case $\dim M = 4$,
which we shall come to below. Now we wish to
describe the next step in the sequence TOP, PL,
DIFF, which is the smoothing problem.


PL versus DIFF for dim M ≠ 4


Similar ideas are used to address the question of
smoothing a piecewise-linear manifold; however,
the results are different. Let us assume that $M$ is a
closed PL manifold with $\dim M \geq 5$. This time the
fibration is


\[
\begin{array}{ccc}
PL/DIFF & \longrightarrow & BDIFF \\
 & & \big\downarrow \\
 & & BPL
\end{array}
\qquad [14]
\]


The smoothing of a piecewise-linear $M$ can also be
handled with obstruction theory and leads us immediately
to the consideration of the homotopy groups
$\pi_n(PL/DIFF)$. This time the nontrivial homotopy
groups of the fiber are much more numerous than in
the previous (TOP versus PL) case. In fact one has


\[
\pi_n(PL/DIFF) =
\begin{cases}
0 & \text{if } n \leq 6 \\
\mathbb{Z}_{28} & \text{if } n = 7 \\
\mathbb{Z}_2 & \text{if } n = 8 \\
\ \vdots & \\
\mathbb{Z}_{992} & \text{if } n = 11 \\
\ \vdots &
\end{cases}
\qquad [15]
\]


The obstructions to passing from a PL to a DIFF
structure on $M$ now lie in


\[ H^{k+1}\bigl(M; \pi_k(PL/DIFF)\bigr) \qquad [16] \]


and the distinct liftings are classified by the
cohomology group


\[ H^k\bigl(M; \pi_k(PL/DIFF)\bigr) \qquad [17] \]

As an illustration of all this, consider the case
$M = S^7$; then the first nontriviality occurs when $k = 7$
and so the obstruction to smoothing $S^7$ lies in


\[ H^8\bigl(S^7; \pi_7(PL/DIFF)\bigr) \qquad [18] \]


which is of course zero. This means that $S^7$ can be
smoothed, a fact which we know from first
principles. However, by the obstruction theory
introduced above, the resulting smooth structures
are classified by


\[ H^7\bigl(S^7; \pi_7(PL/DIFF)\bigr) = H^7(S^7; \mathbb{Z}_{28}) = \mathbb{Z}_{28} \qquad [19] \]


Hence, we have the celebrated result of Milnor, and
of Kervaire and Milnor, that $S^7$ has 28 distinct
differentiable structures, 27 of which correspond to
what are known as exotic spheres.

Lastly, if $\dim M \leq 3$, then PL and DIFF coincide;
this leaves us with the case of greatest interest,
namely $\dim M = 4$.


The Strange Case of Four Dimensions 


In four dimensions there are phenomena which have 
no counterpart in any other dimension. First of all, 
there are topological 4-manifolds which have no 
smooth structure, though if they have a PL structure, 
then they possess a unique smooth structure. Second, 
the impediment to the existence of a smooth structure 
is of a completely different type to that met in the 
standard obstruction theory — it is not the pullback of 
an element in the cohomology of a classifying space, 
that is, it is not a characteristic class. Also the four- 
dimensional story is far from completely known. 
Nevertheless, there are some very striking results 
dating from the early 1980s onwards. 

We begin by disposing of the difference between
PL and DIFF structures: our earlier results together
with the vanishing statement


\[ \pi_n(PL/DIFF) = 0, \qquad n \leq 6 \qquad [20] \]


mean that every PL 4-manifold possesses a unique
DIFF structure. Thus, we can take the crucial
difference to be between DIFF and TOP.

In Freedman (1982), all simply connected topological
4-manifolds were classified by their intersection
form $q$.

We recall that $q$ is a quadratic form constructed
from the cohomology of $M$ as follows: take two
elements $\alpha$ and $\beta$ of $H^2(M;\mathbb{Z})$ and form their cup
product $\alpha\cup\beta \in H^4(M;\mathbb{Z})$; then we define $q(\alpha,\beta)$ by


\[ q(\alpha,\beta) = (\alpha\cup\beta)[M] \qquad [21] \]


where $(\alpha\cup\beta)[M]$ denotes the integer obtained by
evaluating $\alpha\cup\beta$ on the generating cycle $[M]$ of the
top homology group $H_4(M;\mathbb{Z})$ of $M$. Poincaré
duality ensures that such a form is always nondegenerate
over $\mathbb{Z}$ and so has $\det q = \pm 1$; $q$ is then
called unimodular. Also we refer to $q$ as "even" if
all its diagonal entries are even, and as "odd"
otherwise.

Freedman's work yields the following: 


Theorem (Freedman). A simply connected
4-manifold $M$ with even intersection form $q$ belongs
to a unique homeomorphism class, while if $q$ is
odd there are precisely two nonhomeomorphic
manifolds $M$ with $q$ as their intersection form.


This is a very powerful result — the intersection 
form q very nearly determines the homeomorphism 
class of a simply connected M, and actually only 
fails to do so in the odd case where there are still 
just two possibilities. Further, every unimodular 
quadratic form occurs as the intersection form of 
some manifold. 

As an illustration of the impressive nature of
Freedman's work, choose $M$ to be the sphere $S^4$.
Since $H^2(S^4;\mathbb{Z})$ is trivial, $q$ is the zero quadratic
form and is of course even; we write this as $q = \emptyset$.
Now recollect that the Poincaré conjecture in four
dimensions is the statement that any homotopy
4-sphere, $S^4_h$ say, is actually homeomorphic to $S^4$.
Well, since $H^2(S^4_h;\mathbb{Z})$ is also trivial, any $S^4_h$ also
has intersection form $q = \emptyset$. Applying Freedman's
theorem to $S^4_h$ immediately asserts that $S^4_h$ belongs to
a unique homeomorphism class, which must be that
of $S^4$, thereby establishing the Poincaré conjecture.

Freedman's result combined with a much earlier
result of Rohlin (1952) also gives us an example of a
nonsmoothable 4-manifold: Rohlin's theorem asserts
that, given a smooth, simply connected 4-manifold
with even intersection form $q$, the signature $\sigma(q)$
of $q$ is divisible by 16, the signature of $q$ being
defined to be the difference between the number of
positive and negative eigenvalues of $q$.

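Now write


\[
q = \begin{pmatrix}
 2 & -1 &  0 &  0 &  0 &  0 &  0 &  0 \\
-1 &  2 & -1 &  0 &  0 &  0 &  0 &  0 \\
 0 & -1 &  2 & -1 &  0 &  0 &  0 &  0 \\
 0 &  0 & -1 &  2 & -1 &  0 &  0 &  0 \\
 0 &  0 &  0 & -1 &  2 & -1 &  0 & -1 \\
 0 &  0 &  0 &  0 & -1 &  2 & -1 &  0 \\
 0 &  0 &  0 &  0 &  0 & -1 &  2 &  0 \\
 0 &  0 &  0 &  0 & -1 &  0 &  0 &  2
\end{pmatrix} = E_8 \qquad [22]
\]


($E_8$ is actually the Cartan matrix for the exceptional
Lie algebra $e_8$); then, by inspection, $q$ is even, and
by calculation, it has signature 8. By Freedman's
theorem there is a single, simply connected 4-manifold
with intersection form $q = E_8$. However, by
Rohlin's theorem, it cannot be smoothed, since its
signature is 8, which is not divisible by 16.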
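
The phrases "by inspection" and "by calculation" can be checked mechanically. The sketch below is one possible verification, not the method used in the literature: it computes the leading principal minors of the matrix in [22] exactly over the rationals and uses Sylvester's criterion (all leading minors positive implies positive definite, hence signature equal to the size of the matrix). The helper names are hypothetical.

```python
from fractions import Fraction

# The matrix of equation [22].
E8 = [
    [ 2, -1,  0,  0,  0,  0,  0,  0],
    [-1,  2, -1,  0,  0,  0,  0,  0],
    [ 0, -1,  2, -1,  0,  0,  0,  0],
    [ 0,  0, -1,  2, -1,  0,  0,  0],
    [ 0,  0,  0, -1,  2, -1,  0, -1],
    [ 0,  0,  0,  0, -1,  2, -1,  0],
    [ 0,  0,  0,  0,  0, -1,  2,  0],
    [ 0,  0,  0,  0, -1,  0,  0,  2],
]


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def leading_minors(matrix):
    """Determinants of the top-left k x k blocks, k = 1..n."""
    n = len(matrix)
    return [determinant([row[:k] for row in matrix[:k]]) for k in range(1, n + 1)]


minors = leading_minors(E8)
is_even = all(E8[i][i] % 2 == 0 for i in range(8))
is_unimodular = abs(minors[-1]) == 1
is_positive_definite = all(m > 0 for m in minors)
signature = len(E8) if is_positive_definite else None

print(minors, is_even, is_unimodular, signature)
print("divisible by 16:", signature % 16 == 0)
```

With these checks, the form is even and unimodular with signature 8, and 8 is not a multiple of 16, which is exactly the input needed for the Rohlin argument above.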

The next breakthrough was due to Donaldson
(1983). Donaldson's theorem is applicable to definite
forms $q$, which by appropriate choice of
orientation on $M$ we can take to be positive definite.
One has:


Theorem (Donaldson). The intersection form $q$ of a
simply connected, smooth 4-manifold with $q$ positive
definite is always diagonalizable over the integers to
$q = \mathrm{diag}(1, \ldots, 1) = I$.


One can immediately deduce that no simply
connected 4-manifold for which $q$ is even and
positive definite can be smoothed!

For example, the manifold with $q = E_8 \oplus E_8$ has
signature 16, so Rohlin's theorem does not exclude a
smooth structure. But since $E_8$ is even, then so is
$E_8 \oplus E_8$, and so Donaldson's
theorem forbids such a manifold from existing
smoothly.

In fact, in contrast to Freedman's theorem, which
allows all unimodular quadratic forms to occur as
the intersection form of some topological manifold,
Donaldson's theorem says that in the positive-definite,
smooth case only one quadratic form is
allowed, namely $I$.

Donaldson’s work makes contact with physics 
because it uses the Yang-Mills equations as we now 
outline. 

Let $A$ be a connection on a principal SU(2) bundle
over a simply connected 4-manifold $M$ with positive-definite
intersection form. If the curvature
2-form of $A$ is $F$, then $F$ has an $L^2$ norm which is
the Euclidean Yang-Mills action $S$. One has


\[ S = \|F\|^2 = -\int_M \mathrm{tr}(F \wedge *F) \qquad [23] \]


where $*F$ is the usual dual 2-form to $F$. The minima
of the action $S$ are given by those $A$, called
instantons, which satisfy the famous self-duality
equations


\[ F = *F \qquad [24] \]


Given one instanton $A$ which minimizes $S$, one can
perturb about $A$ in an attempt to find more
instantons. This process is successful and the space
of all instantons can be fitted together to form a
global moduli space of finite dimension. For the
instanton which provides the absolute minimum of
$S$, the moduli space $\mathcal{M}$ is a noncompact space of
dimension 5.

We can now summarize the logic that is used to
prove Donaldson's theorem: there are very strong
relationships between $M$ and the moduli space $\mathcal{M}$;
for example, let $q$ be regarded as an $n \times n$ matrix
with precisely $p$ unit eigenvalues (clearly $p \leq n$, and
Donaldson's theorem is just the statement that
$p = n$); then $\mathcal{M}$ has precisely $p$ singularities which
look like cones on the space $CP^2$. These combine to
produce the result that the 4-manifold $M$ has the
same topological signature $\mathrm{Sign}(M)$ as $p$ copies of
$CP^2$; and so they have signature $a - b$, where $a$ of
the $CP^2$'s are oriented as usual and $b$ have the
opposite orientation. Thus,


\[ \mathrm{Sign}(M) = a - b \qquad [25] \]


Now by definition, $\mathrm{Sign}(M)$ is the signature $\sigma(q)$ of
the intersection form $q$ of $M$. But, by assumption, $q$ is
positive definite $n \times n$, so $\sigma(q) = n = \mathrm{Sign}(M)$. Hence,


\[ n = a - b \qquad [26] \]

However, $a + b = p$ and $p \leq n$, so we can say that

\[ n = a - b, \qquad p = a + b \leq n \qquad [27] \]

but one always has $a + b \geq a - b$, so we have

\[ n \leq p \leq n \;\Rightarrow\; p = n \qquad [28] \]


which is Donaldson's theorem.


Donaldson's Polynomial Invariants 


Donaldson extended his work by introducing poly- 
nomial invariants also derived from Yang-Mills 
theory and to discuss them we must introduce 
some notation. 

Let $M$ be a smooth, simply connected, orientable
Riemannian 4-manifold without boundary and $A$ be
an SU(2) connection which is anti-self-dual so that


\[ F = -*F \qquad [29] \]


Then the space of all gauge-inequivalent solutions to
this anti-self-duality equation, the moduli space
$\mathcal{M}_k$, has a dimension given by the integer


\[ \dim \mathcal{M}_k = 8k - 3(1 + b_2^+) \qquad [30] \]


Here $k$ is the instanton number which gives the
topological type of the solution $A$. The instanton
number is minus the second Chern class $c_2(F) \in
H^4(M;\mathbb{Z})$ of the bundle on which $A$ is defined.
This means that we have


\[ k = -c_2(F)[M] = \frac{1}{8\pi^2}\int_M \mathrm{tr}(F \wedge F) \in \mathbb{Z} \qquad [31] \]


The number $b_2^+$ is defined to be the rank of the
positive part of the intersection form $q$ of $M$.

A Donaldson invariant $q^M_{d,r}$ is a symmetric integer
polynomial of degree $d$ in the 2-homology $H_2(M;\mathbb{Z})$
of $M$:


\[ q^M_{d,r}\colon \underbrace{H_2(M) \times \cdots \times H_2(M)}_{d\ \text{factors}} \to \mathbb{Z} \qquad [32] \]


Given a certain map $m_i$,

\[ m_i\colon H_i(M) \to H^{4-i}(\mathcal{M}_k) \qquad [33] \]


if $\alpha \in H_2(M)$ and $*$ represents a point in $M$, we
define $q^M_{d,r}(\alpha)$ by writing


\[ q^M_{d,r}(\alpha) = m_2^d(\alpha)\, m_0^r(*)\,[\mathcal{M}_k] \qquad [34] \]


The evaluation on $[\mathcal{M}_k]$ on the RHS of the above
equation means that


\[ 2d + 4r = \dim \mathcal{M}_k \qquad [35] \]


so that $\mathcal{M}_k$ is even dimensional; this is achieved by
requiring $b_2^+$ to be odd.
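
The bookkeeping in [30] and [35] is easy to tabulate. The sketch below is purely arithmetic and treats the dimension formula formally (no claims about genericity or compactification); the function names are hypothetical, and the K3 value $b_2^+ = 3$ is a standard fact rather than something stated in the text.

```python
def instanton_moduli_dim(k, b2_plus):
    """dim M_k = 8k - 3(1 + b2+), equation [30]."""
    return 8 * k - 3 * (1 + b2_plus)


def degree_pairs(k, b2_plus):
    """All (d, r) with d, r >= 0 and 2d + 4r = dim M_k, equation [35].

    Returns an empty list when the dimension is negative or odd,
    which is what happens for even b2+.
    """
    dim = instanton_moduli_dim(k, b2_plus)
    if dim < 0 or dim % 2:
        return []
    return [(dim // 2 - 2 * r, r) for r in range(dim // 4 + 1)]


print(instanton_moduli_dim(1, 0), degree_pairs(1, 0))   # b2+ = 0, k = 1
print(instanton_moduli_dim(4, 3), degree_pairs(4, 3))   # K3 (b2+ = 3), k = 4
```

For $b_2^+ = 0$ and $k = 1$ the formula gives an odd dimension, so no pair $(d, r)$ exists, while for odd $b_2^+$ the dimension is even, in line with the parity remark above.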

Now the Donaldson invariants $q^M_{d,r}$ are differential
topological invariants rather than topological invariants,
but they are difficult to calculate as they require
detailed knowledge of the instanton moduli space
$\mathcal{M}_k$. However, they are nontrivial and their values
are known for a number of 4-manifolds $M$. For
example, if $M$ is a complex algebraic surface, a
positivity argument shows that they are nonzero
when $d$ is large enough. Conversely, if $M$ can be
written as the connected sum


\[ M = M_1 \# M_2 \]


where both $M_1$ and $M_2$ have $b_2^+ > 0$, then they all
vanish.


Topological Quantum Field Theories 


Turning now to physics, it is time to point out that
the $q^M_{d,r}$ can also be obtained (Witten 1988) as the
correlation functions of a twisted $N = 2$ supersymmetric
topological quantum field theory.

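The action $S$ for this theory is given by


\[
\begin{aligned}
S = \int_M d^4x\,\sqrt{g}\;\mathrm{tr}\Bigl\{ & \tfrac14 F_{\mu\nu}F^{\mu\nu} + \tfrac14 F^*_{\mu\nu}F^{\mu\nu}
  + \tfrac12\,\phi D_\mu D^\mu\lambda + i D_\mu\psi_\nu\,\chi^{\mu\nu} - i\eta D_\mu\psi^\mu \\
 & - \tfrac{i}{8}\,\phi[\chi_{\mu\nu},\chi^{\mu\nu}] - \tfrac{i}{2}\,\lambda[\psi_\mu,\psi^\mu]
  - \tfrac{i}{2}\,\phi[\eta,\eta] - \tfrac18[\phi,\lambda]^2 \Bigr\}
\end{aligned}
\qquad [36]
\]


where $F_{\mu\nu}$ is the curvature of a connection $A_\mu$ and
$(\phi, \lambda, \eta, \psi_\mu, \chi_{\mu\nu})$ are a collection of fields introduced
in order to construct the right supersymmetric
theory; $\phi$ and $\lambda$ are both spinless while the multiplet
$(\eta, \psi_\mu, \chi_{\mu\nu})$ contains the components of a 0-form, a
1-form, and a self-dual 2-form, respectively.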

The significance of this choice of multiplet is that
the instanton deformation complex used to calculate
$\dim \mathcal{M}_k$ contains precisely these fields.

Even though $S$ contains a metric, its correlation
functions are independent of the metric $g$, so that $S$
can still be regarded as defining a topological quantum
field theory. This is because both $S$ and its associated
energy-momentum tensor $T = \delta S/\delta g$ can be written
as BRST commutators $S = \{Q, V\}$, $T = \{Q, V'\}$ for
suitable $V$ and $V'$.
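
The way BRST exactness leads to metric independence can be sketched in one line. The following is a formal argument under stated assumptions that are not spelled out in the text: the observable $P$ is $Q$-invariant and does not itself depend on the metric, the functional measure is $Q$-invariant so that $\langle\{Q, X\}\rangle = 0$ for any $X$, and graded sign conventions are suppressed.

```latex
% Formal sketch; \langle P\rangle = \int \mathcal{D}\mathcal{F}\, e^{-S} P,
% with \{Q, P\} = 0 and \langle \{Q, X\} \rangle = 0 assumed.
\frac{\delta}{\delta g}\,\langle P \rangle
  = -\int \mathcal{D}\mathcal{F}\; e^{-S}\,\frac{\delta S}{\delta g}\,P
  = -\bigl\langle \{Q, V'\}\,P \bigr\rangle
  = -\bigl\langle \{Q, V' P\} \bigr\rangle
  = 0 .
```

The same manipulation with $S = \{Q, V\}$ in place of $T$ is what makes the correlation functions insensitive to an overall rescaling of the action, that is, to the coupling.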

With this theory, it is possible to show that the
correlation functions are independent of the gauge
coupling and hence we can evaluate them in a small
coupling limit. In this limit, the functional integrals
are dominated by the classical minima of $S$, which
for $A_\mu$ are just the instantons

\[ F_{\mu\nu} = -F^*_{\mu\nu} \qquad [37] \]

We also need $\phi$ and $\lambda$ to vanish for irreducible
connections. If we expand all the fields around the
minima up to quadratic terms and do the resulting
Gaussian integrals, the correlation functions may be
formally evaluated.

A general correlation function of this theory is
given by


\[ \langle P \rangle = \int \mathcal{D}\mathcal{F}\,\exp[-S]\,P(\mathcal{F}) \qquad [38] \]


where $\mathcal{F}$ denotes the collection of fields present in
$S$ and $P(\mathcal{F})$ is some polynomial in the fields.

$S$ has been constructed so that the zero modes in
the expansion about the minima are the tangents to
the moduli space $\mathcal{M}_k$. This suggests doing the $\mathcal{D}\mathcal{F}$
integration as follows: express the integral as an
integral over modes, then integrate out all the
nonzero modes first, leaving a finite-dimensional
integration over the compactified moduli space
$\overline{\mathcal{M}}_k$. The Gaussian integration over the nonzero
modes is a boson-fermion ratio of determinants,
which supersymmetry constrains to be $\pm 1$, bosonic
and fermionic eigenvalues being equal in pairs.

This amounts to writing


\[ \langle P \rangle = \int_{\overline{\mathcal{M}}_k} P_n \qquad [39] \]


where $P_n$ denotes some $n$-form over $\overline{\mathcal{M}}_k$ and
$n = \dim \overline{\mathcal{M}}_k$. If the original polynomial $P(\mathcal{F})$ is
judiciously chosen, then calculation of $\langle P \rangle$ reproduces
evaluation of the Donaldson polynomials $q^M_{d,r}$.


The Seiberg-Witten Equations 


The Seiberg-Witten equations constitute another 
breakthrough in the work on the topology of 
4-manifolds, since they greatly simplify the calcula- 
tion of the data supplied by the Donaldson 
polynomial invariants. We shall discuss this
below, but turn now to the equations themselves.

If we choose an oriented, compact, closed
Riemannian manifold $M$, then the data we need for
the Seiberg-Witten equations are a connection $A$ on
a line bundle $L$ over $M$ and a "local spinor" field $\psi$.
The Seiberg-Witten equations are then


\[ \not\partial_A\psi = 0, \qquad F^+ = -\tfrac12\,\bar\psi\,\Gamma\,\psi \qquad [40] \]


where $\not\partial_A$ is the Dirac operator and $\Gamma$ is made from
the gamma matrices $\Gamma_i$ according to
$\Gamma = \tfrac12[\Gamma_i, \Gamma_j]\,dx^i \wedge dx^j$.

We call $\psi$ a local spinor because global spinors
may not exist on $M$; however, in dimension 4,
orientability guarantees that a $\mathrm{spin}_c$ structure exists
on $M$ (a choice of $\mathrm{spin}_c$ structure on $M$ is an extra
piece of data in the Seiberg-Witten case); $\psi$ is then
the appropriate section for the $\mathrm{spin}_c$ bundle and
behaves locally like a spinor coupled to the U(1)
connection $A$. Let $\mathrm{Spin}_c(M)$ denote the set of
isomorphism classes of $\mathrm{spin}_c$ structures on $M$; then,
for the case $b_2^+ > 1$ (the case $b_2^+ = 1$ has some
technicalities), the Seiberg-Witten invariants determine
a map $SW$ of the form


\[ SW\colon \mathrm{Spin}_c(M) \to \mathbb{Z} \qquad [41] \]


We emphasize that $A$ is just a U(1) abelian
connection and so $F = dA$, with $F^+$ denoting the
self-dual part of $F$.

We shall now have a look at an example of a new
result obtained directly from the Seiberg-Witten
equations. The equations clearly provide the
absolute minima for the action


\[ S = \int_M \Bigl( |\not\partial_A\psi|^2 + \tfrac12\bigl|F^+ + \tfrac12\bar\psi\Gamma\psi\bigr|^2 \Bigr) \qquad [42] \]

If we use a Weitzenböck formula to relate the
Laplacian $\nabla_A^*\nabla_A$ to $\not\partial_A^*\not\partial_A$ plus curvature terms, we
find that $S$ satisfies


\[
\begin{aligned}
\int_M \Bigl( |\not\partial_A\psi|^2 + \tfrac12\bigl|F^+ + \tfrac12\bar\psi\Gamma\psi\bigr|^2 \Bigr)
 &= \int_M \Bigl( |\nabla_A\psi|^2 + \tfrac12|F^+|^2 + \tfrac18|\psi|^4 + \tfrac14 R|\psi|^2 \Bigr) \qquad [43] \\
 &= \int_M \Bigl( |\nabla_A\psi|^2 + \tfrac14|F|^2 + \tfrac18|\psi|^4 + \tfrac14 R|\psi|^2 \Bigr) + \pi^2 c_1^2(L) \qquad [44]
\end{aligned}
\]

where $R$ is the scalar curvature of $M$ and $c_1(L)$ is the
Chern class of $L$.

We notice that the action now looks like one for
monopoles. But now suppose that $R$ is positive and
that the pair $(A,\psi)$ is a solution to the Seiberg-Witten
equations; then the left-hand side (LHS) of
this last expression is zero and all the integrands on
the RHS are non-negative, so the solution must obey $\psi = 0$
and $F^+ = 0$. A technical point is that if $M$ has $b_2^+ > 1$,
then a perturbation of the metric can preserve the
positivity of $R$ but perturb $F^+ = 0$ to be simply $F = 0$,
rendering the connection $A$ flat. Hence, in these
circumstances, the solution $(A,\psi)$ is the trivial one.
This means that we have a new kind of vanishing
theorem in four dimensions.


Theorem (Witten 1994). No 4-manifold with $b_2^+ > 1$
and nontrivial solution to the Seiberg-Witten
equations admits a metric of positive scalar curvature.


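Now, for technical reasons, we assume that the
$q^M_{d,r}$ have the property that


\[ q^M_{d,r+2} = 4\,q^M_{d,r} \qquad [45] \]


A simply connected $M$ with this property is called of
simple type. We also define $\tilde q^M_d$ by writing


\[
\tilde q^M_d =
\begin{cases}
q^M_{d,0} & \text{if } d = (b_2^+ + 1) \bmod 2 \\
\tfrac12\,q^M_{d,1} & \text{if } d = b_2^+ \bmod 2
\end{cases}
\qquad [46]
\]


The generating function $G_M(\alpha)$ is now given by

\[ G_M(\alpha) = \sum_{d=0}^{\infty} \frac{1}{d!}\,\tilde q^M_d(\alpha) \qquad [47] \]


According to Kronheimer and Mrowka (1994),
$G_M(\alpha)$ can be expressed in terms of a finite number
of classes (known as basic classes) $\kappa_i$ ($\kappa_i \in H^2(M)$)
with rational coefficients $a_i$ (the Seiberg-Witten
invariants) resulting in the formula


\[ G_M(\alpha) = \exp[\alpha\cdot\alpha/2]\,\sum_i a_i \exp[\kappa_i\cdot\alpha] \qquad [48] \]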


Hence, for $M$ of simple type, the polynomial
invariants are determined by a (finite) number of
basic classes and the Seiberg-Witten invariants.

Returning now to the physics, we find that the
quantum field theory approach to the polynomial
invariants relates them to properties of the moduli
space for the Seiberg-Witten equations rather than
to properties of the instanton moduli space $\mathcal{M}_k$.

The moduli space for the Seiberg-Witten equations,
unlike the instanton case, is compact and
generically has dimension


\[ \frac{c_1^2(L) - 2\chi(M) - 3\sigma(M)}{4} \qquad [49] \]


392 Four-Manifold Invariants and Physics 


$\chi(M)$ and $\sigma(M)$ being the Euler characteristic and
signature of $M$, respectively. When


\[ c_1^2(L) = 2\chi(M) + 3\sigma(M) \qquad [50] \]


we get a zero-dimensional moduli space consisting
of a finite collection of points


\[ \{P_1, \ldots, P_N\} \qquad [51] \]
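
As a quick numerical illustration of the dimension formula and the zero-dimensionality condition, the sketch below evaluates $(c_1^2(L) - 2\chi(M) - 3\sigma(M))/4$ for the K3 surface. The values $\chi = 24$ and $\sigma = -16$ for K3 are standard but are not quoted in the text, and the function name is hypothetical.

```python
from fractions import Fraction


def sw_moduli_dim(c1_squared, euler_char, signature):
    """Generic dimension (c1^2 - 2*chi - 3*sigma) / 4 of the Seiberg-Witten moduli space."""
    return Fraction(c1_squared - 2 * euler_char - 3 * signature, 4)


K3_EULER, K3_SIGNATURE = 24, -16  # standard values, assumed here

# 2*chi + 3*sigma = 48 - 48 = 0, so c1^2(L) = 0 gives a zero-dimensional space.
print(sw_moduli_dim(0, K3_EULER, K3_SIGNATURE))
print(sw_moduli_dim(4, K3_EULER, K3_SIGNATURE))
```

The function is pure arithmetic; it says nothing about which classes $c_1(L)$ actually carry solutions.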


Now each point $P_i$ has a sign $\epsilon_i = \pm 1$ associated
with it, coming from the sign of the determinant of
the elliptic operator whose index gave the dimension of
the moduli space. The sum of these signs is an
integer topological invariant denoted by $n_L$, that is,


\[ n_L = \sum_{i=1}^{N} \epsilon_i \qquad [52] \]

Returning now to our formula for $G_M(\alpha)$, one
finds that


\[ G_M(\alpha) = 2^{p(M)}\exp[\alpha\cdot\alpha/2]\,\sum_L n_L \exp[c_1(L)\cdot\alpha] \qquad [53] \]

where

\[ p(M) = 1 + \tfrac14\bigl(7\chi(M) + 11\sigma(M)\bigr) \qquad [54] \]


and the sum over $L$ on the RHS of the formula is
over line bundles $L$ that satisfy


\[ c_1^2(L) = 2\chi(M) + 3\sigma(M) \qquad [55] \]


that is, it is a sum over $L$ with zero-dimensional
Seiberg-Witten moduli spaces.

Comparison of the two formulas for $G_M(\alpha)$, the
first mathematical in origin and the second physical,
allows one to identify the Seiberg-Witten
invariants $a_i$ and the Kronheimer-Mrowka basic
classes $\kappa_i$: the $\kappa_i$ are the $c_1(L)$'s, and the $a_i$ are the
corresponding $n_L$ up to the overall factor $2^{p(M)}$.

The results described thus far are for simply
connected 4-manifolds, but this condition is not
obligatory and there is also a theory in the
non-simply-connected case (Mariño and Moore 1999).

The physics underlying these topological results is 
of great importance since many of the ideas 
originate there. It is known that the computation 
of the Donaldson invariants there uses the fact that 
the $N = 2$ gauge theory is asymptotically free. This
means that the ultraviolet limit being one of weak 
coupling is tractable. However, the less tractable 
infrared or strong-coupling limit would do just as 
well to calculate the Donaldson invariants since 
these latter are metric independent. 

In Seiberg and Witten's work, this infrared 
behavior is actually determined and it is found 
that, in the strong-coupling infrared limit, the theory 
is equivalent to a weakly coupled theory of abelian 
fields and monopoles. There is also a duality 


between the original theory and the theory with 
monopoles which is expressed by the fact that the 
(abelian) gauge group of the monopole theory is the 
dual of the maximal torus of the group of the 
nonabelian theory. 

We recall that the Yang-Mills gauge group in this 
discussion is SU(2). Seiberg and Witten’s results 
mean the replacement of SU(2) instantons used to 
compute the Donaldson invariants by the counting 
of U(1) monopoles. This calculation of the nonabelian
Donaldson data by abelian Seiberg-Witten data is much
like the representation theory of a nonabelian Lie
group G, where everything is determined by an abelian
object: the maximal torus.

The theory considered by Seiberg and Witten 
possesses a collection of quantum vacua labeled by a 
complex parameter u which turns out to parametrize a 
family of elliptic curves. A central part is played by a 
function $\tau(u)$ on which there is a modular action of
SL(2, Z). The successful determination of the infrared 
limit involves an electric-magnetic duality and the 
whole matter is of very considerable independent 
interest for quantum field theory, quark confinement, 
and string theory in general. 


Seiberg-Witten Theory and Exotic 
Structures on 4-Manifolds 


We saw earlier that, when $\dim M \neq 4$, a manifold
may possess a finite number of differentiable
structures, $S^7$ having 28 distinct smooth structures.
However, in dimension 4, Seiberg-Witten theory has 
been used to show that there are many 4-manifolds 
with a countable infinity of smooth structures. We 
just mention two: the K3 surface has infinitely 
many smooth structures as does the manifold 
CP^45CP . This is another instance of how dimen- 
sion 4 differs from all other dimensions. This infinite 
variety of exotic smooth structures in four dimen- 
sions is also of great interest to physics. 

An outstanding four-dimensional matter is still the
smooth Poincaré conjecture, which asks whether a
smooth 4-manifold M homotopy equivalent to $S^4$ is
diffeomorphic to $S^4$. Such an M is certainly homeomorphic
to $S^4$, because this is the standard Poincaré conjecture
proved by Freedman, and if the answer to this question
is yes then $S^4$ would be an example of a 4-manifold
with no exotic smooth structures. There is at present
no consensus on the answer to this question.


Exotic Structures on Open 4-Manifolds 


If M is an open manifold, that is, a noncompact
manifold without boundary, and $M = \mathbb{R}^n$ then, for
$n \neq 4$, there is only one smooth structure; but for
$n = 4$, there are exotic differentiable structures on
$\mathbb{R}^4$. In fact, Gompf showed that there is a continuum
of exotic differentiable structures that can be placed
on $\mathbb{R}^4$.


Symplectic and Kähler 4-Manifolds


Many 4-manifolds are symplectic, and symplectic 
manifolds are central in physics; there are many results 
obtained using Seiberg—Witten theory concerning the 
topology and geometry of symplectic manifolds. The 
exotic K3 structures referred to above are all symplec- 
tic and so there is no shortage of symplectic structures 
even within one homeomorphism class. Taubes 
obtained far-reaching new results for symplectic 
4-manifolds including establishing an equivalence 
between the Seiberg—Witten invariants in the symplec- 
tic case and the Gromov invariants. 

Kähler manifolds possess, simultaneously, compatible
Riemannian, symplectic, and complex structures and,
beginning with Witten's work, there are
many results to be found for Kähler 4-manifolds
using Seiberg-Witten techniques.


4-Manifolds with Boundary 


There is a very important extension of the
Donaldson-Seiberg-Witten theory to 4-manifolds M with
boundary $\partial M = N$. When $\partial M \neq \emptyset$, the Donaldson invariants
are not numerical invariants but take values in HF(N)
where HF(N) denotes what is called the Floer 
homology of the 3-manifold N. Topological quantum 
field theory is the ideal setting for this theory since it 
naturally treats manifolds with boundaries. The Floer 
homology groups HF(N) act as Hilbert spaces for the 
quantum fields defined on the boundary. There is now 
a full interplay of 4-manifold theory and 3-manifold 
theory as well as Yang-Mills theory in three and four 
dimensions. This interplay is often realized by taking 
two 4-manifolds $M_1$ and $M_2$ with the same boundary
N and joining them along N to obtain a closed
4-manifold M so that


$$M = M_1 \cup_N M_2 \qquad [56]$$


Given a 3-manifold N, and an SU(2) connection 
A, Floer studied the critical points of the Chern- 
Simons function f(A) defined by 


$$f(A) = \frac{1}{8\pi^2} \int_N \mathrm{tr}\left(A \wedge dA + \tfrac{2}{3}\, A \wedge A \wedge A\right) \qquad [57]$$

where f(A) is regarded as a function on the infinite- 
dimensional space A of connections. The function f(A) 
changes by an integer under a gauge transformation 
and so descends to a single-valued gauge-invariant 




function on the space of gauge orbits $\mathcal{A}/\mathcal{G}$ if one
considers $\exp(2\pi i k f(A))$ where $k \in \mathbb{Z}$ ($\mathcal{G}$ being the
group of gauge transformations). Morse theory
applied to this infinite-dimensional setting gives an 
infinite Morse index to each critical point, a pathology 
which is avoided by only defining the difference of the 
index between two critical points using spectral flow. 
The critical points correspond, via gradient flow and a 
consideration of the instanton equations 


on the 4-manifold N x R, to the flat connections on 
the 3-manifold N. The latter are identifiable as the
set of (equivalence classes of) representations of the
fundamental group $\pi_1(N)$ in the gauge group SU(2),
that is, with


$$\mathrm{Hom}(\pi_1(N), SU(2)) / \mathrm{Ad}\, SU(2) \qquad [59]$$


For the Seiberg-Witten formulation, let A denote a 
connection on the 3-manifold N with curvature 
F(A). Then the Chern-Simons function f(A) is 
replaced by the abelian Chern-Simons function 
together with a quadratic fermion term resulting in 
the function $f^{SW}(A)$, defined by


$$f^{SW}(A) = \int_N \left(\bar\psi D_A \psi + A \wedge F(A)\right) \qquad [60]$$


where $D_A$ denotes the self-adjoint Dirac operator in
three dimensions acting on a spinor $\psi$ on N; because
of the presence of the Chern-Simons function,
$f^{SW}(A)$ is only defined up to a multiple of $8\pi^2$ in a
manner similar to the case for f(A). Gradient flow
together with the Seiberg-Witten equations on the
4-manifold $N \times \mathbb{R}$ results in critical points
corresponding to the solutions to


$$D_A \psi = 0,$$


which is a three-dimensional version of the Seiberg- 
Witten equations. 

The critical point theory of these two functions $f(A)$
and $f^{SW}(A)$ permits the construction of the instanton
and Seiberg-Witten Floer homology groups $HF^{\mathrm{inst}}(N)$
and $HF^{SW}(N)$, respectively. In fact, there are several
kinds of Floer homology (Lagrangian Floer homology,
instanton Floer homology, Heegaard-Floer homology, and
Seiberg-Witten-Floer homology), as well as conjectures
concerning their relations to one another.

There are still many unanswered questions of 
joint interest to mathematicians and physicists in the 
entire area of 4-manifold theory. 


F(A) = -Jo [61] 


See also: Electric-Magnetic Duality; Gauge Theoretic 
Invariants of 4-Manifolds; Floer Homology; Topological 
Quantum Field Theory: Overview. 
Introduction


Since the 1970s, dimension theory for dynamics has 
evolved into an independent field of mathematics. 
Its main goal is to measure complexity of invariant 
sets and measures using fractal dimensions. The 
history of fractal dimensions is closely related to 
the names of H Minkowski (Minkowski content,
1903), H Hausdorff (Hausdorff dimension,
1919), G Bouligand (Bouligand dimension, 1928),
LS Pontryagin and LG Schnirelmann (metric order,
1932), P Moran (Moran geometric constructions,
1946), AS Besicovitch and SJ Taylor
(Besicovitch-Taylor index, 1954), A Rényi (Rényi spectrum
for dimensions, 1957), AN Kolmogorov and
VM Tihomirov (metric dimension, Kolmogorov
complexity, 1959), YaG Sinai, D Ruelle, R Bowen
(thermodynamic formalism, Bowen’s equation, 
1972, 1973, 1979), B Mandelbrot (fractals and 
multifractals, 1974), JL Kaplan and JA Yorke 
(Lyapunov dimension, 1979), JE Hutchinson (frac- 
tals and self-similarity, 1981), C Tricot, D Sullivan 
(packing dimension, 1982, 1984), HGE Hentschel 
and I Procaccia (Hentschel-Procaccia spectrum for 
dimensions, 1983), Ya Pesin (Carathéodory-Pesin 
dimension, 1988), M Lapidus and M van Franken- 
huysen (complex dimensions for fractal strings, 
2000), etc. Fractal dimensions enable us to have a 
better insight into the dynamics appearing in various 
problems in physics, engineering, chemistry, medi- 
cine, geology, meteorology, ecology, economics, 
computer science, image processing, and, of course, 
in many branches of mathematics. Concentrating on 
box and Hausdorff dimensions only, we describe 
basic methods of fractal analysis in dynamics, sketch 
their applications, and indicate some trends in this 
rapidly growing field. 


Fractal Dimensions 
Box Dimensions 


Let A be a bounded set in $\mathbb{R}^N$, and let $d(x, A)$ be the
Euclidean distance from x to A. The Minkowski
sausage of radius $\varepsilon$ around A (a term coined by
B Mandelbrot) is defined as the $\varepsilon$-neighborhood of A,
that is, $A_\varepsilon := \{y \in \mathbb{R}^N : d(y, A) < \varepsilon\}$. By the upper
s-dimensional Minkowski content of A, $s \ge 0$,
we mean

$$\mathcal{M}^{*s}(A) := \limsup_{\varepsilon \to 0} \frac{|A_\varepsilon|}{\varepsilon^{N-s}}$$

Here $|\cdot|$ denotes N-dimensional Lebesgue measure.
The corresponding upper box dimension is defined by

$$\overline{\dim}_B A := \inf\{s \ge 0 : \mathcal{M}^{*s}(A) = 0\}$$

The lower s-dimensional Minkowski content $\mathcal{M}_*^s(A)$
and the corresponding lower box dimension $\underline{\dim}_B A$
are defined analogously. The name box dimension
stems from the following: if we have an $\varepsilon$-grid
in $\mathbb{R}^N$ composed of closed N-dimensional boxes
with side $\varepsilon$, and if $N(A, \varepsilon)$ is the number of boxes of
the grid intersecting A, then

$$\overline{\dim}_B A = \limsup_{\varepsilon \to 0} \frac{\log N(A, \varepsilon)}{\log(1/\varepsilon)}$$

and analogously for $\underline{\dim}_B A$. It suffices to take any
geometric subsequence $\varepsilon_k = b^{-k}$ in the limit, where
$b > 1$ (H Furstenberg, 1970). There are many other
names for the upper box dimension appearing in the 
literature, like the Cantor-Minkowski order, Min- 
kowski dimension, Bouligand dimension, Borel 
logarithmic rarefaction, Besicovitch-Taylor index, 
entropy dimension, Kolmogorov dimension, fractal 
dimension, capacity dimension, and limit capacity. 
If A is such that $\underline{\dim}_B A = \overline{\dim}_B A$, the common value
is denoted by $d := \dim_B A$, and we call it the box
dimension of A. If, in addition to this, both $\mathcal{M}_*^d(A)$
and $\mathcal{M}^{*d}(A)$ are in $(0, \infty)$, we say that A is
Minkowski nondegenerate. If, moreover, $\mathcal{M}_*^d(A) =
\mathcal{M}^{*d}(A) =: \mathcal{M}^d(A) \in (0, \infty)$, then A is said to be
Minkowski measurable.
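The box-counting formula above lends itself to a direct numerical check. The following Python sketch counts grid boxes of side $3^{-k}$ (a geometric sequence of scales, as permitted above) that meet a finite sample of the middle-third Cantor set, and fits the slope of $\log N(A, \varepsilon)$ against $\log(1/\varepsilon)$. The choice of set, the sampling by left endpoints, the half-open boxes, and all function names are assumptions made for illustration only.

```python
import math


def cantor_left_endpoints(level):
    """Integers m such that m / 3**level is a left endpoint of an interval
    in the level-`level` approximation of the middle-third Cantor set."""
    points = [0]
    for _ in range(level):
        points = [3 * m for m in points] + [3 * m + 2 for m in points]
    return points


def box_count(points, level, k):
    """Number of half-open grid boxes of side 3**-k containing a sample point."""
    scale = 3 ** (level - k)
    return len({m // scale for m in points})


def box_dimension_estimate(level=10, ks=(2, 3, 4, 5, 6, 7, 8)):
    """Least-squares slope of log N(A, eps) versus log(1/eps), eps = 3**-k."""
    points = cantor_left_endpoints(level)
    xs = [k * math.log(3.0) for k in ks]
    ys = [math.log(box_count(points, level, k)) for k in ks]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    denominator = sum((x - x_mean) ** 2 for x in xs)
    return numerator / denominator
```

For this particular set the box counts are exactly $2^k$, so the fitted slope agrees with $(\log 2)/(\log 3)$ up to rounding; for sets sampled from experimental data the slope is only an estimate and depends on the range of scales used.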

Assume that A is such that $d := \dim_B A$ and
$\mathcal{M}^d(A)$ exist. Then the value of $\mathcal{M}^d(A)^{-1}$ is called
the lacunarity of A (B Mandelbrot, 1982). A
bounded set $A \subset \mathbb{R}^N$ is said to be porous (A Denjoy,
1920) if there exist $\alpha > 0$ and $\delta > 0$ such that
for every $x \in A$ and $r \in (0, \delta)$ there is $y \in \mathbb{R}^N$ such
that the open ball $B_{\alpha r}(y)$ is contained in $B_r(x) \setminus A$.
If A is porous then it is easy to see that $\overline{\dim}_B A < N$
(O Martio and M Vuorinen, 1987, A Salli, 1991).

We proceed with two examples. Let
$A := C^{(a)}$, $a \in (0, 1/2)$, be the Cantor set obtained
from $[0, 1]$ by consecutive deletion of $2^k$ middle
open intervals of length $a^k(1 - 2a)$ in step $k \in \mathbb{N} \cup \{0\}$.
Then $\dim_B A = (\log 2)/(\log(1/a))$ (G Bouligand,
1928), and A is nondegenerate, but not Minkowski
measurable (Lapidus and Pomerance, 1993). For
the spiral $\Gamma$ of focus type defined by $r = m\varphi^{-\alpha}$ in
polar coordinates, where $\alpha \in (0, 1)$ and $m > 0$ are
fixed, $\varphi \ge \varphi_1 > 0$, we have $\dim_B \Gamma = 2/(1 + \alpha)$
(Y Dupain, M Mendès-France, C Tricot, 1983). It
is Minkowski measurable (Žubrinić and Županović,
2005), and the larger m, the smaller the lacunarity;
see Figure 1.

Figure 1 Spirals of equal box dimensions (4/3) and different
lacunarities (0.43 and 0.05).


Hausdorff Dimension 


For a given subset A of $\mathbb{R}^N$ (not necessarily
bounded) and $s \ge 0$ we define
$\mathcal{H}^s(A) := \lim_{\varepsilon \to 0} \inf \{\sum_{i=1}^{\infty} r_i^s\} \in [0, \infty]$,
where the infimum is taken
over all finite or countable coverings of A by open
balls of radii $r_i < \varepsilon$. The value of $\mathcal{H}^s(A)$ is called the
s-dimensional Hausdorff outer measure of A. The
Hausdorff dimension of A, sometimes called the
Hausdorff-Besicovitch dimension, is defined by


$$\dim_H A := \inf\{s \ge 0 : \mathcal{H}^s(A) = 0\}$$


If A is bounded then $\dim_H A \le \underline{\dim}_B A \le \overline{\dim}_B A \le N$.

We say that A is Hausdorff nondegenerate (or a d-set)
if $\mathcal{H}^d(A) \in (0, \infty)$ for some $d \ge 0$. Cantor sets share
this property, and $\dim_H C^{(a)} = (\log 2)/(\log(1/a))$,
where $a \in (0, 1/2)$ (Hausdorff, 1919).


Gauge Functions 


The notions of Minkowski contents and Hausdorff
measure can be generalized using gauge functions
$h : [0, \varepsilon_0) \to \mathbb{R}$ that are assumed to be continuous,
increasing, and such that $h(0) = 0$. For example,


$$\mathcal{M}^{*h}(A) := \limsup_{\varepsilon \to 0} \frac{|A_\varepsilon|}{\varepsilon^N}\, h(\varepsilon)$$


and similarly for $\mathcal{M}_*^h(A)$ (M Lapidus and C He,
1997), while for $\mathcal{H}^h(A)$ it suffices to replace $r_i^s$ with
$h(r_i)$ in the above definition of the Hausdorff outer
measure (Besicovitch, 1934). Gauge functions are
used for sets that are Minkowski or Hausdorff
degenerate. The aim, if possible, is to find an explicit
gauge function so that the corresponding generalized
Minkowski contents or Hausdorff measure of A are
nondegenerate.


Methods of Fractal Analysis in Dynamics 
Thermodynamic Formalism 


Thermodynamic formalism has been developed by
Sinai (1972), Ruelle (1973), and Bowen (1975),
using methods of statistical mechanics in order to
study dynamics and to find dimensions of various
fractal sets. We first describe a "dictionary" for
explicit geometric constructions of Cantor-like sets.
Let $X_p$ be the set of all sequences $i = (i_1, i_2, \ldots)$ of
elements $i_k$ from a given set of p symbols, say
$\{1, 2, \ldots, p\}$. We endow $X_p$ with the metric
$d(i, j) := \sum_k 2^{-k} |i_k - j_k|$ and introduce the one-sided shift
operator (or left shift) $\sigma : X_p \to X_p$ defined by
$(\sigma(i))_n := i_{n+1}$, that is, $\sigma(i_1, i_2, i_3, \ldots) = (i_2, i_3, i_4, \ldots)$.
A set $Q \subset X_p$ is called the symbolic dynamics if it is
compact and $\sigma$-invariant, that is, $\sigma(Q) \subset Q$. Hence,
$(Q, \sigma)$ is a symbolic dynamical system. Denote
$i|n := (i_1, \ldots, i_n)$. Given a continuous function
$\varphi : Q \to \mathbb{R}$, let us define the topological pressure of
$\varphi$ with respect to $\sigma$ by


$$P(\varphi) := \lim_{n \to \infty} \frac{1}{n} \log \sum_{\{i|n \,:\, i \in Q\}} \exp(E_n(i)), \qquad
E_n(i) := \sup_{\{j \in Q \,:\, j|n = i|n\}} \sum_{k=0}^{n-1} \varphi(\sigma^k(j))$$


The topological entropy of $\sigma|Q$ is defined by
$h(\sigma|Q) := P(0)$, that is,


$$h(\sigma|Q) = \lim_{n \to \infty} \frac{1}{n} \log \#\{i|n : i \in Q\}$$


where # denotes the cardinal number of a set. The
above function $\varphi_n := \sum_{k=0}^{n-1} \varphi \circ \sigma^k$ has the property
$\varphi_{n+m} = \varphi_n + \varphi_m \circ \sigma^n$, and therefore we speak about
additive thermodynamic formalism. Topological pressure
was introduced by D Ruelle (1973) and extended
by P Walters (1976). Bowen's equation (1979) has a
very important role in the computation of the
Hausdorff dimension of various sets. For the unknown
$s \in \mathbb{R}$, and with a suitably chosen function $\varphi$, this
equation reads


$$P(s\varphi) = 0$$


Geometric Constructions 


A geometric construction $(Q, \mathcal{A})$ in $\mathbb{R}^N$ indexed by
symbolic dynamics Q is a family $\mathcal{A}$ of compact sets
$A_{i|n} \subset \mathbb{R}^N$, $i \in Q$, $n \in \mathbb{N}$, such that $\mathrm{diam}\, A_{i|n} \to 0$ as
$n \to \infty$, $A_{i|n+1} \subset A_{i|n}$, $A_{i|n} = \overline{\mathrm{int}\, A_{i|n}}$ for every $i \in Q$
and all n, and $\mathrm{int}\, A_{i|n} \cap \mathrm{int}\, A_{j|n} = \emptyset$ whenever
$i|n \neq j|n$ (Moran's open set condition). This family
induces the Cantor-like set


$$F = \bigcap_{n=1}^{\infty} \Big( \bigcup_{i \in Q} A_{i|n} \Big)$$


(see Figure 2). The mapping $h : Q \to F$ defined by
$h(i) := \bigcap_{n=1}^{\infty} A_{i|n}$ is called the coding map of F. The
above geometric construction includes well-known
iterated function systems of similarities as a special
case. If $\lambda_1, \ldots, \lambda_p$ are given numbers in $(0, 1)$, and $A_{i|n}$
are balls of radii $r_{i|n} := \lambda_{i_1} \cdots \lambda_{i_n}$, then $s := \dim_H F$ is
the unique solution of Bowen's equation $P(s\varphi) = 0$,
where $\varphi$ is defined by $\varphi(i) := \log \lambda_{i_1}$ (Ya Pesin and
H Weiss, 1996). In this case Bowen's equation is
equivalent to Moran's equation (1946),


$$\sum_{k=1}^{p} \lambda_k^s = 1$$

Figure 2 Cantor-like set.


This result has been generalized by L Barreira (1996)
using the Carathéodory-Pesin construction (1988).
Let us illustrate Barreira's theory of nonadditive
thermodynamic formalism with a special case.
Assume that $(Q, \mathcal{A})$ is a geometric construction for
which the sets $A_{i|n}$ are balls, and let there exist $\delta > 0$
such that $r_{i|n+1} \ge \delta \cdot r_{i|n}$ and $r_{i|(n+m)} \le r_{i|n}\, r_{\sigma^n(i)|m}$ for
all $i \in Q$, $n, m \in \mathbb{N}$. Then $\dim_H F = \dim_B F = s$, where
s is the unique real number such that


$$\lim_{n \to \infty} \frac{1}{n} \log \sum_{\{i|n \,:\, i \in Q\}} r_{i|n}^s = 0 \qquad [1]$$


This is a special case of Barreira's extension of
Bowen's equation to nonadditive thermodynamic
formalism. Moran's equation can be deduced from
[1] by defining $r_{i|n} := \lambda_{i_1} \cdots \lambda_{i_n}$, where $i = (i_1, i_2, \ldots)$,
and $\lambda_1, \ldots, \lambda_p \in (0, 1)$ are given numbers. Pesin and
Weiss (1996) showed that Moran's open set condition
can be weakened so that partial intersections of
interiors of pairs of basic sets in the family $\mathcal{A}$ are
allowed. Thermodynamic formalism has been used
to study the Hausdorff dimension of Julia sets
(Ruelle, 1982), horseshoes (H McCluskey and
A Manning, 1983), etc.

An important example of symbolic dynamics is the
topological Markov chain $X_A$ generated by a $p \times p$
matrix A with entries $a_{ij} \in \{0, 1\}$:


$$X_A := \{i = (i_1, i_2, \ldots) \in X_p : a_{i_k i_{k+1}} = 1 \text{ for all } k \in \mathbb{N}\}$$


It is a compact, $\sigma$-invariant subset of $X_p$. The map
$\sigma|X_A$ is called the subshift of finite type (Bowen,
1975). A construction of a Cantor-like set F using
dynamics $Q = X_p$ is called a simple geometric
construction, while a geometric construction is said to be a
Markov geometric construction if $Q = X_A$. If F is
obtained by a Markov geometric construction such
that all $A_{i|n}$ are balls of radii $r_{i|n} := \lambda_{i_1} \cdots \lambda_{i_n}$, where
$\lambda_i \in (0, 1)$, $i \in \{1, \ldots, p\}$, then $\dim_B F = \dim_H F = s$,
where s is the unique solution of the equation
$\rho(A M_s) = 1$. Here $M_s := \mathrm{diag}(\lambda_1^s, \ldots, \lambda_p^s)$ and
$\rho(A M_s)$ is the spectral radius of the matrix $A M_s$. This
and more general results have been obtained by Pesin
and Weiss (1996).

Any Cantor-like set F obtained via an iterated
function system of similarities satisfying Moran's
open set condition is Hausdorff nondegenerate
(Moran, 1946). If F is of nonlattice type, that is,
the set $\{\log \lambda_1, \ldots, \log \lambda_p\}$ is not contained in $r \cdot \mathbb{Z}$
for any $r > 0$, then F is Minkowski measurable
(D Gatzouras, 1999).


Hyperbolic Measures 


Let X be a complete metric space and assume that
$f : X \to X$ is continuous. Let $\mu$ be an f-invariant Borel
probability measure on X (i.e., $\mu(f^{-1}(A)) = \mu(A)$
for measurable sets A) with a compact support.
The Hausdorff dimension of $\mu$, and the lower and
upper box dimensions of $\mu$ (L-S Young, 1982) are
defined by


$$\dim_H \mu := \inf\{\dim_H Z : Z \subset X,\ \mu(Z) = 1\}$$

$$\underline{\dim}_B \mu := \lim_{\delta \to 0} \inf\{\underline{\dim}_B Z : Z \subset X,\ \mu(Z) \ge 1 - \delta\}$$

$$\overline{\dim}_B \mu := \lim_{\delta \to 0} \inf\{\overline{\dim}_B Z : Z \subset X,\ \mu(Z) \ge 1 - \delta\}$$


It is natural to introduce the lower and upper
pointwise dimensions of $\mu$ at $x \in X$ by


$$\underline{d}_\mu(x) := \liminf_{r \to 0} \frac{\log \mu(B_r(x))}{\log r}$$


and similarly $\overline{d}_\mu(x)$. It has been shown by Young
(1982) that if X has finite topological dimension and
if $\mu$ is exact dimensional, that is, $\underline{d}_\mu(x) = \overline{d}_\mu(x) =: d$
for $\mu$-a.e. $x \in X$, then


$$\dim_H \mu = \underline{\dim}_B \mu = \overline{\dim}_B \mu = d$$

She also proved that hyperbolic measures (ergodic
measures with nonzero Lyapunov exponents),
invariant under a $C^{1+\alpha}$-diffeomorphism, $\alpha > 0$, are exact
dimensional. F Ledrappier (1986) derived exact
dimensionality for hyperbolic Bowen-Ruelle-Sinai
measures. This result was extended by Ya Pesin and
Ch Yue (1996) to hyperbolic measures with semilocal
product structure. J-P Eckmann and D Ruelle (1985)
conjectured that exact dimensionality holds for
general hyperbolic measures, and this was proved by
Barreira, Pesin, and Schmeling (1996). More precisely,
if f is a $C^{1+\alpha}$-diffeomorphism on a smooth Riemannian
manifold X without boundary, and if $\mu$ is an f-invariant,
compactly supported Borel probability measure, then
its hyperbolicity implies that


$$\underline{d}_\mu(x) = \overline{d}_\mu(x) = d_\mu^s(x) + d_\mu^u(x)$$


for $\mu$-a.e. $x \in X$, where $d_\mu^s(x)$ and $d_\mu^u(x)$ are the stable
and unstable pointwise dimensions of $\mu$ at x
introduced by Ledrappier and Young (1985).


Multifractal Analysis of Functions and Measures 


Invariant sets of many dynamical systems are not
self-similar. Roughly speaking, the aim of
multifractal analysis is to make a decomposition of the
invariant set with respect to desired fractal
properties and then to study a fractal dimension of each
set of the decomposition. Some dynamical systems
have invariant sets equal to graphs of Hölderian
functions $f : \mathbb{R}^N \to \mathbb{R}$, so that wavelet methods can
be used. One of the goals of multifractal analysis
of functions is to study the spectrum of singularities
of f defined by

$$d_f(\alpha) := \dim_H H_\alpha(f)$$

introduced by U Frisch and G Parisi (1985) in the
context of fully developed turbulence. Here $H_\alpha(f)$ is
the set of points at which the corresponding
pointwise Hölder exponent of f is equal to $\alpha > 0$.
If the function f is self-similar then $d_f(\alpha)$ is real
analytic and strictly concave (first increasing and
then decreasing) on an explicit interval $(\underline{\alpha}, \overline{\alpha})$
(S Jaffard, 1997). It is natural to consider the set
$C_{\alpha,\beta}(f)$ of points $x_0$ called chirps of order $(\alpha, \beta)$
(Y Meyer 1996), at which f behaves roughly
like $|x - x_0|^\alpha \sin(1/|x - x_0|^\beta)$, $\beta > 0$. The function
$D_f(\alpha, \beta) := \dim_H C_{\alpha,\beta}(f)$ is called the chirp spectrum
of f (S Jaffard 2000). Wavelet methods have found
applications in the study of evolution equations
and in modeling and detection of chirps in
turbulent flows (S Jaffard, Y Meyer, RD Robert,
2001).

Basic ideas of multifractal analysis have been
introduced by physicists T Halsey, MH Jensen,
LP Kadanoff, I Procaccia, and BI Shraiman (1988).
In applications it often deals with an invariant
ergodic probability measure associated with the
dynamical system considered. Multifractal analysis
of a Borel finite measure $\mu$ defined on $\mathbb{R}^N$ consists in
the study of the function


$$d_\mu(\alpha) := \dim_H K_\alpha(\mu), \quad \alpha \ge 0$$


called the spectrum of pointwise dimensions of $\mu$.
Here $K_\alpha(\mu)$ is the set of points where the pointwise
dimension of $\mu$ is equal to $\alpha$:


$$K_\alpha(\mu) := \{x \in \mathbb{R}^N : \underline{d}_\mu(x) = \overline{d}_\mu(x) = \alpha\}$$


It is also of interest to study the Hausdorff
dimension of the irregular set
$K(\mu) := \{x \in \mathbb{R}^N : \underline{d}_\mu(x) < \overline{d}_\mu(x)\}$.
These sets are pairwise disjoint and constitute
a multifractal decomposition of $\mathbb{R}^N$, that is,


$$\mathbb{R}^N = K(\mu) \cup \Big( \bigcup_{\alpha \in \mathbb{R}} K_\alpha(\mu) \Big)$$


The function $d_\mu(\alpha)$ provides important information
about the complexity of the multifractal
decomposition. In many situations, there is an open
interval $(\underline{\alpha}, \overline{\alpha})$ on which the function $d_\mu(\alpha)$ is
analytic and strictly concave (first increasing and
then decreasing), and equal to the Legendre
transform of an explicit convex function. We thus obtain
an uncountable family of sets $K_\alpha(\mu)$ with positive
Hausdorff dimension, which shows the enormous
complexity of the multifractal decomposition of $\mathbb{R}^N$.
These and related questions have been studied by
L Olsen (1995), K Falconer (1996), Pesin and Weiss
(1996), Barreira and Schmeling (2000), and many
other authors.


Local Lyapunov Dimension 


Let $\Omega$ be an open set in $\mathbb{R}^N$ and let $f : \Omega \to \mathbb{R}^N$ be a
$C^1$-map. To any fixed $x \in \Omega$ we assign N singular
values $a_1 \ge a_2 \ge \cdots \ge a_N \ge 0$ of f, defined as square
roots of eigenvalues of the matrix $f'(x)^T \cdot f'(x)$,
where $f'(x)$ is the Jacobian of f at x, and $f'(x)^T$ its
transpose. The local Lyapunov dimension of f at x is
defined by

$$\dim_L(f, x) := j + s$$

where j is the largest integer in $[0, N]$ such that
$a_1 \cdots a_j \ge 1$ (if there is no such j we let $j = 0$), and
$s \in [0, 1)$ is the unique solution of $a_1 \cdots a_j a_{j+1}^s = 1$
(except for $j = N$, when we define $s = 0$). This
definition, due to B R Hunt (1996), is close to that of
Kaplan and Yorke (1979). The Jacobian $f'(x)$
contracts k-dimensional volumes (that is, $a_1 \cdots a_k < 1$) if
and only if $\dim_L(f, x) < k$. In this case, we say that f is
k-contracting at x. Furthermore, the function
$x \mapsto \dim_L(f, x)$ is upper-semicontinuous, so that for
any compact subset A of $\Omega$ the Lyapunov dimension
of f on A,

$$\dim_L(f, A) := \max_{x \in A} \dim_L(f, x)$$

is well defined. Yu S Ilyashenko conjectured that if f
locally contracts k-dimensional volumes then the
upper box dimension of any compact invariant set is
$< k$. Hunt (1996) proved that if A is a compact,
strictly invariant set of f (i.e., $f(A) = A$) then


$$\overline{\dim}_B A \le \dim_L(f, A) \qquad [2]$$


This is an improvement of $\dim_H A \le \dim_L(f, A)$
obtained by A Douady and J Oesterlé (1980), and
independently by Ilyashenko (1982). MA Blinchevskaya
and Yu S Ilyashenko (1999) proved that if A is any
attractor of a smooth map in a Hilbert space that
contracts k-dimensional volumes then $\overline{\dim}_B A \le k$. See
[3] below.
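The definition of $\dim_L(f, x) = j + s$ is easy to evaluate once the singular values are known. The Python sketch below follows the definition literally for a list of singular values, and, as an illustration, computes the singular values of the Jacobian of the planar map $(x, y) \mapsto (a + by - x^2, x)$ discussed later in this article. The function names, the closed-form $2 \times 2$ eigenvalue computation, and the convention of returning j when $a_{j+1} = 0$ are assumptions of this sketch.

```python
import math


def lyapunov_dimension(singular_values):
    """Local Lyapunov dimension j + s from singular values a_1 >= ... >= a_N."""
    a = sorted(singular_values, reverse=True)
    j, product_j = 0, 1.0
    running = 1.0
    for k, value in enumerate(a, start=1):
        running *= value
        if running >= 1.0:            # track the largest k with a_1...a_k >= 1
            j, product_j = k, running
    if j == len(a):
        return float(j)               # case j = N: s = 0 by definition
    if a[j] <= 0.0:
        return float(j)               # degenerate case, convention assumed here
    # Unique s in [0, 1) with product_j * a_{j+1}**s == 1.
    s = math.log(product_j) / math.log(1.0 / a[j])
    return j + s


def henon_singular_values(x, b=0.3):
    """Singular values of the Jacobian [[-2x, b], [1, 0]]."""
    trace = 4.0 * x * x + 1.0 + b * b     # trace of J^T J
    det = b * b                           # determinant of J^T J
    disc = math.sqrt(trace * trace - 4.0 * det)
    return [math.sqrt(0.5 * (trace + disc)), math.sqrt(0.5 * (trace - disc))]
```

For example, `lyapunov_dimension(henon_singular_values(1.0))` lies strictly between 1 and 2, reflecting that the map expands lengths but contracts areas at that point. Taking the maximum over a set A, as in the definition of $\dim_L(f, A)$, would require a search over x that is not attempted here.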

A continuous variant of this method is used in
order to obtain estimates of fractal dimensions of
global attractors of dynamical systems $(X, S)$ on a
Hilbert space X. Here $S(t)$, $t \ge 0$, is a semigroup of
continuous operators on X, that is, $S(t + s) = S(t)S(s)$
and $S(0) = I$. A set A in X is called a global attractor
of the dynamical system if it is compact, attracting
(i.e., for any bounded set B and $\varepsilon > 0$ there exists $t_0$
such that for $t \ge t_0$ we have $S(t)B \subset A_\varepsilon$), and
strictly invariant (i.e., $S(t)A = A$ for all $t \ge 0$).


Applications in Dynamics 
Logistic Map 


M Feigenbaum, a mathematical physicist,
introduced and studied the dynamics of the logistic map
$f_\lambda : [0, 1] \to [0, 1]$, $f_\lambda(x) := \lambda x(1 - x)$, $\lambda \in (0, 4]$.
Taking $\lambda = \lambda_\infty \approx 3.570$, the corresponding invariant set
$A \subset [0, 1]$ (i.e., $S_1(A) \cup S_2(A) = A$, where $S_i$ are the two
branches of $f_\lambda^{-1}$) has both Hausdorff and box
dimensions equal to $\approx 0.538$ (P Grassberger 1981,
P Grassberger and I Procaccia, 1983). The set A
has Cantor-like structure, but is not self-similar.
Its multifractal properties have been studied by
U Frisch, K Khanin, and T Matsumoto (2004).


Smale Horseshoe 


In the early 1960s S Smale defined his famous
horseshoe map and showed that it has a strange
invariant set resulting in chaotic dynamics. The
notion of strange attractor was introduced in 1971
by Ruelle and Takens in their study of turbulence.
Let S be a square in the plane and let $f : \mathbb{R}^2 \to \mathbb{R}^2$ be
a map transforming S as indicated in Figure 3, such
that on both components of $S \cap f^{-1}(S)$ the map f is
affine and preserves both horizontal and vertical
directions, and such that points 1, 2, 3, and 4 are
mapped to 1', 2', 3', and 4'. Iterating f we get the
backward invariant set $\Lambda_- := \bigcap_{k=0}^{\infty} f^{-k}(S)$, the forward
invariant set $\Lambda_+ := \bigcap_{k=0}^{\infty} f^{k}(S)$, and the invariant set
(horseshoe) $\Lambda_f := \Lambda_+ \cap \Lambda_-$. These sets have the
Cantor set structure. More precisely, assuming that
the contraction parameter of f in the vertical direction is
$a \in (0, 1/2)$, and the expansion parameter in the
horizontal direction is $b > 2$, then $\Lambda_+ = [0, 1] \times C^{(a)}$,
where $C^{(a)}$ is the Cantor set, $\Lambda_- = C^{(1/b)} \times [0, 1]$, and
$\Lambda_f = C^{(1/b)} \times C^{(a)}$, so that
$\dim_B \Lambda_+ = \dim_H \Lambda_+ = 1 + (\log 2)/(\log(1/a))$ and

$$\dim_B \Lambda_f = \dim_H \Lambda_f = \frac{\log 2}{\log b} + \frac{\log 2}{\log(1/a)}$$

This is a special case of a general result about
horseshoes in $\mathbb{R}^2$ (not necessarily affine), due to
McCluskey and Manning (1983), stated in terms of
the pressure function. An analogous result can
be obtained for Smale solenoids. In $\mathbb{R}^3$ it is possible
to construct affine horseshoes $\Lambda_f$ such that
$\dim_H \Lambda_f < \dim_B \Lambda_f$ (M Pollicott and H Weiss, 1994).

Figure 3 The Smale horseshoe.

Smale discovered a connection between homoclinic
orbits and the horseshoe map. It has been noticed that
fractal dimensions have an important role in the study of
homoclinic bifurcations of nonconservative dynamical
systems. Since the 1970s the relationship between
invariants of hyperbolic sets and the typical dynamics
appearing in the unfolding of a homoclinic tangency
by a parametrized family of surface diffeomorphisms
has been studied by S Newhouse, J Palis, F Takens,
J-C Yoccoz, CG Moreira, and M Viana. The main
result is that if the Hausdorff dimension of the
hyperbolic set involved in the tangency is $< 1$ then
the parameter set where hyperbolicity prevails has
full Lebesgue density. If the Hausdorff dimension is
$> 1$, then hyperbolicity is not prevalent. This result and
its proof were inspired by previous work of
JM Marstrand (1954) about arithmetic differences of
Cantor sets on the real line. According to the result by
Moreira, Palis, and Viana (2001) the paradigm
"hyperbolicity prevails if and only if the Hausdorff
dimension is $< 1$" extends to homoclinic bifurcations
in any dimension.

Using methods of thermodynamic formalism,
McCluskey and Manning (1983) proved that if f
is the above horseshoe map, then there exists a
$C^1$-neighborhood U of f such that the mapping
$f \mapsto \dim_H \Lambda_f$ is continuous on U. Continuity of box and
Hausdorff dimensions for horseshoes has been
studied also by Takens, Palis, and Viana (1988).
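For the affine horseshoe described above, the dimensions follow from the Cantor set formula $\dim C^{(a)} = (\log 2)/(\log(1/a))$ applied to each factor of the product. The short Python sketch below evaluates them for given contraction and expansion parameters; the function names and the sample parameters are illustrative assumptions.

```python
import math


def cantor_dimension(a):
    """Dimension (log 2) / (log(1/a)) of the Cantor set C^(a), 0 < a < 1/2."""
    if not (0.0 < a < 0.5):
        raise ValueError("a must lie in (0, 1/2)")
    return math.log(2.0) / math.log(1.0 / a)


def horseshoe_dimensions(a, b):
    """Return (dim of the forward invariant set, dim of the horseshoe) for
    vertical contraction a in (0, 1/2) and horizontal expansion b > 2."""
    if b <= 2.0:
        raise ValueError("b must be greater than 2")
    dim_forward = 1.0 + cantor_dimension(a)                 # [0,1] x C^(a)
    dim_horseshoe = cantor_dimension(1.0 / b) + cantor_dimension(a)
    return dim_forward, dim_horseshoe
```

For instance, with $a = 1/3$ and $b = 3$ both Cantor factors are the middle-third set, so the horseshoe has dimension $2(\log 2)/(\log 3)$.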


Lorenz Attractor 


EN Lorenz (1963), a meteorologist and student of
G Birkhoff, showed by numerical experiments that
for certain values of positive parameters $\sigma$, r, b, the
quadratic system


$$\dot{x} = \sigma(y - x), \quad \dot{y} = rx - y - xz, \quad \dot{z} = xy - bz$$


has a global attractor A, for example, for $\sigma = 10$,
$r = 28$, $b = 8/3$. In this case $\dim_B A \approx 2.06$, which is
a numerical result (Grassberger and Procaccia,
1983). Using the analysis of local Lyapunov
dimension along the flow in A, G A Leonov (2001) showed
that if $\sigma + 1 \ge b \ge 2$ and
$r\sigma^2(4 - b) + 2\sigma(b - 1)(2\sigma - 3b) > b(b - 1)^2$ then


$$\overline{\dim}_B A \le 3 - \frac{2(\sigma + b + 1)}{\sigma + 1 + \sqrt{(\sigma - 1)^2 + 4r\sigma}}$$

Hénon Attractor 


M Hénon (1976), a theoretical astronomer, discovered
the map $f : \mathbb{R}^2 \to \mathbb{R}^2$, $f(x, y) := (a + by - x^2, x)$,
capturing several essential properties of the Lorenz
system. In the case of $a = 1.4$ and $b = 0.3$, Hunt
(1996) derived from [2] that for any compact, strictly
f-invariant set A in the trapping region $[-1.8, 1.8]^2$
there holds $\overline{\dim}_B A < 1.5$. Numerical experiments
show that $\dim_B A \approx 1.28$ (Grassberger, 1983).
Assuming $a > 0$, $b \in (0, 1)$, and $P_\pm = (x_\pm, x_\pm) \in A$, where $P_\pm$ are
the fixed points of f, Leonov (2001) obtained that


$$\overline{\dim}_B A \le 1 + \frac{1}{1 - \ln b / \ln\left(\sqrt{x_-^2 + b} - x_-\right)}$$


Here

$$x_\pm = \tfrac{1}{2}\left[\, b - 1 \pm \sqrt{(b - 1)^2 + 4a}\, \right]$$


The proof is based on the study of the local Lyapunov
dimension of f and its iterates on A.


Embedology 


The physical relevance of box dimensions in the
study of attractors is related to the problem of
finding the smallest possible dimension n sufficient
to "embed" an attractor into $\mathbb{R}^n$. If $A \subset \mathbb{R}^k$ is a
compact set and if $n > 2 \dim_B A$, then almost every
map from $\mathbb{R}^k$ into $\mathbb{R}^n$, in the sense of prevalence, is
one-to-one on A and, moreover, it is an embedding
on smooth manifolds contained in A (T Sauer,
JA Yorke, and M Casdagli, 1991). If A is a strange
attractor then the same is true for almost every
delay-coordinate map from $\mathbb{R}^k$ to $\mathbb{R}^n$. This improves
an earlier result by H Whitney (1936) and F Takens
(Takens' embedology, 1981). The above notion of
prevalence means the following: a property holds
almost everywhere in the sense of prevalence if it
holds on a subset S of the space $V := C^1(\mathbb{R}^k, \mathbb{R}^n)$ for
which there exists a finite-dimensional subspace
$E \subset V$ (probe space) such that for each $v \in V$ we
have that $v + e \in S$ for Lebesgue a.e. $e \in E$.
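In practice the condition $n > 2 \dim_B A$ is used to choose the number of delay coordinates when reconstructing an attractor from a scalar time series. The Python sketch below computes the smallest admissible integer n and builds delay vectors from a sampled series. The function names, the forward-delay convention, and the `lag` parameter are assumptions of this illustration; the theorem itself concerns maps of the state space, not any particular sampling scheme.

```python
import math


def minimal_embedding_dimension(box_dimension):
    """Smallest integer n with n > 2 * box_dimension."""
    return math.floor(2.0 * box_dimension) + 1


def delay_vectors(series, n, lag=1):
    """Delay-coordinate vectors (x_t, x_{t+lag}, ..., x_{t+(n-1)*lag})
    built from a scalar time series (forward delays, assumed convention)."""
    usable = len(series) - (n - 1) * lag
    return [tuple(series[t + i * lag] for i in range(n))
            for t in range(max(usable, 0))]
```

With the numerical value $\dim_B A \approx 2.06$ quoted above for the Lorenz attractor, `minimal_embedding_dimension(2.06)` gives 5, and with 1.28 for the Hénon attractor it gives 3.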


Julia and Mandelbrot Sets 


M Shishikura (1998) proved that the boundary of the
Mandelbrot set M generated by $f_c(z) := z^2 + c$ has
Hausdorff dimension equal to 2, thus answering
positively the conjecture by B Mandelbrot,
J Milnor, and other mathematicians. Also for Julia
sets there holds $\dim_H J(f_c) = 2$ for generic c in $\partial M$
(i.e., on a set of second Baire category). The proof is
based on the study of the bifurcation of parabolic
periodic points. Also, each baby Mandelbrot set sitting
inside of M has a boundary of Hausdorff dimension
2 (L Tan, 1998). Shishikura's results hold for more
general functions $f_c(z) := z^d + c$, where $d \ge 2$.

For Julia sets $J(f_c)$ generated by $f_c(z) := z^2 + c$
there holds
$d(c) := \dim_H J(f_c) = 1 + |c|^2/(4\log 2) + o(|c|^2)$
as $c \to 0$. This and more general results
have been obtained by Ruelle (1982). He also
proved that the function $d(c)$, when restricted to the
interval $[0,\infty)$, is real analytic in
$[0, 1/4) \cup (1/4, \infty)$.
Furthermore, it is left continuous at 1/4 (O Bodart
and M Zinsmeister, 1996), but not continuous
(A Douady, P Sentenac, and M Zinsmeister, 1997).
Discontinuity of this map is related to the phenomenon
of parabolic implosion at $c = 1/4$. The derivative
$d'(c)$ tends to $+\infty$ from the left at $c = 1/4$ like
$(1/4 - c)^{d(1/4) - 3/2}$ (G Havard and M Zinsmeister,
2000). Here $d(1/4) \approx 1.07$, which is a numerical
result. Analysis of dimensions is based on methods
of thermodynamic formalism.

C McMullen (1998) showed that if $\theta$ is an
irrational number of bounded type (i.e., its continued
fraction expansion $[a_1, a_2, \ldots]$ is such that the
sequence $(a_i)$ is bounded from above) and
$f(z) = z^2 + e^{2\pi i\theta} z$, then the Julia set
$J(f)$ is porous. In particular, $\dim_B J(f) < 2$.
YC Yin (2000) showed that if all critical points in
$J(f)$ of a rational map
$f : \overline{\mathbb{C}} \to \overline{\mathbb{C}}$ are
nonrecurrent (a point is nonrecurrent
if it is not contained in its $\omega$-limit set) then
$J(f)$ is porous, hence $\dim_B J(f) < 2$. Urbanski and
Przytycki (2001) described more general rational maps such
that $\dim_B J(f) < 2$.


Spiral Trajectories 


A standard planar model where the Hopf-Takens
bifurcation occurs is
$\dot r = r\left(r^{2l} + \sum_{i=0}^{l-1} a_i r^{2i}\right)$,
$\dot\varphi = 1$, where $l \in \mathbb{N}$. If $\Gamma$ is
a spiral tending to the limit
cycle $r = a$ of multiplicity $m$ (i.e., $r = a$ is a zero of
order $m$ of the right-hand side of the first equation
in the system) then $\dim_B \Gamma = 2 - 1/m$. Furthermore,
for $m > 1$ the spiral is Minkowski measurable
(Žubrinić and Županović, 2005). For $m = 1$ the
spiral is Minkowski nondegenerate with respect to
the gauge function
$h(\varepsilon) := \varepsilon\,(\log(1/\varepsilon))^{-1}$.


Infinite-Dimensional Dynamical Systems 


In many situations the dynamics on the global attractor
$A$ of the flow corresponding to an autonomous
Navier-Stokes system is finite-dimensional
(Ladyzhenskaya, 1972). This means that there exists a
positive integer $N$ such that any trajectory in $A$ is
completely determined by its orthogonal projection
onto an $N$-dimensional subspace of a Hilbert space $X$.
The aim is to find estimates of box and Hausdorff
dimensions of the global attractor, in order to
understand some of the basic and challenging problems of
turbulence theory. If $A$ is a subset of a Hilbert space
$X$, its Hausdorff dimension is defined analogously as for
$A \subset \mathbb{R}^N$. The definition of the upper box
dimension can be extended from $A \subset \mathbb{R}^N$ to

\[ \overline{\dim}_B A := \limsup_{\varepsilon \to 0} \frac{\log m(A,\varepsilon)}{\log(1/\varepsilon)} \qquad [3] \]

where $m(A,\varepsilon)$ is the minimal number of balls
of radius $\varepsilon$ sufficient to cover a given compact
set $A \subset X$. The value of $\log m(A,\varepsilon)$ is
called the $\varepsilon$-entropy of $A$.
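
To make the counting in the box-dimension formula concrete, the following Python sketch evaluates $\log m/\log(1/\varepsilon)$ for the middle-third Cantor set. It rests on two stated assumptions: grid intervals of length $3^{-j}$ are counted instead of a minimal cover by balls (a common substitute in $\mathbb{R}^N$), and the set is represented by the left endpoints of its level-`level` intervals in exact integer arithmetic. All function names are hypothetical.

```python
import math


def cantor_left_endpoints(level):
    """Left endpoints of the level-`level` Cantor intervals.

    Endpoints are stored as integers in units of 3**-level, so no
    floating-point rounding enters the counting step.
    """
    points = [0]
    for _ in range(level):
        points = [3 * p for p in points] + [3 * p + 2 for p in points]
    return points


def box_count(points, level, j):
    """Number of grid intervals of length 3**-j (j <= level) hit by the set."""
    scale = 3 ** (level - j)
    return len({p // scale for p in points})


def box_dimension_estimate(level, j):
    """log(count) / log(1/eps) with eps = 3**-j."""
    points = cantor_left_endpoints(level)
    return math.log(box_count(points, level, j)) / math.log(3 ** j)


estimate = box_dimension_estimate(level=8, j=6)
```

For this self-similar set the ratio does not depend on `j` and equals $\log 2/\log 3$; for a generic attractor one would instead fit a slope over a range of scales.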

Foias and Temam (1979), Ladyzhenskaya (1982),
AV Babin and MI Vishik (1982), Ruelle (1983),
and E Lieb (1984) were among the first to
obtain explicit upper bounds of Hausdorff and
box dimensions of attractors of infinite-dimensional
systems. For global attractors $A$ associated with
some classes of two-dimensional Navier-Stokes
equations with nonhomogeneous boundary conditions
it can be shown that
$\dim_B A \le c_1 G + c_2\,\mathrm{Re}^{3/2}$,
where $G$ is the Grashof number, Re is the Reynolds
number, and $c_i$ are positive constants (RM Brown,
PA Perry, and Z Shen, 2000). VV Chepyzhov and
AA Ilyin (2004) obtained that
$\dim_B A \le (1/\sqrt{2\pi})(\lambda_1|\Omega|)^{1/2}\,G$
for equations with homogeneous boundary conditions, where
$\Omega \subset \mathbb{R}^2$ is a
bounded domain, and $\lambda_1$ is the first eigenvalue of
$-\Delta$. In the case of periodic boundary conditions
Constantin, Foias, and Temam (1988) proved that
$\dim_B A \le c_1 G^{2/3}(1 + \log G)^{1/3}$, while for a
special class of external forces there holds
$\dim_H A \ge c_2 G^{2/3}$
(VX Liu, 1993). Let us mention an open problem posed by
VI Arnol'd: is it true that the Hausdorff dimension
of any attracting set of the Navier-Stokes equation
on the two-dimensional torus grows with the
Reynolds number?

In their study of partial regularity of solutions
of three-dimensional Navier-Stokes equations,
L Caffarelli, R Kohn, and L Nirenberg (1982)
proved that the one-dimensional Hausdorff measure
in space and time (defined by parabolic
cylinders) of the singular set of any “suitable”
weak solution is equal to zero. A weak solution is
said to be singular at a point $(x_0, t_0)$ if it is
essentially unbounded in any of its neighborhoods.
Dimensions of attractors of many other classes of
partial differential equations (PDEs) have been
studied, such as reaction-diffusion systems, wave
equations with dissipation, and complex
Ginzburg-Landau equations. Related questions for
nonautonomous PDEs have been considered by
VV Chepyzhov and MI Vishik since 1992.


Probability 


Important examples of trajectories appearing in
physics are provided by Brownian motions. Brownian
motions $\omega$ in $\mathbb{R}^N$, $N \ge 2$, have paths
$\omega([0,1])$ of Hausdorff dimension 2 with probability 1,
and they are almost surely Hausdorff degenerate, since
$\mathcal{H}^2(\omega([0,1])) = 0$ for a.e. $\omega$
(SJ Taylor, 1953). Defining gauge functions
$h(\varepsilon) := \varepsilon^2 \log(1/\varepsilon)\,\log\log\log(1/\varepsilon)$
when $N = 2$, and
$h(\varepsilon) := \varepsilon^2 \log\log(1/\varepsilon)$
when $N \ge 3$, there holds
$\mathcal{H}^h(\omega([0,1])) \in (0,\infty)$ for
a.e. $\omega$ (D Ray, 1963, SJ Taylor, 1964). If $N = 1$
then a.e. $\omega$ has the box and Hausdorff dimensions of
the graph of $\omega|_{[0,1]}$ equal to 3/2 (Taylor, 1953),
and for the gauge function
$h(\varepsilon) := \varepsilon^{3/2}\log\log(1/\varepsilon)$ the
corresponding generalized Hausdorff measure is
nondegenerate. In the case of $N \ge 2$ we have the
uniform dimension doubling property (R Kaufman,
1969). This means that for a.e. Brownian motion $\omega$
there holds $\dim_H \omega(A) = 2\dim_H A$ for all subsets
$A \subset [0,\infty)$. There are also results concerning
the almost sure Hausdorff dimension of double, triple, and
multiple points of a Brownian motion and of more
general Lévy stable processes.

Fractal dimensions also appear in the study of
stochastic differential equations, like

\[ dx_t = X_0(x_t)\,dt + \sum_{k=1}^{d} X_k(x_t)\,d\theta_k(t), \qquad x_0 = x \in \mathbb{R}^N \]

The stochastic flow $(x_t)_{t \ge 0}$ in $\mathbb{R}^N$ is
driven by a Brownian motion $(\theta(t))_{t \ge 0}$ in
$\mathbb{R}^d$. Let us assume that
$X_k$, $k = 0, \ldots, d$, are $C^\infty$-smooth T-periodic
divergence-free vector fields on $\mathbb{R}^N$. Then for
almost every realization of the Brownian motion
$(\theta(t))_{t \ge 0}$, the set of
initial points $x$ generating the flow $(x_t)_{t \ge 0}$ with
linear escape to infinity (i.e.,
$\lim_{t\to\infty}(|x_t|/t) > 0$) is dense
and of full Hausdorff dimension $N$ (D Dolgopyat,
V Kaloshin, and L Koralov, 2002).


Other Directions 


There are many other fractal dimensions important 
for dynamics, like the Rényi spectrum for dimen- 
sions, correlation dimension, information dimen- 
sion, Hentschel-Procaccia spectrum for dimensions, 
packing dimension, and effective fractal dimension. 
Relations between dimension, entropy, Lyapunov 
exponents, Gibbs measures, and multifractal rigidity 
have been investigated by Pesin, Weiss, Barreira, 
Schmeling, etc. Fractal dimensions are used to study 
dynamics appearing in Kleinian groups (D Sullivan, 
CJ Bishop, P W Jones, C McMullen, BO Stratmann, 
etc.), quasiconformal mappings and quasiconfor- 
mal groups (FW Gehring, J Vaisala, K Astala, 
CJ Bishop, P Tukia, JW Anderson, P Bonfert-Taylor, 
EC Taylor, etc.), graph directed Markov systems 
(RD Mauldin, M Urbanski, etc.), random walks on 
fractal graphs (J Kigami, A Telcs, etc.), billiards 
(H Masur, Y Cheung, P Bálint, S Tabachnikov, 
N Chernov, D Szász, IP Toth, etc.), quantum 
dynamics (J-M Barbaroux, J-M Combes, H 
Schulz-Baldes, I Guarneri, etc.), quantum gravity 
(M Aizenman, A Aharony, ME Cates, T A Witten, 
GF Lawler, B Duplantier, etc.), harmonic analysis 
(RS Strichartz, ZM Balogh, JT Tyson, etc.), 
number theory (L Barreira, M Pollicott, H Weiss,
B Stratmann, B Saussol, etc.), Markov processes 
(RM Blumenthal, R Getoor, SJ Taylor, S Jaffard, 
C Tricot, Y Peres, Y Xiao, etc.), and theoretical 
computer science (B Ya Ryabko, L Staiger, 
JH Lutz, E Mayordomo, etc.), and so on. 


See also: Bifurcations of Periodic Orbits; Chaos and 
Attractors; Dissipative Dynamical Systems of Infinite 
Dimension; Dynamical Systems in Mathematical Physics: 
An Illustration from Water Waves; Ergodic Theory; 
Generic Properties of Dynamical Systems; Holomorphic 
Dynamics; Homoclinic Phenomena; Hyperbolic 
Dynamical Systems; Image Processing: Mathematics; 
Lyapunov Exponents and Strange Attractors; Partial 
Differential Equations: Some Examples; Polygonal 
Billiards; Quantum Ergodicity and Mixing of 
Eigenfunctions; Stochastic Differential Equations; 
Synchronization of Chaos; Universality and 
Renormalization; Wavelets: Applications; Wavelets: 
Mathematical Theory. 
Introduction


Interacting particles sometimes collectively behave
in ways that take us by complete surprise. In a
superfluid, $^4$He atoms flow without viscosity,
and in a superconductor electrons flow without
resistance. Such behaviors announce emergent
structures and principles which have often found
applications in other areas. This article concerns
the surprising collective effects that occur when
electrons are confined in two dimensions and
subjected to a strong transverse magnetic field.
At low temperatures, the Hall resistance (defined
below) exhibits plateaus on which it is precisely
quantized at

\[ R_H = \frac{h}{f e^2} \qquad [1] \]

where $h$ and $e$ are fundamental constants and $f$ is a
plateau-specific rational fraction. This phenomenon
is known as the “fractional quantum Hall effect”
(FQHE), or, after its discoverers, the
“Tsui-Stormer-Gossard” (TSG) effect. The underlying state
provides a new paradigm for collective behavior in nature,
and is understood in terms of a new class of quasiparticles
known as “composite fermions,” which are topological
bound states of electrons and quantized vortices.
This article will outline the basics of the experimental
phenomenology and our theoretical understanding of
this effect.


The Hall Effect 


Ohm's law, $I = V/R$, tells us that the current
through a resistor is proportional to the applied
voltage. The local form of the law is

\[ \mathbf{J} = \sigma \mathbf{E} \qquad [2] \]

where $\sigma$ is the conductivity, and
$\mathbf{J} = q\rho\mathbf{v}$ is the
current density for particles of charge $q$ and density
$\rho$ moving with a velocity $\mathbf{v}$.

In 1879, E H Hall discovered that in the presence of
crossed electric and magnetic fields ($\mathbf{E}$ and
$\mathbf{B}$), the current flows in a direction
“perpendicular” to the plane containing the two fields.
Alternatively, the passage of current induces a voltage
perpendicular to the direction of the current flow. This is
known as the Hall effect (see Figure 1). The phenomenon has
a classical origin. A consequence of the Lorentz force
law of electrodynamics,

\[ \mathbf{F} = q\left(\mathbf{E} + \frac{1}{c}\,\mathbf{v}\times\mathbf{B}\right) \qquad [3] \]

which gives the force on a particle of charge $q$
moving with a velocity $\mathbf{v}$, is that for crossed
electric and magnetic fields the particle drifts in the
direction $\mathbf{E}\times\mathbf{B}$ with a velocity
$v = cE/B$. The current density is therefore given by
$J = q\rho v$, where $\rho$ is the (three-dimensional)
density of particles. That produces the Hall resistivity

\[ \rho_H = \frac{E_y}{J_x} = \frac{B}{\rho q c} \qquad [4] \]
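
The drift velocity and the Hall resistivity quoted above follow from the Lorentz force in two short steps. The block below is a worked sketch in Gaussian units; the choice of axes ($\mathbf{E}$ along $y$, $\mathbf{B}$ along $z$) is an assumption made only to fix signs.

```latex
% Steady drift: the net Lorentz force vanishes.
\mathbf{F} = 0 \;\Rightarrow\; \mathbf{E} = -\frac{1}{c}\,\mathbf{v}\times\mathbf{B}
% Take the cross product with B and use (v x B) x B = B (v . B) - v B^2,
% with v perpendicular to B:
\mathbf{E}\times\mathbf{B} = \frac{B^{2}}{c}\,\mathbf{v}
\;\Rightarrow\;
\mathbf{v} = c\,\frac{\mathbf{E}\times\mathbf{B}}{B^{2}}, \qquad |\mathbf{v}| = \frac{cE}{B}
% With E = E_y \hat{y} and B = B \hat{z} the drift is along \hat{x}:
J_x = q\rho\,\frac{cE_y}{B}
\;\Rightarrow\;
\rho_H = \frac{E_y}{J_x} = \frac{B}{\rho q c}
```

The Hall resistivity is therefore linear in $B$ and inversely proportional to the carrier density, which is the classical expectation that the plateaus described later violate.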


The von Klitzing Effect 


Molecular beam epitaxy allows controllable layer
by layer growth in which one type of semiconductor,
say GaAs, can be grown on top of another, say
Al$_x$Ga$_{1-x}$As, to produce an atomically sharp
interface. By appropriately doping such structures,
electrons can be captured at the interface, thus producing
a two-dimensional electron system (2DES). We note that
these are three-dimensional electrons confined to move
in two dimensions. The interaction has the standard
Coulomb form $V(r) = e^2/\epsilon r$, where $\epsilon$ is
the dielectric constant of the host material. (In a
hypothetical world which has only two space dimensions,
the interaction would be logarithmic.)

Figure 1 Schematics of magnetotransport measurement. $I$,
$V_L$, and $V_H$ are the current, longitudinal voltage, and
the Hall voltage, respectively. The longitudinal and Hall
resistances are defined as $R_L = V_L/I$ and $R_H = V_H/I$.

The “integral quantum Hall effect” (IQHE) or the
“von Klitzing effect” was discovered unexpectedly
by von Klitzing and collaborators in 1980, in their
study of the Hall effect in a 2DES. In two dimensions,
one defines the Hall resistance as

\[ R_H = \frac{V_H}{I} \qquad [5] \]

which, from classical electrodynamics, is expected to
be proportional to the magnetic field $B$. That is indeed
the case at small magnetic fields. At sufficiently high $B$,
however, quantum mechanical effects appear in a
dramatic manner. The essential observations are as
follows.


1. When plotted as a function of the magnetic field
$B$, the Hall resistance exhibits numerous plateaus. On
any given plateau, $R_H$ is precisely quantized with
values given by

\[ R_H = \frac{h}{n e^2} \qquad [6] \]

where $n$ is an integer (hence the name “integral
quantum Hall effect”). The plateau occurs in the
vicinity of $\nu \equiv \rho h c/eB = n$, where $\nu$ is
the “filling factor” (defined below).

2. In the plateau region, the longitudinal resistance
exhibits an Arrhenius behavior:

\[ R_L \sim \exp\left(-\frac{\Delta}{2 k_B T}\right) \qquad [7] \]

This gives a filling-factor dependent energy scale
$\Delta$, which indicates the presence of a gap in the
excitation spectrum. $R_L$ vanishes in the limit $T \to 0$.
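
The activated form of the longitudinal resistance is what allows a gap to be read off from transport data. The Python sketch below shows the arithmetic under the assumption that $R_L \propto \exp(-\Delta/2k_BT)$ with a temperature-independent prefactor; the function names and the synthetic numbers are hypothetical and are not taken from any experiment.

```python
import math

K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def activated_resistance(gap_ev, temperature_k, prefactor=1.0):
    """R_L = prefactor * exp(-gap / (2 k_B T)), the assumed Arrhenius form."""
    return prefactor * math.exp(-gap_ev / (2.0 * K_B_EV_PER_K * temperature_k))


def gap_from_two_points(t1, r1, t2, r2):
    """Gap from the slope of ln(R_L) against 1/T between two measurements."""
    slope = (math.log(r1) - math.log(r2)) / (1.0 / t1 - 1.0 / t2)
    return -2.0 * K_B_EV_PER_K * slope


# Synthetic data generated from a known gap, then recovered.
true_gap = 1.0e-3  # eV, an arbitrary illustrative value
r_a = activated_resistance(true_gap, 0.5)
r_b = activated_resistance(true_gap, 1.0)
estimated_gap = gap_from_two_points(0.5, r_a, 1.0, r_b)
```

With real data one would fit a straight line to $\ln R_L$ versus $1/T$ over many temperatures rather than use two points.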


The absolute accuracy of the quantization has 
been established to a few parts in 10? for lo 
uncertainty, and the relative accuracy to a few 
parts in 10!?, There is presently no known 
“intrinsic” correction to the quantization. Perhaps
the most remarkable aspect of the effect is its
universality. It is independent of the sample type,
geometry, various material parameters (the band
mass of the electron or the dielectric constant of the
semiconductor), and disorder. The combination
$h/e^2$ also occurs in the definition of the fine
structure constant $\alpha = e^2/\hbar c$, the value of
which is approximately 1/137. The Hall effect
measurements in dirty, solid state systems thus provide
one of the most accurate values for $\alpha$. Finally, the
lack of resistance at $T = 0$ is to be contrasted with
ordinary metals, for which the resistance at $T = 0$,
called the residual resistance, is finite and
proportional to disorder.
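
For orientation, the plateau values can be evaluated numerically. The sketch below uses the SI values of $h$ and $e$ (exact by definition since the 2019 SI revision) and returns the plateau resistance in ohms; the function name is hypothetical, and the same formula covers integer plateaus and the fractional ones discussed in the next section.

```python
from fractions import Fraction

H_PLANCK = 6.62607015e-34   # Planck constant, J s
E_CHARGE = 1.602176634e-19  # elementary charge, C


def hall_plateau_resistance(f):
    """Hall resistance h / (f e^2) in ohms for a plateau labelled by f."""
    return H_PLANCK / (float(f) * E_CHARGE ** 2)


r_integer = hall_plateau_resistance(1)               # h / e^2
r_third = hall_plateau_resistance(Fraction(1, 3))    # three times h / e^2
```

The value at $f = 1$, about 25.8 kΩ, is the resistance scale against which all plateaus are measured.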


The TSG Effect 


The next revolution occurred in 1982 with the
discovery of the TSG effect, that is, plateaus on
which the Hall resistance is quantized at values
given by eqn [1] (see Figure 2). The observation
of the $R_H = h/fe^2$ plateau is often referred to as
the observation of the fraction $f$. Improvement
of experimental conditions has led to the observation
of a large number of fractions over the
years, revealing the richness of the TSG effect.
At the time of the writing of this article, the
number of observed fractions is more than 50 if
one counts only fractions below unity. As in the
von Klitzing effect, the longitudinal resistance
exhibits an Arrhenius behavior, vanishing in the
limit $T \to 0$.


Landau Levels


The Hamiltonian for a nonrelativistic electron
moving in two space dimensions in a perpendicular
magnetic field is given by

\[ H = \frac{1}{2m_b}\left(\mathbf{p} + \frac{e\mathbf{A}}{c}\right)^2 \qquad [8] \]

Here, $m_b$ is the electron's band mass and $-e$ its
charge. For a uniform magnetic field, the vector
potential $\mathbf{A}$ satisfies

\[ \nabla\times\mathbf{A} = B\hat{\mathbf{z}} \qquad [9] \]

Because $\mathbf{A}$ is a linear function of the spatial
coordinates, it follows that $H$ is a generalized
two-dimensional harmonic oscillator Hamiltonian which
is quadratic in both the spatial coordinates and in
the canonical momentum $\mathbf{p} = -i\hbar\nabla$, and
therefore can be diagonalized exactly.

A convenient gauge choice is the symmetric gauge:

\[ \mathbf{A} = \frac{\mathbf{B}\times\mathbf{r}}{2} = \frac{B}{2}(-y, x, 0) \qquad [10] \]


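With the magnetic length $\ell = \sqrt{\hbar c/eB}$ and the
cyclotron energy $\hbar\omega_c = \hbar eB/m_b c$ chosen as
the units for length and energy, the Hamiltonian can be
expressed as

\[ H = \frac{1}{2}\left[\left(-i\frac{\partial}{\partial x} - \frac{y}{2}\right)^2 + \left(-i\frac{\partial}{\partial y} + \frac{x}{2}\right)^2\right] \qquad [11] \]

Choosing as independent variables

\[ z = x - iy, \qquad \bar z = x + iy \qquad [12] \]

we get

\[ H = \frac{1}{2}\left[-4\frac{\partial^2}{\partial z\,\partial\bar z} + \frac{1}{4}z\bar z - z\frac{\partial}{\partial z} + \bar z\frac{\partial}{\partial\bar z}\right] \qquad [13] \]

Figure 2 The TSG effect. The Hall resistance ($R_H$) exhibits many precisely quantized plateaus, concurrent with minima in the
longitudinal resistance ($R_L$), plotted against magnetic field (T). Reproduced with permission from Perspectives in Quantum Hall Effects; HL Stormer and DC Tsui;
SD Sarma and A Pinczuk (eds.); Copyright © 1997, Wiley. Reprinted with permission of John Wiley & Sons, Inc.
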

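Now define the following sets of ladder operators:

\[ b = \frac{1}{\sqrt{2}}\left(\frac{\bar z}{2} + 2\frac{\partial}{\partial z}\right) \qquad [14] \]

\[ b^\dagger = \frac{1}{\sqrt{2}}\left(\frac{z}{2} - 2\frac{\partial}{\partial\bar z}\right) \qquad [15] \]

\[ a^\dagger = \frac{1}{\sqrt{2}}\left(\frac{\bar z}{2} - 2\frac{\partial}{\partial z}\right) \qquad [16] \]

\[ a = \frac{1}{\sqrt{2}}\left(\frac{z}{2} + 2\frac{\partial}{\partial\bar z}\right) \qquad [17] \]

which have the property that

\[ [a, a^\dagger] = 1, \qquad [b, b^\dagger] = 1 \qquad [18] \]

and all the other commutators are zero. In terms of
these operators, the Hamiltonian can be written as

\[ H = a^\dagger a + \frac{1}{2} \qquad [19] \]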


The eigenvalue of $a^\dagger a$ is an integer, $n$, called
the Landau level (LL) index. The $z$-component of the
canonical angular momentum operator, the only
relevant component for the two-dimensional problem,
is defined as

\[ L_z = -i\hbar\frac{\partial}{\partial\theta} = -\hbar\left(z\frac{\partial}{\partial z} - \bar z\frac{\partial}{\partial\bar z}\right) \qquad [20] \]

where $\theta$ is the polar angle ($z = re^{-i\theta}$).
Exploiting the property $[H, L_z] = 0$, the eigenfunctions
will be chosen to diagonalize $H$ and $L_z$ simultaneously.
The eigenvalue of $L_z$ will be denoted by $-m\hbar$. The
analogy to the harmonic oscillator problem immediately
gives the solution

\[ H|n,m\rangle = E_n|n,m\rangle \qquad [21] \]

where

\[ E_n = n + \frac{1}{2} \qquad [22] \]

and

\[ |n,m\rangle = \frac{(b^\dagger)^{m+n}}{\sqrt{(m+n)!}}\,\frac{(a^\dagger)^{n}}{\sqrt{n!}}\,|0,0\rangle \qquad [23] \]

where $m = -n, -n+1, \ldots$. The single-particle orbital
at the bottom of the two ladders defined by the two sets
of raising and lowering operators is

\[ \langle \mathbf{r}|0,0\rangle \equiv \eta_{0,0}(\mathbf{r}) = \frac{1}{\sqrt{2\pi}}\,e^{-z\bar z/4} \qquad [24] \]

which satisfies

\[ a|0,0\rangle = b|0,0\rangle = 0 \qquad [25] \]

The single-particle states are particularly simple in
the lowest Landau level ($n = 0$):

\[ \eta_{0,m}(\mathbf{r}) = \langle \mathbf{r}|0,m\rangle = \frac{z^m e^{-z\bar z/4}}{\sqrt{2\pi\,2^m\,m!}} \qquad [26] \]

Aside from the ubiquitous Gaussian factor, a general
state in the lowest Landau level is given by a
polynomial of $z$; it does not involve any $\bar z$. In
other words, apart from the Gaussian factor, the lowest
Landau level wave functions are analytic functions of $z$.



Landau Level Degeneracy 


The state $\eta_{0,m}(\mathbf{r})$ is peaked strongly at
$r = \sqrt{2m}\,\ell$. Neglecting order-1 effects, there are
$m$ states in the lowest Landau level in a disk of radius
$r = \sqrt{2m}\,\ell$, giving a degeneracy of
$(2\pi\ell^2)^{-1}$ per unit area per
Landau level. (The same degeneracy is obtained for
higher Landau levels as well.) It is equal to $B/\phi_0$,
where $\phi_0 = hc/e$ is called the flux quantum, that is,
there is one state per flux quantum in each Landau
level.


Filling Factor

The number of filled Landau levels, called the filling
factor, is given by

\[ \nu = \rho\,2\pi\ell^2 = \frac{\rho\phi_0}{B} \qquad [27] \]
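
The following Python sketch puts numbers on the magnetic length, the Landau level degeneracy, and the filling factor. It is written in SI units, which is an assumption of the example: the article uses Gaussian units, and in SI the factor $c$ drops out, so that $\ell = \sqrt{\hbar/eB}$ and $\phi_0 = h/e$. The function names are hypothetical.

```python
import math

H_PLANCK = 6.62607015e-34          # Planck constant, J s
HBAR = H_PLANCK / (2.0 * math.pi)  # reduced Planck constant, J s
E_CHARGE = 1.602176634e-19         # elementary charge, C
FLUX_QUANTUM = H_PLANCK / E_CHARGE  # phi_0 = h/e in SI, Wb


def magnetic_length(b_tesla):
    """Magnetic length in metres (SI form, sqrt(hbar / (e B)))."""
    return math.sqrt(HBAR / (E_CHARGE * b_tesla))


def states_per_area(b_tesla):
    """Degeneracy per unit area per Landau level, 1 / (2 pi l^2)."""
    return 1.0 / (2.0 * math.pi * magnetic_length(b_tesla) ** 2)


def filling_factor(density_per_m2, b_tesla):
    """nu = rho * phi_0 / B for a two-dimensional density rho."""
    return density_per_m2 * FLUX_QUANTUM / b_tesla


b_field = 10.0                                  # tesla
length = magnetic_length(b_field)               # about 8.1 nm
degeneracy = states_per_area(b_field)           # equals B / phi_0
nu = filling_factor(degeneracy / 3.0, b_field)  # one third of a level filled
```

At 10 T the magnetic length is a few nanometres, and a density equal to one third of the degeneracy gives $\nu = 1/3$, the filling of the most prominent fractional plateau.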


The Origin of Plateaus 


The von Klitzing effect can be explained in terms of 
a model which neglects the interactions between 
electrons. It occurs because the ground state at an 
integral filling is unique and nondegenerate, sepa- 
rated from excitations by a gap. Laughlin (1981) 
showed that the disorder-induced Anderson locali- 
zation also plays a crucial role in the establishment 
of the Hall plateaus. To see this, imagine changing 
the filling away from an integer by adding some 
electrons or holes. In a perfect system, the additional
particles would also be free to carry current, but in
the actual, disordered sample, they are immobilized
by impurities (which create localized states in the
energy gap), and do not contribute to transport. The
transport properties therefore remain unaffected as
the filling factor is varied slightly away from an
integer, and the system continues to behave as
though it had filled shells.


The Lowest Landau Level Problem 


The TSG effect arises due to interelectron interaction.
We wish to obtain solutions for the Schrödinger
equation

\[ H\Psi = E\Psi \qquad [28] \]

at an arbitrary filling $\nu$, where

\[ H = \sum_j \frac{1}{2m_b}\left[\frac{\hbar}{i}\nabla_j + \frac{e}{c}\mathbf{A}(\mathbf{r}_j)\right]^2 + \frac{e^2}{\epsilon}\sum_{j<k}\frac{1}{|\mathbf{r}_j - \mathbf{r}_k|} \qquad [29] \]

The first term on the right-hand side is the kinetic
energy in the presence of a constant external
magnetic field $\mathbf{B} = \nabla\times\mathbf{A}$, and the
second term is the Coulomb interaction energy. (The Zeeman
energy is not included explicitly because we consider,
for now, magnetic fields that are sufficiently
high that only fully spin-polarized states are
relevant.) It is convenient to consider the limit
$(e^2/\epsilon\ell)/(\hbar\omega_c) \to 0$, when the Coulomb
interaction is so weak that it is not able to cause Landau
level mixing, so electrons can be taken to be within the
lowest Landau level. The kinetic energy then is an
irrelevant constant which can be thrown away, and
the Hamiltonian reduces to

\[ H = P_{LLL}\,\frac{e^2}{\epsilon}\sum_{j<k}\frac{1}{|\mathbf{r}_j - \mathbf{r}_k|}\,P_{LLL} \qquad [30] \]

“which must be solved with the lowest LL restriction,”
as explicitly indicated by the lowest LL
projection operator $P_{LLL}$. The problem is thus
mathematically well defined, but requires degenerate
perturbation theory in an enormously large Hilbert
space, with $\binom{N/\nu}{N}$ many-particle basis vectors.
The usual perturbative techniques are not useful due to
the absence of a small parameter in the problem;
$e^2/\epsilon\ell$ merely sets the energy scale in the
lowest Landau level.


Composite-Fermion Theory 


Inspired by the qualitative similarity between the
integral and the fractional Hall effects, the
composite-fermion (CF) theory (Jain 1989) postulates that
the eigenfunctions of interacting electrons at filling
factor $\nu$, $\Psi_\nu$, are related to the (known)
eigenfunctions of noninteracting electrons at filling factor
$\nu^*$, $\Phi_{\nu^*}$, according to

\[ \Psi_\nu = P_{LLL}\,\Phi_{\nu^*}\prod_{j<k}(z_j - z_k)^{2p} \qquad [31] \]

where $P_{LLL}$ denotes projection of the wave function
on its right into the lowest Landau level. The filling
factors are related by

\[ \nu = \frac{\nu^*}{2p\nu^* + 1} \qquad [32] \]

which can be seen as follows: the largest power of $z_1$
in $\Phi_{\nu^*}$ (neglecting order-one corrections) is
$N/\nu^*$, as follows from the definition of the filling
factor. The largest power of $z_1$ on the right-hand side is
therefore $2p(N - 1) + N/\nu^*$. This is the number of
flux quanta penetrating the “sample.” Dividing it by
$N$ and taking the limit $N \to \infty$ gives the inverse of
the filling factor, $\nu^{-1}$. These wave functions are now
known to capture the correct nonperturbative
physics of the TSG effect (see below), and also
to provide extremely accurate representations for
the actual correlated ground states and their excitations.
They recover Laughlin's 1983 wave function
for the ground state at $\nu = 1/(2p + 1)$, while clarifying
that it is a part of a much bigger conceptual
structure.
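
The power-counting argument for the relation between $\nu$ and $\nu^*$ can be checked with exact rational arithmetic. The sketch below assumes the count stated in the text, namely $2p(N-1)$ from the Jastrow factor plus $N/\nu^*$ from $\Phi_{\nu^*}$, and ignores order-one corrections; the function names are hypothetical.

```python
from fractions import Fraction


def largest_power(n_particles, nu_star, p):
    """Largest power of z_1: 2p(N - 1) from the Jastrow factor plus N / nu*."""
    return 2 * p * (n_particles - 1) + Fraction(n_particles) / nu_star


def finite_size_filling(n_particles, nu_star, p):
    """N divided by the number of flux quanta for a finite system."""
    return Fraction(n_particles) / largest_power(n_particles, nu_star, p)


def limiting_filling(nu_star, p):
    """The N -> infinity limit, nu = nu* / (2 p nu* + 1)."""
    return Fraction(nu_star) / (2 * p * nu_star + 1)


small = finite_size_filling(10, nu_star=2, p=1)       # 10/23
large = finite_size_filling(10**6, nu_star=2, p=1)    # very close to 2/5
limit = limiting_filling(nu_star=2, p=1)              # exactly 2/5
```

The finite-size value approaches the limit with a correction of order $1/N$, which is why the order-one terms can be neglected.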


Physical Interpretation 


The crucial property of the wave function in eqn [31]
is that the complex Jastrow factor
$\prod_{j<k}(z_j - z_k)^{2p}$
binds $2p$ vortices on each electron. More precisely,
each electron sees $2p$ vortices on every other electron,
in that a complete loop of an electron around any other
electron produces a phase of $2\pi \times 2p$. The bound
state is interpreted as a particle, called the “composite
fermion.” Because the vortex is a topological object, so
is the composite fermion. The vorticity $2p$ is quantized
to be an even integer, as required by the
single-valuedness and antisymmetry requirements of quantum
mechanics, which will be seen to lie at the root of
the exact quantization of the Hall resistance.

When composite fermions move about, they experience,
in addition to the Aharonov-Bohm (AB) phase,
also the Berry phases coming from vortices on other
composite fermions. Imagine taking a composite
fermion in a closed loop enclosing an area $A$. The
phase associated with that loop is given by

\[ \Phi^* = -2\pi\left(\frac{BA}{\phi_0} - 2pN_{enc}\right) \qquad [33] \]

where $N_{enc}$ is the number of composite fermions
inside the loop. The first term is the familiar AB
phase due to a charge going around in a loop. The
second is the Berry phase due to $2p$ vortices going
around $N_{enc}$ particles, with each particle producing
a phase of $2\pi$. Replacing $N_{enc}$ by its average value
$\rho A$ shows that, on average, $\Phi^*$ is equal to the AB
phase from an “effective” magnetic field

\[ B^* = B - 2p\rho\phi_0 \qquad [34] \]

The composite fermions thus experience an effective
magnetic field $B^*$ which is much smaller than the
external, applied field $B$. That lies at the heart of the
phenomenology of this lowest Landau level liquid.
One treats composite fermions as noninteracting
in the simplest approximation. They form their own
Landau-like levels in $B^*$. Their filling factor is
defined as $\nu^* = \rho\phi_0/B^*$, with which eqn [34]
becomes equivalent to eqn [32]. The effective field
$B^*$ can be antiparallel to $B$, in which case
$\nu^* = \rho\phi_0/B^*$ is formally negative. For negative
values of $\nu^*$, $\Phi_{\nu^*}$ in eqn [31] is defined as
$\Phi_{-|\nu^*|} = [\Phi_{|\nu^*|}]^*$,
because complex conjugation is equivalent to
switching the direction of the magnetic field.
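
The equivalence between the effective-field relation and the filling-factor relation is a one-line piece of algebra. The worked sketch below uses only the definitions $\nu = \rho\phi_0/B$ and $\nu^* = \rho\phi_0/B^*$; the numerical case at the end is an illustrative choice.

```latex
% Divide B^* = B - 2 p \rho \phi_0 by \rho \phi_0:
\frac{B^*}{\rho\phi_0} = \frac{B}{\rho\phi_0} - 2p
\;\Longrightarrow\;
\frac{1}{\nu^*} = \frac{1}{\nu} - 2p
\;\Longrightarrow\;
\nu = \frac{\nu^*}{2p\,\nu^* + 1}
% Example: \nu = 1/3 with p = 1.
B^* = B\,(1 - 2p\nu) = \frac{B}{3}, \qquad
\nu^* = \frac{\rho\phi_0}{B^*} = 3\nu = 1
```

In this example the electrons at one-third filling map onto composite fermions that exactly fill one Landau-like level in the reduced field.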


Fermion Chern-Simons Theory 


Lopez and Fradkin (1991) developed a field-theoretic
formulation of composite fermions through a singular
gauge transformation defined by

\[ \Psi = \prod_{j<k}\left(\frac{z_j - z_k}{|z_j - z_k|}\right)^{2p}\Psi' \qquad [35] \]

under which the eigenvalue problem of eqn [29]
transforms into

\[ H'\Psi' = E\Psi' \qquad [36] \]

\[ H' = \frac{1}{2m_b}\sum_i\left(\mathbf{p}_i + \frac{e}{c}\mathbf{A}(\mathbf{r}_i) - \frac{e}{c}\mathbf{a}(\mathbf{r}_i)\right)^2 + V \qquad [37] \]

\[ \mathbf{a}(\mathbf{r}_i) = \frac{2p}{2\pi}\,\phi_0\sum_j{}^{'}\,\nabla_i\theta_{ij} \qquad [38] \]

where

\[ \theta_{jk} = i\ln\frac{z_j - z_k}{|z_j - z_k|} \]

is the relative angle between the particles $j$ and $k$. The
magnetic field corresponding to $\mathbf{a}(\mathbf{r}_i)$ is
given by

\[ b_i = \nabla_i\times\mathbf{a}(\mathbf{r}_i) = 2p\phi_0\sum_l{}^{'}\,\delta^2(\mathbf{r}_i - \mathbf{r}_l) \qquad [39] \]

The above transformation thus amounts to attaching
a point flux of strength $-2p\phi_0$ to each electron,
which is how the composite fermion is modeled
in this approach. (A flux quantum is topologically
equivalent to a vortex.) This definition is reminiscent
of the treatments of particles obeying fractional
statistics (“anyons”) introduced by Leinaas and
Myrheim (1977) and Wilczek (1982); an anyon is
modeled as an electron bound to a point flux of
magnitude $\alpha\phi_0$, where $\alpha$ determines the
winding statistics.

It is not possible to proceed further without making
approximations. The usual approach is to make a
“mean-field” approximation, which amounts to
spreading the point flux on each electron into a
uniform magnetic field. Formally, one writes

\[ \mathbf{A} - \mathbf{a} \equiv \mathbf{A}^* + \delta\mathbf{A} \qquad [40] \]

\[ \nabla\times\mathbf{A}^* = B^*\hat{\mathbf{z}} \qquad [41] \]

The transformed Hamiltonian is written as

\[ H' = \frac{1}{2m_b}\sum_i\left(\mathbf{p}_i + \frac{e}{c}\mathbf{A}^*(\mathbf{r}_i)\right)^2 + V + V' \equiv H'_0 + V + V' \qquad [42] \]

$V$ is the Coulomb interaction and $V'$ denotes the
terms containing $\delta\mathbf{A}$. The solution to $H'_0$
is trivial, describing free fermions in an effective
magnetic field $B^*$. We have thus decomposed the
Hamiltonian into a part $H'_0$, which can be solved exactly,
and the rest, $V + V'$, which is to be treated
perturbatively.

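Lopez and Fradkin recast the problem in the
language of functional integrals, which is suitable
for studying corrections to the mean-field theory.
One writes the zero-temperature quantum partition
function

\[ Z = \int \mathcal{D}\psi^*\,\mathcal{D}\psi\,\mathcal{D}a\,\exp\left(\frac{i}{\hbar}S\right) \qquad [43] \]

\[ S = \int d^2r\int dt\,\mathcal{L} \qquad [44] \]

\[ \mathcal{L} = \psi^*(i\hbar\partial_t - a_0)\psi - \frac{1}{2m_b}\left|\left(-i\hbar\nabla + \frac{e}{c}\mathbf{A} - \frac{e}{c}\mathbf{a}\right)\psi\right|^2 + \frac{1}{2p\phi_0}\,a_0\nabla\times\mathbf{a} - \frac{1}{2}\int d^2r'\,\rho(\mathbf{r})V(\mathbf{r} - \mathbf{r}')\rho(\mathbf{r}') \qquad [45] \]

where $\psi$ and $\psi^*$ are anticommuting Grassmann
variables. The flux attachment is introduced
through a Lagrange multiplier $a_0$; because $a_0$ enters
linearly in the action, it can be integrated out to
produce a delta function that imposes the
constraint

\[ \nabla\times\mathbf{a}(\mathbf{r}) = 2p\phi_0\,\rho(\mathbf{r}) = 2p\phi_0\,\psi^*(\mathbf{r})\psi(\mathbf{r}) \qquad [46] \]

This formalism is closely related to the topological
Chern-Simons (CS) field theory. Recall that the CS
Lagrangian has the form

\[ \mathcal{L}_{CS} \sim \epsilon^{\mu\nu\lambda}A_\mu F_{\nu\lambda} = 2\epsilon^{\mu\nu\lambda}A_\mu\partial_\nu A_\lambda \qquad [47] \]

where $F_{\nu\lambda} = \partial_\nu A_\lambda - \partial_\lambda A_\nu$,
and $\epsilon^{\mu\nu\lambda}$ is the antisymmetric
Levi-Civita tensor, with $\epsilon^{012} = 1$. The index
takes values $\mu = 0, 1, 2$, the first being the time
component and the remaining space components.
The CS action is invariant, up to surface terms,
under a gauge transformation, because the change in
$\mathcal{L}_{CS}$ under a functional variation
$\delta A_\mu = \partial_\mu\Lambda$ is a total
derivative.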

Zhang et al. (1989) noted that the term proportional
to $a_0\nabla\times\mathbf{a}$ in eqn [45], which enforces
flux attachment, is precisely equal to the CS Lagrangian
in the Coulomb gauge. Write

\[ \mathcal{L}_{CS} = \frac{1}{4p\phi_0}\,\epsilon^{\mu\nu\lambda}a_\mu\partial_\nu a_\lambda = \frac{1}{2p\phi_0}\,a_0\,\epsilon^{ij}\partial_i a_j - \frac{1}{4p\phi_0}\,\epsilon^{ij}a_i\partial_0 a_j \qquad [48] \]

where $i, j$ represent the spatial components
($i, j = 1, 2$), and the time components have been
displayed explicitly in the second step
($\partial_0 \equiv \partial_t$). The
first term on the right-hand side of eqn [48] is
identical to the third term on the right-hand side of
eqn [45]. In the Fourier space the last term is
proportional to

\[ \epsilon^{ij}a_i(\mathbf{q},\omega)(-i\omega)\,a_j(-\mathbf{q},-\omega) \qquad [49] \]

By choosing the $x$-axis along $\mathbf{q}$, the Coulomb gauge
condition $\mathbf{q}\cdot\mathbf{a} = 0$ implies
$a_1(\mathbf{q},\omega) = 0$, guaranteeing
that the last term in eqn [48] is identically zero.

The constraint of eqn [46] is used to eliminate
the two factors of density in the last term of
eqn [45]. The action is then quadratic in the
fermion field, which can be integrated out. Various
response functions can be expressed as correlation
functions of the vector potential field, and their
averages over the CS field configurations are
evaluated perturbatively by standard diagrammatic
methods.

The fermion CS theory is believed to capture the
topological properties of composite fermions, but
has not lent itself, because of the lack of a small
parameter, to quantitative calculations. It is not
known what classes of Feynman diagrams will need
to be summed to eliminate the electron mass $m_b$
(which is not a parameter of the lowest Landau level
problem; see eqn [30]) in the fermion CS
approach. Halperin et al. (1993) proceeded by
replacing $m_b$ by an adjustable parameter $m^*$,
interpreted as the composite-fermion mass. Murthy
and Shankar (1997) proposed to separate out
the inter- and intra-Landau level degrees of
freedom by making a sequence of further
transformations.


Consequences 
Fractional Quantum Hall Effect 


The CF theory provides a simple understanding of
why gaps open up at “fractional” fillings, which
happens at those fillings $\nu = f$ for which composite
fermions fill integral numbers of CF Landau levels.
That results in Hall plateaus at $R_H = h/fe^2$ in the
presence of disorder. The fractional QHE is thus
understood as the integral QHE for composite
fermions.


Sequences of Fractions 


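The integral fillings of composite fermions correspond
to fractional fillings of electrons given by

\[ f = \frac{|n|}{2p|n| \pm 1} \qquad [50] \]

which are precisely the observed fractions. Some of
these are:

\[ f = \frac{n}{2n+1} = \frac{1}{3}, \frac{2}{5}, \frac{3}{7}, \frac{4}{9}, \ldots \qquad [51] \]

\[ f = \frac{n}{2n-1} = \frac{2}{3}, \frac{3}{5}, \frac{4}{7}, \frac{5}{9}, \ldots \qquad [52] \]

\[ f = \frac{n}{4n+1} = \frac{1}{5}, \frac{2}{9}, \frac{3}{13}, \ldots \qquad [53] \]

\[ f = \frac{n}{4n-1} = \frac{2}{7}, \frac{3}{11}, \frac{4}{15}, \ldots \qquad [54] \]

Particle-hole symmetry in the lowest Landau level
also implies fractions $1 - f$. The fractions appear
in the form of sequences because they are all
derived from the sequence of integers. The Hall
quantization is exact because the right-hand side
of eqn [50] is made up of whole numbers and
therefore is not susceptible to small perturbations
in the Hamiltonian. The CF theory unifies the
FQHEs and IQHEs.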
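
The sequences of fractions can be enumerated mechanically from the relation $\nu = \nu^*/(2p\nu^* + 1)$ by letting $\nu^* = n$ run over positive and negative integers. The Python sketch below does this with exact fractions; the function names and the `sign` parameter (standing for $B^*$ parallel or antiparallel to $B$) are hypothetical.

```python
from fractions import Fraction


def cf_fraction(n, p):
    """Electron filling for composite fermions at nu* = n (n may be negative)."""
    return Fraction(n, 2 * p * n + 1)


def sequence(p, sign, n_max):
    """Fractions for nu* = sign * 1, sign * 2, ..., sign * n_max."""
    return [cf_fraction(sign * n, p) for n in range(1, n_max + 1)]


def particle_hole_conjugate(f):
    """The fraction 1 - f implied by particle-hole symmetry."""
    return 1 - f


positive_branch = sequence(p=1, sign=+1, n_max=4)   # 1/3, 2/5, 3/7, 4/9
negative_branch = sequence(p=1, sign=-1, n_max=4)   # 1, 2/3, 3/5, 4/7
four_vortex = sequence(p=2, sign=+1, n_max=3)       # 1/5, 2/9, 3/13
```

Every fraction generated this way with $p \ge 1$ has an odd denominator, because $2pn + 1$ is odd and reduction to lowest terms cannot introduce a factor of two.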


Fermi Sea at Half Filling 


Equation [50] is consistent with the fact that only
odd-denominator fractions have been observed in
the lowest Landau level (i.e., with $f < 1$). Halperin
et al. (1993) and Kalmeyer and Zhang (1992)
proposed that at the simplest even-denominator
fraction, namely $\nu = 1/2$, composite fermions form
a Fermi sea. This was motivated by the fact that the
effective magnetic field is $B^* = 0$ at $\nu = 1/2$. A
number of experiments have directly measured the
Fermi sea of composite fermions. The TSG effect
with $f = 1/2$ is absent because the Fermi sea has
gapless excitations.


Effective Magnetic Field 


For small values of $B^*$ (i.e., in the vicinity of
$\nu = 1/2$), the cyclotron radius of composite fermions
can be very large compared to the radius of the
cyclotron orbit of a classical electron in $B$. Direct
measurements of the cyclotron orbit in several
geometric experiments have confirmed that the
charge carriers experience a magnetic field $B^*$ rather
than $B$.


Fractional Charge 


Laughlin (1983) showed that the presence of a gap
at a fractional filling implies the existence of
fractionally charged excitations. He obtained an
excitation through the adiabatic insertion of a point
flux quantum at, say, the origin, which can be
gauged away at the end, leaving behind an exact
excited state. Faraday's law implies that the
azimuthal component of the induced electric field is
$E_\phi = -(2\pi r)^{-1}\, d\phi/dt$. The current density then is
$j_r = \sigma_H E_\phi$, where $\sigma_H = f e^2/h$ is the Hall conductivity.
The charge leaving the area defined by a circle of
radius $r$ per unit time is $2\pi r j_r$. The total charge
leaving this area in the adiabatic process then is


$$ Q = \int 2\pi r\, j_r\, dt = -\sigma_H \int d\phi = -fe \qquad [55] $$


The charge excess associated with the excitation is
therefore $fe$. It is in general not an elementary
excitation. For $f = n/(2pn \pm 1)$, it can be shown to
be a collection of $n$ elementary excitations, giving a
charge of $e^* = e/(2pn \pm 1)$ for a single elementary
excitation.
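
The arithmetic in this argument can be checked with a few lines of code. The sketch below assumes the fillings f = n/(2pn ± 1) and measures charge in units of e; the helper names are hypothetical and serve only this illustration.

```python
from fractions import Fraction


def elementary_charge(n: int, p: int = 1, sign: int = +1) -> Fraction:
    """Charge e*/e = 1/(2*p*n + sign) of one elementary excitation."""
    return Fraction(1, 2 * p * n + sign)


def flux_insertion_charge(n: int, p: int = 1, sign: int = +1) -> Fraction:
    """Charge f (in units of e) produced by one flux quantum: n excitations."""
    return n * elementary_charge(n, p, sign)


for n in (1, 2, 3):
    f = flux_insertion_charge(n)
    print(f"n={n}: f = {f}, e*/e = {elementary_charge(n)}")
```

The total charge created by the flux insertion equals the filling f itself, while each of the n elementary excitations carries the smaller charge e*.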


Microscopic Tests 


Exact solutions of the Schrödinger equation can be
obtained, for a finite number of particles, by a brute- 
force diagonalization of the Hamiltonian in the 




lowest LL subspace, which enables a rigorous and 
nontrivial testing of the CF theory. Figure 3 shows 
some typical comparisons, which help test both the 
qualitative and the quantitative aspects of the CF 
theory in a model-independent manner. The low- 
energy spectrum of interacting electrons at B is 
explicitly seen to have a one-to-one correspondence 
to that of weakly interacting electrons at B*. 
Furthermore, there is a remarkably good quantitative
agreement. The predicted energies agree with
the exact energies to better than 0.05%, and the
overlaps of the wave functions of eqn [31]
with the exact eigenfunctions are close to 100%.
Such comparisons are even more convincing in light
of the fact that the wave functions of eqn [31]
do not contain any adjustable parameters for the
states at $\nu$ in eqn [50], because the ground state
wave function and its low-energy excitations at
$\nu^* = n$ are unique and fully known: the former is
the Slater determinant corresponding to $n$ filled
Landau levels, and the latter are the excitons.
The predicted energies are calculated by determining
the expectation values of the full Hamiltonian
of eqn [30] with respect to the wave functions in
eqn [31].


Figure 3 Exact spectra (dashes) for several particle numbers
at $\nu = 1/3$, $2/5$, and $3/7$. Dots show the CF prediction for the
energy, obtained with no adjustable parameters. The electrons
are taken to be confined on the surface of a sphere in the
presence of a radial magnetic field; $L$ is the total orbital angular
momentum, and each dash represents a multiplet of $2L + 1$
degenerate states. The number on top is the dimension of the
Fock space in the corresponding $L$ sector. Reproduced from
Jain JK (2000) The composite fermion: a quantum particle and
its quantum fluids. Physics Today 39(4): 39-42, with permission
from American Institute of Physics.


More Physics 
Spin 


At small Zeeman energies, partially spin-polarized 
or spin-unpolarized FQHE states become possible. 
The TSG effect with spin is well described by a 
generalization of the CF theory. The observed 
fractions are still given by eqn [50], but with 


$$ n = n_\uparrow + n_\downarrow \qquad [56] $$


where $n_\uparrow$ is the number of occupied spin-up Landau-like
CF bands and $n_\downarrow$ is the number of occupied
spin-down Landau-like CF bands. There are in
general several states with different spin polariza- 
tions possible at any given fraction. The observed 
quantum phase transitions as a function of the 
Zeeman energy, which can be changed by increasing 
the parallel component of the magnetic field, are 
consistent with this picture. Direct measurements of
the spin polarization further confirm this, but also
show evidence for certain additional fragile states,
which are presumably caused by the residual 
interaction between composite fermions. 


Bilayers 


It has been proposed that for two parallel 2DES 
planes at small separations and at total filling $\nu = 1$,
neutral interlayer excitons (each exciton made up of 
an electron in one layer and a hole in the other) 
undergo Bose-Einstein condensation, producing a 
true off-diagonal long-range order. Tunneling and 
transport experiments by Eisenstein and collabora- 
tors provide evidence for nontrivial behavior under 
such conditions. The resistivity in the antisymmetric 
channel is very small but does not vanish. 


Pairing 


An even-denominator fraction $f = 5/2$ has been
observed. Writing $5/2 = 2 + 1/2$ and noting that
the lowest LL contributes 2 (counting the spin
degree of freedom), $\nu = 5/2$ corresponds to a filling
of 1/2 in the second Landau level. The most 
promising scenario for the explanation of the 5/2 
effect is that composite fermions form a p-wave 
paired state, which opens up a gap to excitations. 
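This state is believed to be well described by a
Pfaffian wave function proposed by Moore and
Read (1991)


$$ \Psi^{\rm Pf}_{1/2} = {\rm Pf}\left(\frac{1}{z_i - z_j}\right) \prod_{i<j} (z_i - z_j)^2 \exp\left(-\frac{1}{4}\sum_k |z_k|^2\right) \qquad [57] $$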


The Pfaffian of an antisymmetric matrix M is 
defined, apart from an overall factor, as 


$$ {\rm Pf}(M_{ij}) = A(M_{12} M_{34} \cdots M_{N-1,N}) \qquad [58] $$


where A is the antisymmetrization operator. The 
Bardeen-Cooper-Schrieffer wave function 


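$$ \Psi_{\rm BCS} = A[\phi_0(\mathbf{r}_1, \mathbf{r}_2)\,\phi_0(\mathbf{r}_3, \mathbf{r}_4) \cdots \phi_0(\mathbf{r}_{N-1}, \mathbf{r}_N)] \qquad [59] $$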


has the same form as the Pfaffian in eqn [58].
Hence, ${\rm Pf}[1/(z_i - z_j)]$ describes a p-wave pairing of
electrons, and the wave function of eqn [57] is
interpreted as a paired state of
composite fermions carrying two vortices.
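
To make the definition of the Pfaffian concrete, here is a minimal sketch that evaluates it by recursive expansion along the first row, a standard identity equivalent (up to normalization) to the antisymmetrized product in eqn [58]. The function names and the sample coordinates are illustrative assumptions, not taken from the article.

```python
def pfaffian(m):
    """Pfaffian of an antisymmetric matrix given as a list of rows.

    Uses the expansion Pf(M) = sum_j (-1)**(j-1) * M[0][j] * Pf(minor),
    where the minor drops rows and columns 0 and j (0-based indices).
    """
    n = len(m)
    if n == 0:
        return 1
    if n % 2 == 1:
        return 0
    total = 0
    for j in range(1, n):
        keep = [k for k in range(n) if k != 0 and k != j]
        minor = [[m[r][c] for c in keep] for r in keep]
        total += (-1) ** (j - 1) * m[0][j] * pfaffian(minor)
    return total


def pairing_matrix(z):
    """Antisymmetric matrix with entries 1/(z_i - z_j) and zero diagonal."""
    n = len(z)
    return [[0 if i == j else 1 / (z[i] - z[j]) for j in range(n)]
            for i in range(n)]


# Four distinct particle positions in the complex plane (arbitrary values).
z = [0 + 0j, 1 + 0j, 0 + 1j, 2 + 1j]
pf = pfaffian(pairing_matrix(z))

# For N = 4 the Pfaffian is the sum over the three possible pairings.
explicit = (1 / ((z[0] - z[1]) * (z[2] - z[3]))
            - 1 / ((z[0] - z[2]) * (z[1] - z[3]))
            + 1 / ((z[0] - z[3]) * (z[1] - z[2])))

# Exchanging two particles flips the sign, as for any fermionic pair state.
z_swapped = [z[1], z[0], z[2], z[3]]
pf_swapped = pfaffian(pairing_matrix(z_swapped))
print(pf, explicit, pf_swapped)
```

The recursion costs factorial time and is meant only for very small N; it shows that each term pairs up all particles, which is the structure shared with the BCS wave function.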


FQHE of Composite Fermions 


Recently, some fractions other than those in eqn [50]
have been observed, for example, $f = 4/11$ and
$f = 5/13$. These are understood as the delicate
“fractional” QHE of composite fermions at $\nu^* = 1 + 1/3$
and $\nu^* = 1 + 2/3$.
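
The correspondence between these electron fractions and the composite-fermion fillings can be verified directly. The sketch assumes the usual mapping f = ν*/(2pν* + 1) with p = 1, extended from integer to fractional ν*; the function name is hypothetical.

```python
from fractions import Fraction


def electron_filling(nu_star: Fraction, p: int = 1) -> Fraction:
    """Electron filling f = nu*/(2*p*nu* + 1) for CF filling nu*."""
    return nu_star / (2 * p * nu_star + 1)


f_a = electron_filling(1 + Fraction(1, 3))  # nu* = 4/3
f_b = electron_filling(1 + Fraction(2, 3))  # nu* = 5/3
print(f_a, f_b)
```

With ν* = 4/3 and ν* = 5/3 the mapping returns 4/11 and 5/13, the two fractions quoted above.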


TSG Effect in Higher Landau Levels 


The short-range part of the Coulomb interaction 
is less effective in higher Landau levels because 
of the greater spread of the electron wave func- 
tion. As a result, composite fermions are less 
stable, often losing to charge density wave states. 
A few fractions have been observed in the second 
Landau level (1/3, 2/3, 2/5, 1/2) and one (1/3) in 
the third. 


Edge States


There is a gap to excitations in the bulk at the 
magic fillings of eqn [50], but there is no gap at the 
edge of the sample. The dynamics of the low-energy 
edge excitations is formally equivalent to that of a 
chiral one-dimensional Tomonaga-Luttinger liquid. 
Wen (1991) argued that the exponent characterizing 
the long-distance behavior of this liquid is quan- 
tized, fully determined by the filling factor of the 
bulk state. Experimental studies of the tunneling
of an external electron into the edge of an
FQHE system provide evidence for a nontrivial
Tomonaga-Luttinger liquid but do not find the
predicted universal value for the edge exponent.


Free Interfaces and Free Discontinuities: Variational Problems


Introduction


In several models coming from very different
applications, one needs to describe physical
phenomena where the state function may present some
regions of discontinuity. We may think, for instance,
of problems arising in fracture mechanics, where the
function that describes the displacement of the
body has a jump along the fracture; of phase
transitions; or of problems of image reconstruction,
where the function that describes a picture (e.g., the
intensity of black in black-and-white pictures)
naturally has discontinuities along the profiles
of the objects.

The Sobolev space analysis is then no longer
appropriate for this kind of problem, since Sobolev
functions cannot have jump discontinuities along
hypersurfaces, which is exactly what the
models above require. For a rigorous presentation of
variational problems involving functions with
discontinuities, the essential tool is the space BV of
functions with bounded variation. The first ideas
about this space were developed by De Giorgi in the
1950s, in order to provide a variational framework to
study the problems of minimal surfaces, and several
monographs are now available on the subject. We
quote, for instance, the classical volumes of Evans
and Gariepy (1992), Federer (1969), Giusti (1984),
Massari and Miranda (1984), Ziemer (1989), and the
recent book by Ambrosio et al. (2000), where a
systematic presentation is given, also in view of the
applications mentioned above.


The Space BV 


Consider a generic open subset $\Omega$ of $\mathbb{R}^N$, which, for
simplicity, we take bounded and with a Lipschitz
boundary. In the following, we denote by $\mathcal{L}^N(E)$, or
simply $|E|$, the Lebesgue measure of $E$ in $\mathbb{R}^N$, while
$\mathcal{H}^k$ denotes the $k$-dimensional Hausdorff measure.


Definition 1 We say that a function $u \in L^1(\Omega)$ is a
function of bounded variation in $\Omega$ if its
distributional gradient $Du$ is an $\mathbb{R}^N$-valued finite Borel
measure on $\Omega$. In other words, we have


$$ \int_\Omega u\, D_i\phi\, dx = -\int_\Omega \phi\, dD_i u \qquad \forall \phi \in C_c^\infty(\Omega),\ \forall i = 1, \ldots, N \qquad [1] $$


where $D_i u$ are finite Borel measures. The space of all
functions of bounded variation in $\Omega$ is denoted by
$BV(\Omega)$.


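The space $BV(\Omega)$ is clearly a vector space and,
with the norm


$$ \|u\|_{BV(\Omega)} = \|u\|_{L^1(\Omega)} + |Du|(\Omega) \qquad [2] $$


it becomes a Banach space. The total variation
$|Du|(\Omega)$ appearing above is intended as


$$ |Du|(\Omega) = \sup\Big\{ \sum_{i=1}^N \int_\Omega \phi_i\, dD_i u : \phi \in C_c^1(\Omega; \mathbb{R}^N),\ |\phi| \le 1 \Big\} $$

$$ \phantom{|Du|(\Omega)} = \sup\Big\{ -\int_\Omega u\, {\rm div}\,\phi\, dx : \phi \in C_c^1(\Omega; \mathbb{R}^N),\ |\phi| \le 1 \Big\} $$


and is sometimes indicated by $\int_\Omega |Du|$. The space
$BV_{\rm loc}(\Omega)$ is defined in a similar way, requiring that
$u \in BV(\Omega')$ for every $\Omega' \subset\subset \Omega$.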

From the point of view of functional analysis, the
space $BV(\Omega)$ does not enjoy the nice properties of
Sobolev spaces. In particular,


• the Banach space $BV(\Omega)$ is not separable;

• the Banach space $BV(\Omega)$ is not reflexive; and

• the class of smooth functions is not dense in
$BV(\Omega)$ for the norm [2].


These issues explain why the norm [2] is not
very helpful in the study of variational problems
involving the space $BV(\Omega)$. By contrast, the
weak* convergence defined below is much more
suitable for treating minimization problems for integral
functionals.


Definition 2 We say that a sequence $(u_n)$ weakly*
converges in $BV(\Omega)$ to a function $u \in BV(\Omega)$ if $u_n \to u$
strongly in $L^1(\Omega)$ and $Du_n \to Du$ in the weak*
convergence of measures.


The weak* convergence on $BV(\Omega)$ satisfies the
following properties:


• Compactness Every bounded sequence in $BV(\Omega)$
for the norm [2] admits a weakly* convergent
subsequence.

• Lower-semicontinuity The norm [2] is sequentially
lower-semicontinuous with respect to the
weak* convergence.

• Density Every function $u \in BV(\Omega)$ can be
approximated, in the weak* convergence, by a
sequence $(u_n)$ of smooth functions.


The density property above can actually be made
stronger: in fact, the approximation of $u$ by $(u_n)$
holds in the sense that


$$ u_n \to u \ \text{strongly in } L^1(\Omega) $$

$$ Du_n \to Du \ \text{weakly* as measures} $$

$$ |Du_n|(\Omega) \to |Du|(\Omega) $$


Further properties of the space $BV(\Omega)$ concern
the embeddings into Lebesgue spaces, traces,
and Poincaré-type inequalities. More precisely, we
have:


• Embeddings The space $BV(\Omega)$ is embedded
continuously into $L^{N/(N-1)}(\Omega)$ and compactly
into $L^p(\Omega)$ for every $p < N/(N-1)$.

• Traces Every function $u \in BV(\Omega)$ has a boundary
trace which belongs to $L^1(\partial\Omega)$, and the trace
operator from $BV(\Omega)$ into $L^1(\partial\Omega)$ is continuous.

• Poincaré inequalities There exist suitable constants
$c_1$ and $c_2$ such that for every $u \in BV(\Omega)$


$$ \int_\Omega |u|\, dx \le c_1 \Big[ |Du|(\Omega) + \int_{\partial\Omega} |u|\, d\mathcal{H}^{N-1} \Big] $$

$$ \int_\Omega |u - u_\Omega|\, dx \le c_2 |Du|(\Omega) \qquad \Big( \text{where } u_\Omega = \frac{1}{|\Omega|} \int_\Omega u\, dx \Big) $$


Sets of Finite Perimeter


An important class of functions with bounded
variation are those that can be written as $1_E$, the
characteristic function of a set $E$, taking the value 1
on $E$ and 0 elsewhere. This is the natural class where
many phase-transition problems with sharp
interfaces may be framed.


Definition 3 For a measurable set $E \subset \mathbb{R}^N$ the
perimeter of $E$ in $\Omega$ is defined as


$$ {\rm Per}(E, \Omega) = |D1_E|(\Omega) $$


The equality above is intended as ${\rm Per}(E, \Omega) = +\infty$
whenever $1_E \notin BV(\Omega)$. If ${\rm Per}(E, \Omega) < +\infty$ then the set
$E$ is called a set of finite perimeter in $\Omega$.


Note that by the compactness property above for
BV functions, a family of characteristic functions of
sets with equibounded perimeter in a bounded open
set $\Omega$ is weakly* precompact,
and every limit is again the characteristic function
of a set of finite perimeter.

For a set E of finite perimeter in Q, we may define 
the inner normal versor and the reduced boundary 
as follows. 


Definition 4 Let $E$ be a set of finite perimeter in $\Omega$.
We call reduced boundary $\partial^* E$ the set of all points
$x \in \Omega \cap {\rm spt}|D1_E|$ such that the limit


$$ \nu_E(x) = \lim_{\rho \to 0} \frac{D1_E(B_\rho(x))}{|D1_E|(B_\rho(x))} $$


exists and satisfies $|\nu_E(x)| = 1$. The vector $\nu_E(x)$ is
called the generalized inner normal versor to $E$.


In order to link the measure-theoretical objects
introduced above with some structure property of sets
of finite perimeter, we introduce, for every $t \in [0, 1]$
and every measurable set $E \subset \mathbb{R}^N$, the set $E^t$ defined by


$$ E^t = \Big\{ x \in \mathbb{R}^N : \lim_{\rho \to 0} \frac{|E \cap B_\rho(x)|}{|B_\rho(x)|} = t \Big\} $$


For instance, if $E$ is a smooth domain of $\mathbb{R}^N$, $E^1$ is
the interior part of $E$, $E^0$ is its exterior part, while
$E^{1/2}$ is the boundary $\partial E$.

The main properties of the reduced boundary and 
of the generalized inner normal versor are stated in 
the following result. 


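Theorem 5 Let $E$ be a set of finite perimeter in $\Omega$.
Then its reduced boundary $\partial^* E$ coincides $\mathcal{H}^{N-1}$-a.e.
with the set $E^{1/2}$ introduced above, and we
have the equality


$$ {\rm Per}(E, \Omega) = \mathcal{H}^{N-1}(\Omega \cap \partial^* E) = \mathcal{H}^{N-1}(\Omega \cap E^{1/2}) $$


Moreover, the generalized inner normal versor $\nu_E(x)$
exists for $\mathcal{H}^{N-1}$-a.e. $x \in \partial^* E$, and we have


$$ D1_E = \nu_E(x)\, \mathcal{H}^{N-1} \llcorner \partial^* E $$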


Note that the lower-semicontinuity of $|D1_E|(\Omega)$
entails the lower-semicontinuity of $E \mapsto \mathcal{H}^{N-1}(\Omega \cap \partial^* E)$
with respect to the weak* convergence of $1_E$.
As a consequence, we may apply the direct methods
of the calculus of variations to obtain, for example,
existence of minimizers of


$$ \min_E \Big\{ {\rm Per}(E, \mathbb{R}^N) - \int_E g\, dx \Big\} $$


that are sets with prescribed mean curvature $g$. This
lower-semicontinuity property can be further
generalized, for example, as in the following result for
anisotropic perimeters.


Theorem 6 Let $\varphi : S^{N-1} \to \mathbb{R}$ be a Borel function.
The energy


$$ \int_{\Omega \cap \partial^* E} \varphi(\nu_E)\, d\mathcal{H}^{N-1} $$


is lower-semicontinuous with respect to the weak*
convergence of $1_E$ in $BV(\Omega)$ if and only if the
positively one-homogeneous extension of $\varphi$ from
$S^{N-1}$ to $\mathbb{R}^N$ is convex.


This result immediately implies the existence of
solutions of isovolumetric problems of the form


$$ \min\Big\{ \int_{\partial^* E} \varphi(\nu_E)\, d\mathcal{H}^{N-1} : |E| = c \Big\} $$


whose solutions are obtained by suitably scaling the
Wulff shape of $\varphi$.


The Structure of BV Functions 


The simplest situation occurs when $N = 1$ and so $\Omega$ is
an interval of the real line. In this case, decomposing
the derivative $u'$ into positive and negative parts, and
taking their primitives, we obtain that $u \in BV(\Omega)$ if
and only if $u$ is the sum of two bounded monotone
functions (one increasing and one decreasing).
Therefore, in the one-dimensional case, the BV functions
share all the properties of monotone functions.
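
The one-dimensional decomposition can be illustrated on sampled data. The sketch below is a discrete analogue only: it splits the increments of a finite sequence into positive and negative parts and sums them, which mimics taking primitives of the positive and negative parts of u'. The function names are hypothetical.

```python
def jordan_decomposition(samples):
    """Split a sampled function into increasing and decreasing parts.

    Returns (inc, dec) with inc nondecreasing, dec nonincreasing and
    inc[i] + dec[i] == samples[i] for every index i.
    """
    inc = [samples[0]]
    dec = [0.0]
    for prev, cur in zip(samples, samples[1:]):
        step = cur - prev
        inc.append(inc[-1] + max(step, 0.0))  # accumulate upward moves
        dec.append(dec[-1] + min(step, 0.0))  # accumulate downward moves
    return inc, dec


def total_variation(samples):
    """Discrete total variation: sum of absolute increments."""
    return sum(abs(b - a) for a, b in zip(samples, samples[1:]))


u = [0.0, 1.0, 0.5, 2.0, 2.0, -1.0]
inc, dec = jordan_decomposition(u)
print("increasing part:", inc)
print("decreasing part:", dec)
print("total variation:", total_variation(u))
```

The total variation equals the rise of the increasing part plus the fall of the decreasing part, which is why bounded variation and boundedness of the two monotone pieces go together.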

The situation is more delicate when N > 1, for 
which we need the notion of approximate limit. 


Definition 7 Let $u \in BV(\Omega)$. We say that $u$ has the
approximate limit $z$ at $x$ if


$$ \lim_{\rho \to 0} \frac{1}{|B_\rho(x)|} \int_{B_\rho(x)} |u(y) - z|\, dy = 0 $$


The set where no approximate limit exists is called
the approximate discontinuity set, and is denoted by
$S_u$. In a similar way, when $x \in S_u$ we may define the
approximate values $z^+$ and $z^-$, by requiring that


$$ \lim_{\rho \to 0} \frac{1}{|B_\rho^+(x, \nu)|} \int_{B_\rho^+(x, \nu)} |u(y) - z^+|\, dy = 0 $$

$$ \lim_{\rho \to 0} \frac{1}{|B_\rho^-(x, \nu)|} \int_{B_\rho^-(x, \nu)} |u(y) - z^-|\, dy = 0 $$


where


$$ B_\rho^+(x, \nu) = \{ y \in B_\rho(x) : (y - x) \cdot \nu > 0 \} $$

$$ B_\rho^-(x, \nu) = \{ y \in B_\rho(x) : (y - x) \cdot \nu < 0 \} $$


Analogous definitions can be given in the vector-valued
case, when $u \in BV(\Omega; \mathbb{R}^m)$.


The triplet $(z^+, z^-, \nu)$ in Definition 7 is unique up
to interchanging $z^+$ with $z^-$ and changing the sign of $\nu$,
and is denoted by $(u^+(x), u^-(x), \nu_u(x))$.

We are now in a position to describe the structure
of the measure $Du$ when $u \in BV(\Omega)$, or more
generally $u \in BV(\Omega; \mathbb{R}^m)$. We first apply the
Radon-Nikodym theorem to $Du$ and we decompose
it into absolutely continuous and singular parts:
$Du = (Du)^a + (Du)^s$. We denote by $\nabla u$ the density of
the absolutely continuous part, so that we have


$$ Du = \nabla u \cdot \mathcal{L}^N + (Du)^s $$


The singular part $(Du)^s$ can be further decomposed
into an $(N-1)$-dimensional part, concentrated on
the approximate discontinuity set $S_u$, and the
remaining part, which vanishes on all sets with
finite $\mathcal{H}^{N-1}$ measure. More precisely, if
$u \in BV(\Omega; \mathbb{R}^m)$, we have


$$ Du = \nabla u \cdot \mathcal{L}^N + (u^+(x) - u^-(x)) \otimes \nu_u(x)\, \mathcal{H}^{N-1} \llcorner S_u + (Du)^c \qquad [4] $$


the three terms on the right-hand side are mutually 
singular and are, respectively, called the absolutely 
continuous part, the jump part, and the Cantor part 
of the gradient measure Du. 

In the vector-valued case, $Du$ is an $m \times N$ matrix
of finite Borel measures, $\nabla u$ is an $m \times N$ matrix of
functions in $L^1(\Omega)$, and the jump term in [4] is an
$(N-1)$-dimensional measure of rank 1. The structure
of the Cantor part $(Du)^c$ is described by
Alberti's rank-1 theorem (see Alberti (1993)).


Theorem 8 For every $u \in BV(\Omega; \mathbb{R}^m)$ the Cantor
part $(Du)^c$ is a measure with values in the $m \times N$
matrices of rank 1.


Convex Functionals on BV 


Many problems of the calculus of variations deal
with the minimization of energies of the form


$$ F(u) = \int_\Omega f(x, u, Du)\, dx \qquad [5] $$


The direct methods to obtain the existence of at
least one minimizer require some coercivity hypotheses
on $F$, as well as its lower-semicontinuity. This
last issue, already rather delicate when working
in Sobolev spaces (see, e.g., Buttazzo (1989)
and Dacorogna (1989)), presents additional difficulties
when the unknown function $u$ varies in the
space $BV(\Omega)$, due to the fact that $Du$ is a measure,
and the precise meaning of the integral in [5] has to
be clarified.

In this section, we limit ourselves to the
simpler situation of convex functionals, and we also
assume that the integrand $f(x, u, Du)$ depends only
on $x$ and $Du$. It is then convenient to study the
problem in the framework of functionals defined on
the space of finite Borel vector measures $\mathcal{M}(\Omega; \mathbb{R}^k)$.
Let $f : \mathbb{R}^N \times \mathbb{R}^k \to [0, +\infty]$ be a Borel function such
that


• $f$ is lower-semicontinuous, and
• $f(x, \cdot)$ is convex for every $x \in \mathbb{R}^N$.


We denote by $f^\infty(x, z)$ the recession function
associated with $f$, given by


$$ f^\infty(x, z) = \lim_{t \to +\infty} \frac{f(x, z_0 + tz)}{t} $$

where $z_0$ is any point in $\mathbb{R}^k$ such that $f(x, z_0) < +\infty$
(in fact, the definition above is independent of the
choice of $z_0$). Then we may consider the functional


$$ F(\lambda) = \int_\Omega f(x, \lambda^a(x))\, dx + \int_\Omega f^\infty\Big(x, \frac{d\lambda^s}{d|\lambda^s|}\Big)\, d|\lambda^s| \qquad [6] $$


where $\lambda = \lambda^a \cdot dx + \lambda^s$ is the Lebesgue-Nikodym
decomposition of $\lambda$ into absolutely continuous and
singular parts, and the notation $d\lambda^s/d|\lambda^s|$ stands for
the density of $\lambda^s$ with respect to its total variation
$|\lambda^s|$. For simplicity, the last term on the right-hand
side of [6] is often denoted by $\int_\Omega f^\infty(x, \lambda^s)$.
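
As a numerical illustration of the recession function defined above, the sketch below approximates the limit by evaluating the quotient at one large value of t, for an integrand that does not depend on x. The integrand sqrt(1 + z^2) (whose recession function is |z|) and the cutoff t = 1e8 are assumptions chosen for the example, not taken from the text.

```python
import math


def recession(f, z, z0=0.0, t=1e8):
    """Approximate f_inf(z) = lim_{t -> +inf} f(z0 + t*z) / t at finite t."""
    return f(z0 + t * z) / t


def area_integrand(z):
    """Convex integrand with linear growth: sqrt(1 + z^2)."""
    return math.sqrt(1.0 + z * z)


for z in (3.0, -2.0, 0.0):
    print(z, recession(area_integrand, z))

# Changing the base point z0 does not change the limit.
print(recession(area_integrand, 3.0, z0=5.0))
```

For an integrand with superlinear growth, such as z^2, the same quotient grows without bound as t increases, reflecting that the recession function is then +∞ away from the origin.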

For the functional $F$, the following lower-semicontinuity
result holds (see, e.g., Buttazzo
(1989)).


Theorem 9 Under the assumptions above the functional
[6] is sequentially lower-semicontinuous for the
weak* convergence on $\mathcal{M}(\Omega; \mathbb{R}^k)$. Moreover, if

$$ f(x, z) \ge c_0 |z| - a(x) \quad \text{with } c_0 > 0 \text{ and } a \in L^1(\Omega) \qquad [7] $$

then the functional $F$ turns out to be coercive for the
same topology.


From Theorem 9 we deduce immediately a lower-semicontinuity
result for functionals defined on
$BV(\Omega; \mathbb{R}^m)$.


Corollary 10 Under the assumptions above on the
integrand $f$ (with $k = mN$) the functional defined on
$BV(\Omega; \mathbb{R}^m)$ by


$$ F(u) = \int_\Omega f(x, (Du)^a)\, dx + \int_\Omega f^\infty\Big(x, \frac{d(Du)^s}{d|(Du)^s|}\Big)\, d|(Du)^s| \qquad [8] $$

is sequentially lower-semicontinuous for the weak*
convergence. Moreover, under the assumption [7]
the functional $F$ is coercive with respect to the same
topology.


For some extensions of the result above to the case
when $f(x, \cdot)$ is quasiconvex (in the vector-valued
situation $m > 1$), we refer the interested reader to
Fonseca and Müller (1992) and references therein.

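The treatment of boundary data is another difference
between variational problems on Sobolev spaces
and on BV spaces. Due to the fact that the class
$\{u \in BV(\Omega) : u = u_0 \text{ on } \partial\Omega\}$ is not weakly* closed, to
set in a correct way a minimum problem of Dirichlet
type on $BV(\Omega)$ with datum $u_0 \in BV(\mathbb{R}^N)$ it is
convenient to consider a larger domain $\Omega' \supset\supset \Omega$
and, for every $u \in BV(\Omega)$, the extended function


$$ \tilde{u} = \begin{cases} u & \text{on } \Omega \\ u_0 & \text{on } \Omega' \setminus \Omega \end{cases} $$


whose distributional gradient is

$$ D\tilde{u} = Du \llcorner \Omega + Du_0 \llcorner (\Omega' \setminus \bar{\Omega}) + (u_0 - u)\, \nu_\Omega\, \mathcal{H}^{N-1} \llcorner \partial\Omega $$


$\nu_\Omega$ being the exterior normal versor to $\Omega$. We have
then the following functional on $BV(\Omega')$:


$$ \begin{aligned} \tilde{F}(\tilde{u}) &= \int_{\Omega'} f(x, (D\tilde{u})^a)\, dx + \int_{\Omega'} f^\infty(x, (D\tilde{u})^s) \\ &= \int_\Omega f(x, (Du)^a)\, dx + \int_{\Omega' \setminus \bar{\Omega}} f(x, (Du_0)^a)\, dx \\ &\quad + \int_\Omega f^\infty(x, (Du)^s) + \int_{\Omega' \setminus \bar{\Omega}} f^\infty(x, (Du_0)^s) \\ &\quad + \int_{\partial\Omega} f^\infty(x, (u_0 - u)\nu_\Omega)\, d\mathcal{H}^{N-1} \end{aligned} $$

If we drop the constant term


$$ \int_{\Omega' \setminus \bar{\Omega}} f(x, (Du_0)^a)\, dx + \int_{\Omega' \setminus \bar{\Omega}} f^\infty(x, (Du_0)^s) $$


irrelevant for the minimization, we end up with the
functional


$$ F_{u_0}(u) = F(u) + \int_{\partial\Omega} f^\infty(x, (u_0 - u)\nu_\Omega)\, d\mathcal{H}^{N-1} $$


where $F$ is as in [8]. The Dirichlet problem we
consider is then


$$ \min\Big\{ F(u) + \int_{\partial\Omega} f^\infty(x, (u_0 - u)\nu_\Omega)\, d\mathcal{H}^{N-1} : u \in BV(\Omega) \Big\} \qquad [9] $$


For instance, if $f(z) = |z|$, problem [9] becomes


$$ \min\Big\{ \int_\Omega |Du| + \int_{\partial\Omega} |u - u_0|\, d\mathcal{H}^{N-1} : u \in BV(\Omega) \Big\} $$


Under the assumptions considered, the problem
above admits a solution $u \in BV(\Omega)$, but in general
we do not have $u = u_0$ on $\partial\Omega$ in the sense of BV
traces.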


Nonconvex Functionals on BV 


In order to introduce the class of nonconvex
functionals on $BV(\Omega)$, let us denote $v = Du$ so that
every functional $\Phi(v)$ provides an energy $F(u)$. If we
work in the setting of Sobolev spaces, we have
$u \in W^{1,p}(\Omega)$ ($p > 1$), which implies $v \in L^p(\Omega; \mathbb{R}^N)$; now,
it happens that in this case all “interesting”
functionals $\Phi$ are convex. More precisely, it can be
proved that a functional $\Phi : L^p(\Omega; \mathbb{R}^N) \to [0, +\infty]$,
which is


• sequentially lower-semicontinuous for the weak
convergence of $L^p(\Omega; \mathbb{R}^N)$, and

• local on $L^p(\Omega; \mathbb{R}^N)$ in the sense that
$\Phi(v + w) = \Phi(v) + \Phi(w)$ whenever $v \cdot w = 0$ in $\Omega$,


has to be necessarily convex, and of the form

$$ \Phi(v) = \int_\Omega \phi(x, v(x))\, dx $$


for a suitable integrand $\phi$ such that $\phi(x, \cdot)$ is
convex. Then the energies $F(u)$ defined on Sobolev
spaces and obtained by a functional $\Phi(v)$ through
the identification $v = Du$ are necessarily convex.
This is no longer true if $\Phi$ is defined on the space
$\mathcal{M}(\Omega; \mathbb{R}^N)$ of measures, and hence $F$ is defined on
$BV(\Omega)$. The first example of a nonconvex functional
$\Phi$ on $\mathcal{M}(\Omega; \mathbb{R}^N)$ in the literature comes from the
so-called Mumford-Shah model for computer vision
(see below) and is given by


$$ \Phi(\lambda) = \int_\Omega |\lambda^a(x)|^2\, dx + \#(A_\lambda) $$

where $\lambda^a$ is the absolutely continuous part of $\lambda$, $A_\lambda$ is
the set of atoms of $\lambda$, and $\#$ is the counting measure.
The functional $\Phi$ is set equal to $+\infty$ on all measures
$\lambda$ whose singular part $\lambda^s$ is nonatomic. A general
representation result (see Bouchitté and Buttazzo
(1992) and references therein) establishes that a
functional $\Phi : \mathcal{M}(\Omega; \mathbb{R}^N) \to [0, +\infty]$, which is


• sequentially lower-semicontinuous for the weak*
convergence of $\mathcal{M}(\Omega; \mathbb{R}^N)$, and

• local on $\mathcal{M}(\Omega; \mathbb{R}^N)$ in the sense that
$\Phi(\lambda + \nu) = \Phi(\lambda) + \Phi(\nu)$ whenever $\lambda$ and $\nu$ are mutually
singular in $\Omega$,


has to be of the form


$$ \Phi(\lambda) = \int_\Omega \phi(x, \lambda^a)\, d\mu + \int_\Omega \phi^\infty(x, \lambda^c) + \int_\Omega \psi(x, \lambda^\#)\, d\# $$


where $\mu$ is a non-negative measure, $\lambda = \lambda^a \cdot d\mu + \lambda^c + \lambda^\#$
is the decomposition of $\lambda$ into absolutely
continuous, Cantor, and atomic parts, $\phi(x, v)$ is an
integrand convex in $v$, and $\phi^\infty$ is its recession
function. The novelty is now represented by the
integrand $\psi(x, v)$, which has to be subadditive in $v$
and has to satisfy the compatibility condition


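$$ \lim_{t \to +\infty} \frac{\phi(x, tv)}{t} = \lim_{t \to 0^+} \frac{\psi(x, tv)}{t} $$


When $\phi$ has a superlinear growth the condition
above gives that the slope of $\psi(x, \cdot)$ at the origin has
to be infinite. For instance, in the Mumford-Shah
case we have


$$ \phi(x, v) = |v|^2, \qquad \psi(x, v) = \begin{cases} 1 & \text{if } v \ne 0 \\ 0 & \text{if } v = 0 \end{cases} \qquad [10] $$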


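Coming back to the case $u \in BV(\Omega)$, we have the
decomposition (see [4]):


$$ Du = \nabla u \cdot \mathcal{L}^N + (Du)^c + [u]\, \nu_u(x)\, \mathcal{H}^{N-1} \llcorner S_u $$

where we considered, for simplicity, only the scalar
case $m = 1$ and denoted by $[u]$ the jump $u^+ - u^-$.
We have then the functional


$$ F(u) = \int_\Omega \phi(x, \nabla u)\, dx + \int_\Omega \phi^\infty(x, (Du)^c) + \int_{S_u} \psi(x, [u]\nu_u)\, d\mathcal{H}^{N-1} $$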


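For instance, in the homogeneous-isotropic
case, when $\phi(x, v)$ and $\psi(x, v)$ are independent
of $x$ and depend only on $|v|$, the formula above
reduces to


$$ F(u) = \int_\Omega \phi(|\nabla u|)\, dx + \beta\, |(Du)^c|(\Omega) + \int_{S_u} \psi(|[u]|)\, d\mathcal{H}^{N-1} \qquad [11] $$


where $\beta$, $\phi$, $\psi$ satisfy the compatibility condition


$$ \beta = \phi^\infty(1) = \lim_{t \to 0^+} \frac{\psi(t)}{t} \qquad [12] $$
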
In the original Mumford-Shah model for computer
vision, $\Omega$ is a rectangle of the plane, $u_0 : \Omega \to [0, 1]$
represents the gray level of a picture, $c_1$ and $c_2$ are
positive scale and contrast parameters, and the
variational problem under consideration is


$$ \min\Big\{ \int_\Omega |\nabla u|^2\, dx + c_1 \int_\Omega |u - u_0|^2\, dx + c_2\, \mathcal{H}^{N-1}(S_u) : (Du)^c = 0 \Big\} \qquad [13] $$


The solution $u$ then represents the reconstructed
image, whose contours are given by the jump set $S_u$.
We refer to De Giorgi and Ambrosio (1988) and to the
book by Morel and Solimini (1995) for further
details about this model.
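
To give a feeling for how the three terms compete, here is a minimal one-dimensional discrete sketch of a Mumford-Shah-type energy. It assumes the quadratic form of the gradient and fidelity terms, a uniform grid of spacing h, and a jump set given as a set of edge indices; this discretization and all names are illustrative assumptions, not the model as stated by the authors.

```python
def mumford_shah_energy(u, u0, jumps, c1, c2, h=1.0):
    """Discrete 1D energy: gradient + c1 * fidelity + c2 * (number of jumps).

    u, u0 : lists of samples on a uniform grid with spacing h
    jumps : set of edge indices i (between samples i and i+1) declared
            to belong to the jump set; their increments are not penalized
    """
    gradient = sum(((u[i + 1] - u[i]) / h) ** 2 * h
                   for i in range(len(u) - 1) if i not in jumps)
    fidelity = c1 * sum((a - b) ** 2 for a, b in zip(u, u0)) * h
    return gradient + fidelity + c2 * len(jumps)


# A clean step image: the datum jumps from 0 to 1 between samples 2 and 3.
u0 = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]

with_jump = mumford_shah_energy(u0, u0, jumps={2}, c1=1.0, c2=0.5)
without_jump = mumford_shah_energy(u0, u0, jumps=set(), c1=1.0, c2=0.5)
print(with_jump, without_jump)
```

With c2 = 0.5 it is cheaper to pay for one jump than for the unit gradient across the step, so the candidate with the discontinuity has lower energy; a larger c2 reverses the comparison.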

Analogously, in the case of the study of fractures
of an elastic membrane, a problem similar to [13]
provides the vertical displacement $u$ of the membrane,
together with its fracture set $S_u$. We refer to
some recent papers (see Dal Maso and Toader
(2002) and Francfort and Marigo (1998), and
references therein) for a more detailed description
of fracture mechanics problems, even in the more
delicate vectorial setting of elasticity.

Using the functional $F$ in [11] we have the
generalized Mumford-Shah problem,


$$ \min\Big\{ F(u) + c_1 \int_\Omega |u - u_0|^2\, dx : u \in BV(\Omega) \Big\} $$


where $\phi$ is convex, $\psi$ is subadditive, and the
compatibility condition [12] is fulfilled.

If we set $K = S_u$ and assume that it is closed, the
Mumford-Shah problem can be rewritten as


$$ \min\Big\{ \int_{\Omega \setminus K} |\nabla u|^2\, dx + c_1 \int_{\Omega \setminus K} |u - u_0|^2\, dx + c_2\, \mathcal{H}^{N-1}(K \cap \Omega) : K \subset \Omega \text{ closed},\ u \in C^1(\Omega \setminus K) \Big\} $$

and this justifies the name “free discontinuity
problems,” which is often used in this setting.

The regularity properties of optimal pairs $(u, K)$
are far from being fully understood; some partial
results are available but the Mumford-Shah
conjecture:


• in the case $N = 2$, for an optimal pair $(u, K)$ the set
$K$ is locally the finite union of $C^{1,1}$ arcs


remains open. We refer to Ambrosio et al.
(2000) for a list of the regularity results on the
problem above that are known thus far.


Introduction


Free probability is a probability theory adapted to
quantities with the highest degree of noncommutativity.
A basic feature of this is that the definition of
independence is modified in such a way that the freely
independent random variables will not commute in
general. The exploration of this notion of independence,
which was initially motivated by questions
about operator algebras (Voiculescu 1985), has
produced a theory that runs parallel to an unexpectedly
large part of classical probability theory. The
applications of the theory have also gone into
unexpected directions, once it turned out that the
large-N limit of systems of random matrices is a key
asymptotic model in the theory (Voiculescu 1991).
There are several signs, such as the connections to large N
for random matrices and to the combinatorics of
noncrossing partitions (Speicher 1998), which correspond
to certain planar diagrams, that these
connections may go even further, toward the large-N
limit of models in gauge theory.

In this article the noncommutative probability and 
the random matrix angle will be emphasized and 
very little will be said about the operator algebras 
and the combinatorics. After discussing free inde- 
pendence and models based on free products of 
groups and creation and annihilation operators on 
the Boltzmann full Fock space, we continue with the 
semicircle law, which is the substitute for the Gauss 
law in this context, and with the nonlinear free 
harmonic analysis arising from addition and multi- 
plication of free random variables. 

We then devote two longer sections to the 
asymptotic free independence of large random 
matrices and to free entropy, the free probability 
analog of Shannon’s information-theoretic entropy 
for continuous random variables. 


Freeness of Noncommutative 
Random Variables 


Classical probability deals with expectation values
of numerical random variables, that is, with
numerical functions on a space of events and with
their integrals with respect to a probability measure
on the space of events. In noncommutative probability,
the random variables, like quantum-mechanical
quantities, are elements of a noncommutative
algebra $\mathcal{A}$ over $\mathbb{C}$, with unit $1 \in \mathcal{A}$, which is
endowed with a linear expectation functional
$\varphi : \mathcal{A} \to \mathbb{C}$, so that $\varphi(1) = 1$. Frequently, $\mathcal{A}$ is a
*-algebra of operators on some Hilbert space $\mathcal{H}$ and
$\varphi(T) = \langle T\xi, \xi \rangle$ for some unit vector $\xi \in \mathcal{H}$. We call
$(\mathcal{A}, \varphi)$ a noncommutative probability space and the
elements $a \in \mathcal{A}$ noncommutative random variables.
In this section we shall discuss the basics around the
notion of freeness (Voiculescu 1985), which plays
the role of independence in free probability.

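If $\alpha = (a_i)_{i \in I} \subset \mathcal{A}$ is a family of noncommutative
random variables, the role of joint distribution is
played by the collection of noncommutative moments
$\varphi(a_{i_1} \cdots a_{i_n})$. This can also be extended by linearity to a
distribution functional $\mu_\alpha : \mathbb{C}\langle X_i \mid i \in I \rangle \to \mathbb{C}$, where
$\mathbb{C}\langle X_i \mid i \in I \rangle$ is the ring of polynomials in
noncommutative indeterminates $X_i$ ($i \in I$) and


$$ \mu_\alpha(P(X_i \mid i \in I)) = \varphi(P(a_i \mid i \in I)) $$


If $\mathcal{A}$ is a C*-algebra of operators on $\mathcal{H}$, $a = a^* \in \mathcal{A}$
and $\varphi(\cdot) = \langle \cdot\, \xi, \xi \rangle$, the distribution of $a$ can also be
identified with the probability measure $\mu_a$ on $\mathbb{R}$


$$ \mu_a(\omega) = \langle E(\omega; a)\xi, \xi \rangle $$


where $E(\cdot\,; a)$ is the spectral measure of $a$. Indeed,
then


$$ \mu_a(P(X)) = \int P(t)\, d\mu_a(t) $$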


A family $(\mathcal{A}_i)_{i \in I}$, $1 \in \mathcal{A}_i \subset \mathcal{A}$, of subalgebras is
“free” (which is short for freely independent) if


$$ \varphi(a_1 \cdots a_n) = 0 $$


whenever $a_j \in \mathcal{A}_{i_j}$, $1 \le j \le n$, $i_j \ne i_{j+1}$ and $\varphi(a_j) = 0$.
(Here it is only required that consecutive $a_j$'s be in
different $\mathcal{A}_i$'s. Thus, we may have $i_1 = i_3$, provided
$i_1 \ne i_2$.)

A family of sets of random variables $(\omega_i)_{i \in I}$, $\omega_i \subset \mathcal{A}$,
is free if the algebras $\mathcal{A}_i$ generated by $\{1\} \cup \omega_i$ are free
in $(\mathcal{A}, \varphi)$.

Except for rather trivial situations, free random
variables in $(\mathcal{A}, \varphi)$ do not commute.

Note also that, as in the case of classical
independence, if $(\omega_i)_{i \in I}$ are disjoint freely independent
sets of random variables, then, if the distributions
$\mu_{\omega_i}$ ($i \in I$) are given, the distribution $\mu_\omega$ of
$\omega = \bigcup_{i \in I} \omega_i$ is completely determined.


Example 1 Let the group $G$ be the free product of
its subgroups $(G_i)_{i \in I}$, that is, $G$ is generated by these
subgroups and there is no nontrivial relation among
elements of different $G_i$'s. Further, let $\lambda$ be the
regular representation $\lambda(g)e_h = e_{gh}$ of $G$ on the
Hilbert space $\ell^2(G)$ with orthonormal basis $(e_g)_{g \in G}$.
Then, with respect to the expectation functional
$\tau(T) = \langle Te_e, e_e \rangle$ on operators on $\ell^2(G)$, the sets
$(\lambda(G_i))_{i \in I}$ are freely independent.

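Example 2 If $\mathcal{H}$ is a complex Hilbert space, let
$\mathcal{T}\mathcal{H} = \bigoplus_{k \ge 0} \mathcal{H}^{\otimes k}$ denote the full Boltzmann Fock
space, with vacuum vector $1$ so that $\mathcal{H}^{\otimes 0} = \mathbb{C}1$.
If $h \in \mathcal{H}$ and $\xi \in \mathcal{T}\mathcal{H}$, let $l(h)\xi = h \otimes \xi$ denote the
left creation operator and $\varphi(X) = \langle X1, 1 \rangle$ the
vacuum expectation. Then, if the $\mathcal{H}_i$ ($i \in I$)
are pairwise orthogonal subspaces in $\mathcal{H}$, the
*-subalgebras of operators generated by $l(\mathcal{H}_i) \cup l^*(\mathcal{H}_i)$,
indexed by $i \in I$, are freely independent
with respect to $\varphi$.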


Free Independence with Amalgamation 
over a Subalgebra 


The classical notion of conditional independence 
also has a free counterpart based on the notion of 
free independence with amalgamation over a sub- 
algebra. This subject is technically more complicated 
and we will only aim at giving an idea about what 
kind of concepts are involved. 

In the classical context, if $(X, \Sigma, \mu)$ is a probability
space with a $\sigma$-algebra $\Sigma$, then conditional
independence with respect to a $\sigma$-subalgebra of
events, $\Sigma_0 \subset \Sigma$, amounts to replacing in the definition
of independence the expectation functional
(which is the integral with respect to $\mu$) by the
conditional expectation functional
$L^\infty(X, \Sigma, \mu) \to L^\infty(X, \Sigma_0, \mu|_{\Sigma_0})$.

In free probability, one considers an extension of
the theory, from the $(A,\varphi)$ framework to an $(A,\Phi,B)$
framework (Voiculescu 1995), where A is an algebra
with unit over $\mathbb{C}$, $B \ni 1$ is a subalgebra, and
$\Phi: A \to B$ is B-B-bilinear and $\Phi|_B = \mathrm{id}_B$. Then the
definition of B-freeness (or free independence with
amalgamation over B) of a family of subalgebras
$(A_i)_{i\in I}$, $B \subset A_i \subset A$, requires that

\[ \Phi(a_1 \cdots a_n) = 0 \]

whenever $a_j \in A_{i_j}$ $(1 \le j \le n)$, $i_j \ne i_{j+1}$ $(1 \le j < n)$, and $\Phi(a_j) = 0$.

In the case of a unital *-algebra of bounded
operators M with an expectation functional
$\tau(\cdot) = \langle \cdot\,\xi, \xi\rangle$ which is tracial (i.e., $\tau([m_1, m_2]) = 0$ if
$m_1, m_2 \in M$) and given a subalgebra $1 \in N \subset M$, as
in the classical theory, there is a certain canonical
construction in operator algebra theory of a "conditional
expectation" $\Phi: \bar{M} \to \bar{N}$, where $\bar{M}$, $\bar{N}$ are
algebras of operators obtained as completions
of M and N. With this construction, in
the trace-state setting there is complete analogy with
the classical notion of conditional independence.

Several other constructions of free probability
have been extended to the $(A,\Phi,B)$ B-valued
context.

A group-theoretic example similar to Example 1
can be constructed from a group G which is a free
product with amalgamation over a subgroup $H \subset G$
of subgroups $H \subset G_i \subset G$, $i \in I$. Then A is the
algebra constructed from the left-regular representation
of G, whereas B is an algebra constructed from
the left-regular representation of H.


The Semicircle Law 


In free probability the semicircle law appears as the 
limit law in the free central limit theorem 
(Voiculescu 1985). Here is a weak, rather algebraic, 
version of this fact: 

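If $(a_n)_{n\in\mathbb{N}}$ are freely independent in $(A,\varphi)$ and
satisfy the conditions that

\[ \varphi(a_n) = 0 \quad (n \in \mathbb{N}), \]
\[ \lim_{N\to\infty} N^{-1} \sum_{1\le n\le N} \varphi(a_n^2) = 1, \]
\[ \sup_{n\in\mathbb{N}} |\varphi(a_n^k)| = C_k < \infty \quad (k \in \mathbb{N}), \]

then, if $S_N = N^{-1/2} \sum_{1\le n\le N} a_n$, we have the convergence
of moments of the distribution of $S_N$ to the
semicircle distribution

\[ \lim_{N\to\infty} \varphi(S_N^k) = (2\pi)^{-1} \int_{-2}^{2} t^k (4 - t^2)^{1/2}\, dt . \]

Thus, the semicircle law, given by the density
$(2\pi)^{-1}(4 - t^2)^{1/2}$ on $[-2, 2]$, is the free analog of the
(0,1) Gauss law.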
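
The limit moments can be checked numerically. The sketch below is an illustration only (the function names and the midpoint-rule quadrature are choices made here, not part of the text): it integrates $t^k$ against the semicircle density and compares the even moments with the Catalan numbers, which is the standard closed form for them.

```python
import math

def semicircle_moment(k, steps=20000):
    """Approximate (2*pi)^-1 * integral_{-2}^{2} t^k * sqrt(4 - t^2) dt.

    The substitution t = 2*cos(theta) turns the integral into
    (2/pi) * integral_0^pi (2*cos(theta))^k * sin(theta)^2 d(theta),
    which is smooth, so a plain midpoint rule is accurate.
    """
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        total += (2.0 * math.cos(theta)) ** k * math.sin(theta) ** 2
    return 2.0 / math.pi * total * h

def catalan(m):
    """m-th Catalan number, the expected value of the moment of order 2m."""
    return math.comb(2 * m, m) // (m + 1)

for m in range(5):
    print(2 * m, round(semicircle_moment(2 * m), 6), catalan(m))
```

Odd moments vanish by symmetry, so only the even ones are printed.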

Two coincidences involving the semicircle law 
should be noted. 

The field operators $s(h) = 2^{-1}(l(h) + l(h)^*)$ on the
Boltzmann Fock space (Example 2) have semicircle
distributions with respect to the vacuum expectation
$\varphi(\cdot) = \langle \cdot\,1, 1\rangle$. It turns out that this goes farther: if
$H = H_{\mathbb{R}} \otimes_{\mathbb{R}} \mathbb{C}$ is the complexification of a real
Hilbert space, then the map $H_{\mathbb{R}} \ni h \to s(h)$ is the
analog in free probability of the Gaussian process
over the Hilbert space $H_{\mathbb{R}}$ (Voiculescu 1985). It is
often called the semicircular process over $H_{\mathbb{R}}$. This
points to an important connection of free probability
to the full Boltzmann statistics.

The other coincidence is that the semicircle law is 
well known as the Wigner limit distribution of 
eigenvalues of large Gaussian random matrices. As
we shall see, this is a clue to a deep connection of 
free probability to the large-N limit of random 
matrices (Voiculescu 1991). 


Free Convolution Operations 


In classical probability theory, the distribution of 
the sum of two independent random variables is 
computed by the convolution product of their 
distributions. This has a free probability analog. 
If a, b are free random variables in $(A,\varphi)$ with
distributions $\mu_a, \mu_b: \mathbb{C}[X] \to \mathbb{C}$, then the joint
distribution $\mu_{\{a,b\}}$ is completely determined by
$\mu_a$, $\mu_b$ and in particular $\mu_{a+b}$, the distribution of a + b,
also depends only on $\mu_a$, $\mu_b$. It follows that there
is an additive free convolution operation $\boxplus$ on
distributions so that $\mu_a \boxplus \mu_b = \mu_{a+b}$ whenever a, b are
free (Voiculescu 1985). The same can be done with
multiplication replacing addition, and this defines the
multiplicative free convolution operation $\boxtimes$ by
the equation $\mu_a \boxtimes \mu_b = \mu_{ab}$, when a, b are free
(Voiculescu 1985). A slightly surprising feature of $\boxtimes$
is that in spite of noncommutativity of a and b,
the multiplicative operation $\boxtimes$ turns out to be
commutative, which of course is obvious for $\boxplus$.

In the classical context, convolutions are bilinear 
operations which can be computed using integrals. 
The free convolutions are quite nonlinear and their 
computation is via another route, which can also be 
explained by a classical analogy. Classically, the 
logarithm of the Fourier transform linearizes con- 
volution, that is, 


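\[ \log \mathcal{F}(\mu * \nu) = \log \mathcal{F}(\mu) + \log \mathcal{F}(\nu) \]


and we may compute $\mu * \nu$ as the $(\log \mathcal{F})^{-1}$ of
$\log \mathcal{F}(\mu) + \log \mathcal{F}(\nu)$. The linearizing transform for $\boxplus$
is the R-transform (Voiculescu 1986), which is
obtained by the following procedure.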

If $\mu: \mathbb{C}[X] \to \mathbb{C}$ is a distribution, let
$G_\mu(z) = z^{-1} + \sum_{n\ge 1} \mu(X^n) z^{-n-1}$, which, in case $\mu$ is a compactly
supported probability measure on $\mathbb{R}$, is the Laurent
series at $\infty$ of the Cauchy transform

\[ \int \frac{d\mu(t)}{z - t} . \]

From this, one obtains, by inversion at $\infty$, the series
$K_\mu$, so that $G_\mu(K_\mu(z)) = z$, and one defines
$R_\mu(z) = K_\mu(z) - z^{-1}$, which is a power series in z. Then

\[ R_{\mu \boxplus \nu} = R_\mu + R_\nu . \]

In case the distribution corresponds to a measure,
the formal inversion amounts to inverting an
analytic function.
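
On the level of moments, the linearization can be carried out with finite sequences. The sketch below is illustrative and rests on one assumption not spelled out in the text: the coefficients $\kappa_n$ of $R_\mu(z) = \sum_{n\ge 0} \kappa_{n+1} z^n$ are related to the moments by the standard identity $M(z) = C(zM(z))$, where $M(z) = 1 + \sum m_n z^n$ and $C(z) = 1 + \sum \kappa_n z^n$. All function names are hypothetical.

```python
def power_coeffs(series, s, order):
    """Coefficients up to z^order of series(z)**s; series[i] is the z^i coefficient."""
    result = [1] + [0] * order
    for _ in range(s):
        new = [0] * (order + 1)
        for i, a in enumerate(result):
            if a == 0:
                continue
            for j, b in enumerate(series[: order + 1 - i]):
                new[i + j] += a * b
        result = new
    return result

def free_cumulants_from_moments(moments):
    """moments = [m_1, ..., m_N]; returns [kappa_1, ..., kappa_N]."""
    m = [1] + list(moments)
    kappa = []
    for n in range(1, len(moments) + 1):
        # m_n = sum_{s=1}^{n} kappa_s * [z^(n-s)] M(z)^s, solved for kappa_n
        lower = sum(kappa[s - 1] * power_coeffs(m, s, n - s)[n - s]
                    for s in range(1, n))
        kappa.append(m[n] - lower)
    return kappa

def moments_from_free_cumulants(kappa):
    """kappa = [kappa_1, ..., kappa_N]; returns [m_1, ..., m_N]."""
    m = [1]
    for n in range(1, len(kappa) + 1):
        m.append(sum(kappa[s - 1] * power_coeffs(m, s, n - s)[n - s]
                     for s in range(1, n + 1)))
    return m[1:]

def free_additive_convolution(moments_a, moments_b):
    """Moments of the additive free convolution: add the R-transform coefficients."""
    ka = free_cumulants_from_moments(moments_a)
    kb = free_cumulants_from_moments(moments_b)
    return moments_from_free_cumulants([x + y for x, y in zip(ka, kb)])

# Symmetric Bernoulli law (mass 1/2 at -1 and at +1): moments 0, 1, 0, 1, ...
bernoulli = [0, 1, 0, 1, 0, 1]
print(free_additive_convolution(bernoulli, bernoulli))
```

For this input the even moments come out as the central binomial coefficients 2, 6, 20, the moments of the arcsine law on [-2, 2], whereas the classical convolution of the same two laws would give 2, 8, 32.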
For the multiplicative operation $\boxtimes$, it is more
convenient to describe an analog of the Mellin transform,
that is, no logarithm will be taken. This is the
S-transform (Voiculescu 1991), obtained as follows.

If $\mu: \mathbb{C}[X] \to \mathbb{C}$ is a distribution with $\mu(X) \ne 0$,
one forms $\psi_\mu(z) = \sum_{n\ge 1} \mu(X^n) z^n$ and its inverse $\chi_\mu$,
so that $\psi_\mu(\chi_\mu(z)) = z$. Then

\[ S_\mu(z) = z^{-1}(1 + z)\chi_\mu(z) \]

has the property that

\[ S_{\mu \boxtimes \nu} = S_\mu S_\nu . \]


The free central limit theorem can be easily
proved using the R-transform. Another easy application
of the R-transform is to find the free analog of
the Poisson law, that is,

\[ \lim_{n\to\infty} \big((1 - a/n)\delta_0 + (a/n)\delta_1\big)^{\boxplus n} \]

where a > 0. The free Poisson law is

\[ \begin{cases} (1 - a)\delta_0 + \nu & \text{if } 0 < a \le 1 \\ \nu & \text{if } a > 1 \end{cases} \]

where $\nu$ has support $[(1 - a^{1/2})^2, (1 + a^{1/2})^2]$ and
density $(2\pi t)^{-1}(4a - (t - (1 + a))^2)^{1/2}$. This distribution
is well known in random matrix theory as the
Marchenko-Pastur distribution, again a coincidence
pointing to a random matrix theory connection.
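
A quick numerical sanity check of this density is sketched below. It is an illustration only: the helper name and the quadrature are choices made here, and the comparison values $a$ and $a + a^2$ for the first two moments are the standard ones for a law whose free cumulants all equal $a$.

```python
import math

def free_poisson_moment(k, a, steps=20000):
    """k-th moment of the absolutely continuous part nu of the free Poisson law.

    With t = (1 + a) + 2*sqrt(a)*cos(theta) the density
    (2*pi*t)^-1 * sqrt(4a - (t - 1 - a)^2) integrates against t^k as
    (2a/pi) * integral_0^pi t^(k-1) * sin(theta)^2 d(theta).
    """
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        t = (1.0 + a) + 2.0 * math.sqrt(a) * math.cos(theta)
        total += t ** (k - 1) * math.sin(theta) ** 2
    return 2.0 * a / math.pi * total * h

a = 2.0
print(free_poisson_moment(0, a))    # total mass of nu (no atom when a > 1)
print(free_poisson_moment(1, a))    # mean
print(free_poisson_moment(2, a))    # second moment
print(free_poisson_moment(0, 0.5))  # mass of nu when a < 1; the rest sits at 0
```

For a < 1 the computed mass of $\nu$ is a, which is why the atom at 0 carries weight 1 - a.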

Because probability measures on $\mathbb{R}$ are distributions
of self-adjoint operators and a sum of self-adjoint
operators is again such an operator, the
additive free convolution yields an operation on
probability measures on $\mathbb{R}$. Similarly, it can be
shown that $\boxtimes$ gives rise to operations on probability
measures on $\{z \in \mathbb{C} \mid |z| = 1\}$ and on probability
measures on $[0, \infty)$.

With the R-transform machinery at hand, the free
analogs of many of the classical results around
addition of independent random variables have been
developed (we recommend Voiculescu (1998c) for a
survey of these developments). This includes the
classification of infinitely divisible laws (Lévy-Khintchine
type theorem), classification of stable
laws, domains of attraction, and convolution semigroups.
Note that the free laws are rather different
from the classical ones, but the classification results
are quite parallel, that is, the indexing parameters
are almost the same. The situation is similar in the
multiplicative context. As in the classical case, these
results about laws yield in particular processes with
independent increments, which in the free framework
are free increments.

As in the classical setting, also in the free setting,
convolution semigroups are connected to differential
equations. In the additive free case, a semigroup is a
family $(\mu_t)_{t\ge 0}$ of probability measures on $\mathbb{R}$, so that
$\mu_{t+s} = \mu_t \boxplus \mu_s$. If G(t,z) is the Cauchy transform of $\mu_t$
(which is an analytic function on the half-plane
Im z > 0), the equation (Voiculescu 1986) is a
semilinear complex PDE:

\[ \frac{\partial G}{\partial t} + R_{\mu_1}(G) \frac{\partial G}{\partial z} = 0 \]

where $R_{\mu_1}$ is the R-transform of $\mu_1$. In particular,
when $\mu_1$ is the semicircle law, $R_{\mu_1}(z) = \alpha z$, $\alpha > 0$, and
the PDE is a complex Burgers equation in the upper
half-plane.


Noncrossing Partitions 


The series expansion of the R-transform

\[ R_\mu(z) = \sum_{n\ge 0} R_{n+1}(\mu) z^n \]

has as coefficients polynomials $R_n(\mu)$ in the
moments $\mu(X^k)$. More precisely, assigning to $\mu(X^k)$
a degree k, $R_n(\mu)$ is a polynomial of degree n and
$R_n(\mu) - \mu(X^n)$ is a polynomial in $\mu(X^k)$ with k < n.
The linearization property of the R-transform
implies that

\[ R_n(\mu \boxplus \nu) = R_n(\mu) + R_n(\nu) . \]

For classical convolution, polynomials with similar
properties satisfying

\[ C_n(\mu * \nu) = C_n(\mu) + C_n(\nu) \]

are called cumulants and satisfy

\[ \log \mu(e^{zX}) = \sum_{n\ge 1} C_n(\mu) \frac{z^n}{n!} . \]

There are combinatorial formulas involving the
lattice of all partitions of the set $\{1,\ldots,n\}$ which give
the classical cumulants. For free cumulants, like
$R_n(\mu)$ and generalizations of these, there are similar
formulas provided the lattice of all partitions is
replaced by the lattice NC(n) of noncrossing partitions
(Speicher 1998). A partition $\pi = (V_1,\ldots,V_m)$ of
$\{1,\ldots,n\}$ is noncrossing if there are no $a < b < c < d$
so that $\{a, c\} \subset V_k$, $\{b, d\} \subset V_l$ and $k \ne l$.

More generally, a family $R^{(n)}(a_1,\ldots,a_n)$ of free
cumulants, where $a_1,\ldots,a_n$ are in some $(A,\varphi)$, is
defined recursively as follows (Speicher 1998). For
n = 1, one has $R^{(1)}(a) = \varphi(a)$. If $\pi = (V_1,\ldots,V_m) \in NC(n)$,
where $V_k = \{i(1,k) < \cdots < i(n_k,k)\}$, we define

\[ R[\pi](a_1,\ldots,a_n) = \prod_{1\le k\le m} R^{(n_k)}(a_{i(1,k)},\ldots,a_{i(n_k,k)}) . \]

The recurrence relation for cumulants is then

\[ \varphi(a_1 \cdots a_n) = \sum_{\pi\in NC(n)} R[\pi](a_1,\ldots,a_n) . \]

Note that the right-hand side involves only $R^{(k)}$'s
with $k \le n$ and that actually $R^{(n)}$ appears only in
and is equal to $R[(\{1,\ldots,n\})](a_1,\ldots,a_n)$ (the coarsest
partition).
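
The sum over NC(n) can be made concrete by brute-force enumeration. The sketch below is illustrative only (function names are hypothetical, and it treats the single-variable case $a_1 = \cdots = a_n = a$, where $R[\pi]$ depends only on the block sizes): it lists all set partitions, keeps the noncrossing ones according to the definition above, and evaluates the moment formula.

```python
from itertools import combinations

def set_partitions(elements):
    """Yield every partition of the list `elements` as a list of blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partition in set_partitions(rest):
        # put `first` into one of the existing blocks ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or into a new singleton block
        yield [[first]] + partition

def is_noncrossing(partition):
    """False if some a < b < c < d has a, c in one block and b, d in another."""
    for block_1, block_2 in combinations(partition, 2):
        for v, w in ((block_1, block_2), (block_2, block_1)):
            for a, c in combinations(sorted(v), 2):
                for b, d in combinations(sorted(w), 2):
                    if a < b < c < d:
                        return False
    return True

def noncrossing_partitions(n):
    return [p for p in set_partitions(list(range(1, n + 1))) if is_noncrossing(p)]

def moment_from_free_cumulants(n, kappa):
    """phi(a^n) as the sum over NC(n) of the product of kappa[|V|] over blocks V."""
    total = 0
    for partition in noncrossing_partitions(n):
        term = 1
        for block in partition:
            term *= kappa[len(block)]
        total += term
    return total

# Free cumulants of the semicircle law: only the second one is nonzero.
semicircle = {1: 0, 2: 1, 3: 0, 4: 0, 5: 0, 6: 0}
print(len(noncrossing_partitions(4)), moment_from_free_cumulants(4, semicircle))
```

Of the 15 partitions of {1,2,3,4} only {1,3},{2,4} is crossing, so NC(4) has 14 elements; with semicircular cumulants only the noncrossing pairings contribute.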

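A key property of $R^{(n)}(a_1,\ldots,a_n)$ is that if
$\{1,\ldots,n\} = \alpha \sqcup \beta$ (with $\alpha$, $\beta$ nonempty) and $(a_k)_{k\in\alpha}$, $(a_l)_{l\in\beta}$ are freely
independent, then $R^{(n)}(a_1,\ldots,a_n) = 0$.

If $\mu$ is the distribution of $a \in (A,\varphi)$, then the
cumulants $R_n(\mu)$ are given by

\[ R_n(\mu) = R^{(n)}(a,\ldots,a) . \]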

The noncrossing condition on partitions corre- 
sponds to a planarity requirement for diagrams and 
as such is very suggestive of connections to planar 
diagrams occurring in the constant term of large-N 
expansions from random matrix theory and more 
generally gauge theory. 

For more details on the subject of noncrossing 
partitions, we refer the reader to the memoir by 
Speicher (1998). 


Asymptotic Freeness of Random 
Matrices 


The explanation for the coincidences between certain 
laws in free probability and in random matrix theory 
is that freeness occurs asymptotically among random 
matrices in the large-N limit (Voiculescu 1991). 

Random matrices can be put in a noncommutative
probability framework $(A_N, \varphi_N)$, where
$A_N = L^{\infty-0}(\Omega, M_N; d\sigma)$ (the N x N complex matrix-valued
functions on the probability space $(\Omega, d\sigma)$
which are p-integrable for all $p \in [1,\infty)$) and the
expectation functional is

\[ \varphi_N(X) = N^{-1} \int_\Omega \mathrm{Tr}(X(\omega))\, d\sigma(\omega) . \]

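The basic example is provided by an n-tuple of
Gaussian random matrices (Voiculescu 1991). Let

\[ T_j^{(N)} = \big(a_{p,q;j}^{(N)}\big)_{1\le p,q\le N}, \quad 1 \le j \le n, \]

where $a_{p,q;j}^{(N)} = \overline{a_{q,p;j}^{(N)}}$ and the real and imaginary parts of the
entries with $1 \le p \le q \le N$ are centered Gaussian and independent,
with variances of order $N^{-1}$ normalized so that $\varphi_N((T_j^{(N)})^2) \to 1$.
Then $(T_j^{(N)})_{1\le j\le n}$ as $N \to \infty$ converges in noncommutative
distribution to the freely independent n-tuple
$(l(e_j) + l^*(e_j))_{1\le j\le n}$ in the Boltzmann Fock space
context of Example 2 for an orthonormal system
$e_1,\ldots,e_n \in H$, that is, convergence of moments:

\[ \lim_{N\to\infty} \varphi_N\big(T_{i_1}^{(N)} \cdots T_{i_k}^{(N)}\big) = \big\langle (l(e_{i_1}) + l^*(e_{i_1})) \cdots (l(e_{i_k}) + l^*(e_{i_k}))1, 1 \big\rangle . \]

In particular, the limit variables $(l(e_j) + l^*(e_j))_{1\le j\le n}$
are free.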

More generally, asymptotic freeness of variables
or sets of variables in $(A_N, \varphi_N)$ can be defined
without the existence of a limit distribution, that is,
by requiring only that the freeness relations among
noncommutative moments hold asymptotically as
$N \to \infty$.

Note that in these random matrix questions, the
joint classical distribution of an n-tuple of random
matrices $(X_1^{(N)},\ldots,X_n^{(N)})$ in $A_N$ is a probability
measure on $(M_N)^n$ which contains more information
than the collection of noncommutative
moments, which is the distribution of the noncommutative
variables in $(A_N, \varphi_N)$. In particular, for one
random matrix the classical distribution gives the
joint distribution of all entries, whereas the noncommutative
distribution gives information only
about the distribution of eigenvalues.

From the Gaussian n-tuple, using operator techniques,
much more general asymptotic freeness results
have been obtained. For instance (Voiculescu 1998b):

Let $(X_1^{(N)},\ldots,X_m^{(N)}, Y_1^{(N)},\ldots,Y_n^{(N)})$ be (m + n)-tuples
of self-adjoint N x N random matrices with
classical joint distribution $\mu_N$ on $(M_N^{sa})^{m+n}$. Assume
that $\mu_N$ is invariant under the action of the unitary
group U(N) which takes $(X_1,\ldots,X_m, Y_1,\ldots,Y_n)$
into $(X_1,\ldots,X_m, UY_1U^*,\ldots,UY_nU^*)$ and assume
that there is a bound R on the operator norms
$\|X_i^{(N)}\|$ and $\|Y_j^{(N)}\|$ independent of N. Then the
sets $\{X_1^{(N)},\ldots,X_m^{(N)}\}$ and $\{Y_1^{(N)},\ldots,Y_n^{(N)}\}$ are asymptotically
free as $N \to \infty$.

Note that the uniform bound on the operator 
norms can be easily replaced by weaker conditions. 

Once we know that certain random matrices are
asymptotically free and that the large-N limit in
noncommutative distribution exists, the results of
free probability apply. For instance, if $X^{(N)}$ and $Y^{(N)}$
are asymptotically free and have limit distributions
$\mu$ and $\nu$, then the limit distributions of $X^{(N)} + Y^{(N)}$
and of $X^{(N)}Y^{(N)}$ are the free convolutions $\mu \boxplus \nu$ and,
respectively, $\mu \boxtimes \nu$.

Free probability techniques have also been suc- 
cessful in dealing with other questions about the 
asymptotic behavior of random matrices. 

If $T_1^{(N)},\ldots,T_n^{(N)}$ is an n-tuple of i.i.d. Hermitian
Gaussian random matrices, then the uniform operator norms
of polynomials P in noncommutative indeterminates
have the property that

\[ \lim_{N\to\infty} \big\|P(T_1^{(N)},\ldots,T_n^{(N)})\big\| = \big\|P(l(e_1) + l(e_1)^*,\ldots,l(e_n) + l(e_n)^*)\big\| \]

almost surely (Haagerup and Thorbjørnsen).

This result is a far-reaching generalization of the 
results about largest eigenvalues of one Gaussian 
random matrix. The use of operator-valued free 
random variables (with respect to certain subalge- 
bra) was an essential ingredient in the proof. Also, in 
another direction, freeness of operator-valued free 
random variables was used to obtain a free prob- 
ability treatment of Gaussian random band matrices 
and generalizations of these (Shlyakhtenko 1996). 

Finally, quite recently, extensions of the free 
probability framework have appeared which are 
adapted to the study of fluctuations of systems of 
random matrices in the large-N limit. 


Free Entropy 


There are free probability analogs also for information- 
theoretic quantities (Voiculescu 1994, 1998a). 

Let $(f_1,\ldots,f_n)$ be an n-tuple of classical numerical
random variables the joint distribution of which has
density $p(t_1,\ldots,t_n)$ with respect to the n-dimensional
Lebesgue measure $\lambda_n$ on $\mathbb{R}^n$. The entropy quantity
associated by Shannon to $(f_1,\ldots,f_n)$ is

\[ H(f_1,\ldots,f_n) = -\int p \log p \, d\lambda_n . \]

The free analog of $H(f_1,\ldots,f_n)$ is the free entropy
quantity $\chi(X_1,\ldots,X_n)$. Here $X_j = X_j^*$, $1 \le j \le n$, are
noncommutative self-adjoint random variables in
$(M,\tau)$, where M is a *-algebra of bounded operators
on a Hilbert space H. The expectation functional, in
addition to the positivity properties, equivalent to
the requirement that it can be defined by a unit
vector, $\tau(\cdot) = \langle \cdot\,\xi, \xi\rangle$, also has the property of a trace,
$\tau(XY) = \tau(YX)$ for all $X, Y \in M$. For instance, the
noncommutative random variables arising from the
large-N limit of n-tuples of self-adjoint random
matrices live in noncommutative probability frameworks
$(M,\tau)$ of this kind.

There are two approaches to defining free entropy
and, since there are only partial results about the
equivalence of these approaches, the quantities
obtained are denoted by $\chi(X_1,\ldots,X_n)$ (Voiculescu
1994) and $\chi^*(X_1,\ldots,X_n)$ (Voiculescu 1998a). The
quantity $\chi$ is often referred to as the "microstates
free entropy," its definition being inspired by the
Boltzmann formula $S = k \log W$, whereas the other
entropy, sometimes called "microstates-free free
entropy," is obtained via a free probability analog
of the Fisher information (Voiculescu 1998a).

The microstates used to define $\chi$ are matricial and
the reason why this choice produced a quantity with
the right behavior with respect to free independence
can be found in the asymptotic freeness properties of
random matrices.

Given $X_j = X_j^* \in M$, $1 \le j \le n$, and $m \in \mathbb{N}$, $k \in \mathbb{N}$,
$\varepsilon > 0$, the microstates $\Gamma(X_1,\ldots,X_n; m, k, \varepsilon)$ are
n-tuples $(A_1,\ldots,A_n)$ of self-adjoint k x k matrices,
such that, for noncommutative moments of order up
to m, we have

\[ \big|k^{-1} \mathrm{Tr}_k(A_{i_1} \cdots A_{i_p}) - \tau(X_{i_1} \cdots X_{i_p})\big| < \varepsilon \]

where $1 \le p \le m$, $1 \le i_j \le n$, $1 \le j \le p$.
One obtains $\chi(X_1,\ldots,X_n)$ by taking the infimum
over $\varepsilon > 0$ and $m \in \mathbb{N}$ of

\[ \limsup_{k\to\infty} \Big(k^{-2} \log \mathrm{vol}\, \Gamma(\ldots) + \frac{n}{2} \log k\Big) \]

where vol is the volume on $(M_k^{sa})^n$ corresponding to
the Hilbert-Schmidt norm Hilbert space structure
(Voiculescu 1994).

When n = 1, there is a simple formula for $\chi(X)$. If
$\mu$ is the probability measure on $\mathbb{R}$ which represents
the distribution of $X = X^* \in M$ with respect to the
expectation $\tau$, then

\[ \chi(X) = \iint \log|s - t| \, d\mu(s)\, d\mu(t) + C \]

where the exact value of the constant C is
$3/4 + (1/2) \log 2\pi$.

For n > 1 there is no simple formula for
$\chi(X_1,\ldots,X_n)$, but there are several properties
which provide a better understanding of this
quantity.

If the $X_j$ are such that $\chi(X_j) > -\infty$, then

\[ \chi(X_1,\ldots,X_n) = \chi(X_1) + \cdots + \chi(X_n) \]

if and only if $X_1,\ldots,X_n$ are freely independent in
$(M,\tau)$. Clearly, this property of $\chi$ with respect to
free independence is analogous to the property of
$H(f_1,\ldots,f_n)$ with respect to classical independence.

Further, if $F_1,\ldots,F_n$ are power series in n
noncommuting indeterminates, there is a change-of-variable
formula

\[ \chi(F_1(X_1,\ldots,X_n),\ldots,F_n(X_1,\ldots,X_n)) = \log|\det|(\mathcal{J}(F)) + \chi(X_1,\ldots,X_n) \]

involving the Kadison-Fuglede positive determinant
$|\det|$ and a certain noncommutative Jacobian
$\mathcal{J}(F)$, $F = (F_1,\ldots,F_n)$, defined in $M_n \otimes M \otimes M^{op}$,
where $M^{op}$ is the opposite algebra of M. (For
definitions and the many technical conditions under
which this formula holds, see Voiculescu (1994).)

The free entropy $\chi$ also satisfies semicontinuity,
subadditivity, and a semicircular bound (analogous
to the classical Gaussian bound).

An unexpected feature of $\chi$ is a degeneration of
convexity. If the trace state $\tau$ is a convex combination
$\tau = \theta\tau' + (1 - \theta)\tau''$ with $0 < \theta < 1$, where $\tau'$, $\tau''$ are trace states
with $\tau' \ne \tau''$ on the algebra generated by
$X_1,\ldots,X_n$, and n > 1, then

\[ \chi(X_1,\ldots,X_n) = -\infty \]

(for a reference consult the survey Voiculescu
(2002)).

With the free entropy at hand, an important
variational problem can be formulated for the
noncommutative distribution of an n-tuple of self-adjoint
noncommutative random variables
$T_1,\ldots,T_n$ in the tracial context. The quantity to
be maximized is

\[ \chi(T_1,\ldots,T_n) - \tau(P(T_1,\ldots,T_n)) \]

where P is a given self-adjoint polynomial in
noncommutative indeterminates (see Voiculescu
(2002) for comments on this problem). If n = 1,
this is a classical problem for the logarithmic energy

\[ \iint \log|s - t| \, d\mu(s)\, d\mu(t) - \int P(t)\, d\mu(t) \]

where $\mu$ is a probability measure on $\mathbb{R}$.

To explain the second approach, based on Fisher
information, we begin by recalling some facts about
Fisher information in the classical context.

If f is a numerical random variable with distribution
given by the density p(t) on $\mathbb{R}$, then

\[ \mathrm{Fisher}(f) = \int \Big(\frac{p'}{p}\Big)^2 p \, dt = \Big\|\frac{p'}{p}\Big\|^2_{L^2(\mathbb{R},\, p\,dt)} . \]

Here d/dt is the differential operator defined on test
functions in $L^2(\mathbb{R}, p\,dt)$. Then

\[ \frac{p'}{p} = -\Big(\frac{d}{dt}\Big)^{*} 1 . \]

The classical connection to entropy is that the Fisher
information is a derivative of the entropy when the
variable becomes the starting point of a Brownian
motion. This can be written as

\[ \frac{1}{2}\,\mathrm{Fisher}(f) = \frac{d}{dt} H(f + t^{1/2} g)\Big|_{t=0} \]

where g and f are independent and g is (0,1)
Gaussian.

The several-variables version is treated by using 
partial derivatives. 
The analog in free probability of the Fisher
information (Voiculescu 1998a) is obtained by
using the free difference quotient derivations,
which are the appropriate derivations in this
maximally noncommutative setting. On the polynomials
in n noncommutative indeterminates, the
kth partial free difference quotient

\[ \partial_k : \mathbb{C}\langle X_1,\ldots,X_n\rangle \to \mathbb{C}\langle X_1,\ldots,X_n\rangle \otimes \mathbb{C}\langle X_1,\ldots,X_n\rangle \]

is defined on noncommutative monomials by the
formula

\[ \partial_k X_{i_1} \cdots X_{i_p} = \sum_{i_j = k} X_{i_1} \cdots X_{i_{j-1}} \otimes X_{i_{j+1}} \cdots X_{i_p} . \]

If $X_j = X_j^*$, $1 \le j \le n$, are noncommutative random
variables in $(M,\tau)$ which do not satisfy any
nontrivial algebraic relations, to simplify matters we
can assume that M is generated by $X_1,\ldots,X_n$ and
identify M with $\mathbb{C}\langle X_1,\ldots,X_n\rangle$. The trace state $\tau$
gives rise to a scalar product $\langle m_1, m_2\rangle = \tau(m_2^* m_1)$ on
M. Let $L^2(M,\tau)$ denote the Hilbert space obtained
from M. Then, skipping some technicalities, $\partial_k$ will
give rise to a densely defined operator of $L^2(M,\tau)$
into $L^2(M,\tau) \otimes L^2(M,\tau)$. If $1 \otimes 1$ is in the domain of
the adjoints $\partial_k^*$, the free Fisher information of the
n-tuple $X_1,\ldots,X_n$ is defined to be

\[ \Phi^*(X_1,\ldots,X_n) = \sum_{1\le k\le n} \|\partial_k^*(1 \otimes 1)\|^2 . \]

In case $1 \otimes 1$ is not in the domain of some $\partial_k^*$, the
free Fisher information is given the value $\infty$.
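The "microstates-free free entropy" $\chi^*$ is then
defined by

\[ \chi^*(X_1,\ldots,X_n) = \frac{n}{2} \log 2\pi e + \frac{1}{2} \int_0^\infty \Big(\frac{n}{1+t} - \Phi^*(X_1 + t^{1/2}S_1,\ldots,X_n + t^{1/2}S_n)\Big)\, dt \]

where $S_1,\ldots,S_n$ are (0,1)-semicircular and freely
independent and also freely independent of
$\{X_1,\ldots,X_n\}$.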

For n = 1 it is known that $\chi^*(X) = \chi(X)$ and the
free Fisher information is

\[ \Phi^*(X) = \frac{4\pi^2}{3} \int p(t)^3 \, dt \]

if p(t) is the density with respect to the Lebesgue
measure of the distribution of X. The computation
of $\partial^*(1 \otimes 1)$ is possible in the one-variable case and up
to a factor the result is $(Hp)(X)$, where Hp is the
Hilbert transform of p.
Several of the classical inequalities for the Fisher
information have free probability analogs
(Voiculescu 1998a) (Cramér-Rao inequality, Stam
inequality, information-log-Sobolev inequality, and
others).

For n > 1 only $\chi \le \chi^*$, the easier of the inequalities
among $\chi$ and $\chi^*$, has been established (Biane
et al. 2003). This result was obtained based on an
important connection of $\chi$ and $\chi^*$ to large deviations.
The deviations studied are for the noncommutative
distributions of n-tuples of matrices in the case of an
n-tuple of Gaussian random matrices. In this context
$\chi$ is related to the quantity to be estimated and $\chi^*$ is
related to the rate function.

For more details on free entropy, the reader is 
referred to the survey articles by Voiculescu (1998c, 
2002). 


Concluding Comments 


For more details, additional results, and 
bibliography, we refer the reader to the exposi- 
tions in Voiculescu (1998c), Voiculescu et al. 
(1992) and Speicher (1998). To get even more 
detail, the reader may consult, besides the original 
papers of the present author, those of P Biane, 
R Speicher, D Shlyakhtenko, K J Dykema, A Nica, 
U Haagerup, H Bercovici, L Ge, F Radulescu, 
A Guionnet, T Cabanal-Duvillard, M Anshelevich, 
to name a few of the main contributors. 

Also, via random matrices, there are connections
to physics models (especially large-N 2D Yang-Mills
QCD) in work of I M Singer, M Douglas, D Gross,
R Gopakumar, P Zinn-Justin. In a loose sense, one
may view the noncrossing partitions combinatorics
as related to the work on planar diagrams and the
large-N limit of 't Hooft and Brézin-Itzykson-Parisi-Zuber
in the 1970s.


See also: Large Deviations in Equilibrium Statistical 
Mechanics; Large-N and Topological Strings; Random 
Matrix Theory in Physics. 
Introduction


Functional equations have a long and interesting 
history in connection with mathematical physics and 
touch upon many branches of mathematics. They 
have arisen in the context of both classical and 
quantum completely integrable systems in several 
different ways and we shall survey some of these. 

In the great majority of cases functional equations 
appear in the integrable system setting as the result 
of an ansatz: a particular form of a solution is either 
guessed or postulated, the consistency of which 
yields a functional equation. What the ansatz is for 
can vary significantly. As outlined below, amongst 
others, one may postulate algebraic structures in the 
form of the existence of a Lax pair or of conserved 
quantities; in the quantum setting, one may postu- 
late properties of a ground-state wave function or 
the ring of commuting differential operators. 
Appearing in this way, functional equations are 
really just another of the (significant) tools-of-the- 
trade for constructing and discovering new integr- 
able systems. However, as one surveys both the 
functional equations and the functions they describe 
one sees certain common features. The functions are 
most frequently associated with an elliptic curve, a 
genus-1 abelian variety. One can seek to associate 
these to another fundamental ingredient of modern 
integrable systems, the Baker-Akhiezer function. 
Indeed, very few of the ansätze made directly
suggest that the systems being constructed will be 
completely integrable. This very desirable property 
usually is a bonus of the construction and hints of 
more fundamental connections. Another fundamen- 
tal connection we shall mention is that with 
topology. The phase space of a completely integr- 
able system is rather special, admitting (generically) 
a foliation by tori. The functional equations we 
encounter often also characterize the Hirzebruch 
genera associated with the index theorems of known 
elliptic operators. These are typically evaluated by 
Atiyah-Bott fixed-point theorems for circle actions 
on the manifold. A general understanding of the 
various interconnections has yet to be achieved. 

To bring to focus our discussion we shall concen- 
trate on functional equations arising from studying 
systems with an arbitrary number of particles 
(n below). In principle, there could be many different 
interactions between the particles and symmetry will 


be used to limit these. The use of symmetry is a key
ingredient, often implicit, in the various ansätze we
shall describe. For simplicity, we shall most often focus
on the situation where the particles are identical. In
algebraic terms, we focus on the symmetric group $S_n$
and root systems of type $a_n$; generalizations frequently
exist for other root systems and Weyl groups and we
shall simply note this at the outset.


Lax Pairs 


The modern approach to integrable systems is to
utilize a Lax pair, that is, a pair of matrices L, M such
that the zero curvature condition $\dot{L} = [L, M]$ is
equivalent to the equations of motion. By construction,
Lax pairs produce the conserved quantities $\mathrm{tr}\, L^k$.
To establish integrability, one must further show both 
that there are enough functionally independent con- 
served quantities and that these are in involution. 
(R-matrices are the additional ingredient of the 
modern approach to establishing involutivity.) Lax 
pairs can fail on both counts, and so the construction 
of a Lax pair is but the first step in establishing a 
system to be completely integrable. The great merit of 
the modern approach is that it provides a unified 
framework for treating the many disparate completely 
integrable systems known. Unfortunately the construc- 
tion of a Lax pair is often far from straightforward and 
typically hides the “clever tricks" frequently employed 
in establishing integrability. In the present context, we 
shall outline how functional equations have been used 
to construct Lax pairs. The paradigm for this approach 
is the Calogero-Moser system. 
Beginning with the ansatz (for n x n matrices)

\[ L_{jk} = p_j \delta_{jk} + g(1 - \delta_{jk}) A(q_j - q_k) \]
\[ M_{jk} = g\Big[\delta_{jk} \sum_{l\ne j} B(q_j - q_l) - (1 - \delta_{jk}) C(q_j - q_k)\Big] \]

one finds $\dot{L} = [L, M]$ yields the equations of motion
for the Hamiltonian system $(n \ge 3)$

\[ H = \frac{1}{2} \sum_j p_j^2 + g^2 \sum_{j<k} U(q_j - q_k), \qquad U(x) = A(x)A(-x) + \text{const.} \qquad [1] \]

provided $C(x) = -A'(x)$ and that A(x) and B(x)
satisfy the functional equation

\[ A(x + y)[B(x) - B(y)] = A(x)A'(y) - A(y)A'(x) \qquad [2] \]


This is a particular example of a more general
functional equation whose solution will be described
below. For the present, we simply note that for this
system the corresponding potential is the Weierstrass
$\wp$-function, $A(x)A(-x) = \wp(\nu) - \wp(x)$, and the
resulting Hamiltonian system [1] is known as the
Calogero-Moser system. It is completely integrable
though, as already remarked, the ansatz did not
necessitate this. The Lax pair presented here and the
reduction of its consistency to a functional equation
and algebraic constraints follows Calogero (1976) in
which he discovered the elliptic generalization of the
model he had introduced in 1975.
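
The consistency claim can be tested numerically in the simplest degenerate case. The sketch below is an illustration under stated assumptions, not the article's construction: it takes the rational choice A(x) = 1/x, B(x) = 1/x^2, C(x) = -A'(x) = 1/x^2 (so U(x) = A(x)A(-x) = -1/x^2), uses arbitrary sample values for q, p, g, and compares dL/dt computed from Hamilton's equations with the commutator [L, M]. It also evaluates the residual of the functional equation [2] for the rational and for a trigonometric choice A(x) = 1/sin x, B(x) = 1/sin^2 x.

```python
import math

def commutator(X, Y):
    """Matrix commutator [X, Y] = XY - YX for square lists of lists."""
    n = len(X)
    return [[sum(X[j][l] * Y[l][k] - Y[j][l] * X[l][k] for l in range(n))
             for k in range(n)] for j in range(n)]

def lax_pair(q, p, g):
    """L and M of the ansatz with A(x) = 1/x, B(x) = 1/x^2, C(x) = 1/x^2."""
    n = len(q)
    L = [[0.0] * n for _ in range(n)]
    M = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(n):
            if j == k:
                L[j][k] = p[j]
                M[j][k] = g * sum(1.0 / (q[j] - q[l]) ** 2
                                  for l in range(n) if l != j)
            else:
                x = q[j] - q[k]
                L[j][k] = g / x          # g * A(x)
                M[j][k] = -g / x ** 2    # -g * C(x)
    return L, M

def lax_dot(q, p, g):
    """dL/dt along the flow of H = sum p_j^2/2 + g^2 sum_{j<k} U, U(x) = -1/x^2."""
    n = len(q)
    q_dot = list(p)
    p_dot = [-g * g * sum(2.0 / (q[j] - q[k]) ** 3 for k in range(n) if k != j)
             for j in range(n)]
    L_dot = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(n):
            if j == k:
                L_dot[j][k] = p_dot[j]
            else:
                x = q[j] - q[k]
                L_dot[j][k] = -g / x ** 2 * (q_dot[j] - q_dot[k])  # g * A'(x) * dx/dt
    return L_dot

def residual_eq2(A, dA, B, x, y):
    """Left side minus right side of functional equation [2]."""
    return A(x + y) * (B(x) - B(y)) - (A(x) * dA(y) - A(y) * dA(x))

q, p, g = [0.3, 1.1, 2.4], [0.5, -0.2, 0.7], 1.3   # arbitrary sample point
L, M = lax_pair(q, p, g)
LM = commutator(L, M)
L_dot = lax_dot(q, p, g)
max_diff = max(abs(L_dot[j][k] - LM[j][k]) for j in range(3) for k in range(3))

rational = residual_eq2(lambda x: 1 / x, lambda x: -1 / x ** 2,
                        lambda x: 1 / x ** 2, 0.7, 1.9)
trigonometric = residual_eq2(lambda x: 1 / math.sin(x),
                             lambda x: -math.cos(x) / math.sin(x) ** 2,
                             lambda x: 1 / math.sin(x) ** 2, 0.7, 1.9)
print(max_diff, rational, trigonometric)
```

All three printed numbers should be at the level of floating-point rounding. Replacing B by a function that violates [2] makes the off-diagonal entries of the two matrices disagree, which is the sense in which the ansatz reduces to the functional equation.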
A different ansatz for a Lax pair is

\[ L_{jk} = \dot{q}_j \delta_{jk} + (1 - \delta_{jk}) \sqrt{\dot{q}_j \dot{q}_k}\, A(q_j - q_k) \]
\[ M_{jk} = \delta_{jk} \sum_{l\ne j} \dot{q}_l B(q_j - q_l) + (1 - \delta_{jk}) \sqrt{\dot{q}_j \dot{q}_k}\, C(q_j - q_k) \]

Now the consistency of the Lax pair yields equations
of motion of the form

\[ \ddot{q}_j = \sum_{k\ne j} \dot{q}_j \dot{q}_k V(q_j - q_k), \qquad V(x) = \begin{vmatrix} A(x) & A(-x) \\ C(x) & C(-x) \end{vmatrix} = -V(-x) \]

provided $B(x) = B(-x)$, $C(x) = A'(x) - A(x)G(x)$,
where we have defined $G(x) = B(x) + (1/2)V(x)$,
and the functions satisfy the functional equation

\[ A(x + y) = \frac{A(x)C(y) - A(y)C(x)}{G(x) - G(y)} = \begin{vmatrix} A(x) & A(y) \\ C(x) & C(y) \end{vmatrix} \Bigg/ \begin{vmatrix} G(x) & G(y) \\ 1 & 1 \end{vmatrix} \qquad [3] \]

Again we shall briefly defer describing the solution
of this equation and simply note that the general
solution for V(x) is again given in terms of the
Weierstrass $\wp$-function, $V(x) = \wp'(x)/(\wp(\nu) - \wp(x))$,
and that the equations of motion follow from the
Hamiltonian

\[ H = \sum_j e^{p_j} \prod_{k\ne j} \big(\wp(\nu) - \wp(q_j - q_k)\big)^{1/2} \]

This is known as the Ruijsenaars-Schneider model
and it too is completely integrable. The Lax pair
here was constructed by Bruschi and Calogero.

In the two examples of Lax pairs just presented, 
each particle interacts with every other pairwise. By 
modifying the ansatz, it is possible to construct 
models that interact with just their nearest neighbors 
(which include the Toda systems). More generally, 


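an ansatz exists for a Lax pair associated with
equations of motion of the form

\[ \ddot{q}_j = \sum_{k\ne j} (a + b\dot{q}_j)(a + b\dot{q}_k) V_{jk}(q_j - q_k) \qquad [4] \]

which unifies, for example, the Calogero-Moser,
Ruijsenaars-Schneider, and Toda systems. The
functional equations now encountered are typically
(and whenever $b \ne 0$) of the form

\[ \phi_1(x + y) = \begin{vmatrix} \phi_2(x) & \phi_2(y) \\ \phi_3(x) & \phi_3(y) \end{vmatrix} \Bigg/ \begin{vmatrix} \phi_4(x) & \phi_4(y) \\ \phi_5(x) & \phi_5(y) \end{vmatrix} \qquad [5] \]

This functional equation, for five a priori unknown
functions, includes [2] and [3] as special cases.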

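The general analytic solution of [5] is, up to
symmetries, given by

\[ \phi_1(x) = \frac{\Phi(x;\nu_1)}{\Phi(x;\nu_2)}, \qquad \begin{pmatrix} \phi_2(x) \\ \phi_3(x) \end{pmatrix} = \begin{pmatrix} \Phi(x;\nu_1) \\ \Phi'(x;\nu_1) \end{pmatrix}, \qquad \begin{pmatrix} \phi_4(x) \\ \phi_5(x) \end{pmatrix} = \begin{pmatrix} \Phi(x;\nu_2) \\ \Phi'(x;\nu_2) \end{pmatrix} \]

where

\[ \Phi(x;\nu) = \frac{\sigma(\nu - x)}{\sigma(\nu)\sigma(x)}\, e^{\zeta(\nu)x} \qquad [6] \]

Here, $\zeta(x) = \sigma'(x)/\sigma(x)$ is the Weierstrass $\zeta$-function.
The solution of [2] arises as the $\nu_2 \to 0$ limit of [5].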
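
As a sanity check, the stated solution can be tested in the rational degeneration of the Weierstrass functions. This is an assumption made for the sketch, not the general elliptic case: with $g_2 = g_3 = 0$ one has $\sigma(x) = x$ and $\zeta(x) = 1/x$, so $\Phi(x;\nu) = (\nu - x)(\nu x)^{-1} e^{x/\nu}$. The function names and sample points are arbitrary.

```python
import math

def Phi(x, nu):
    """Rational degeneration of Phi(x; nu): sigma(x) = x, zeta(x) = 1/x."""
    return (nu - x) / (nu * x) * math.exp(x / nu)

def dPhi(x, nu):
    """Derivative of Phi(x; nu) with respect to x."""
    return math.exp(x / nu) * (-1.0 / x ** 2 + 1.0 / (nu * x) - 1.0 / nu ** 2)

def det2(a, b, c, d):
    """Determinant of the 2 x 2 matrix with rows (a, b) and (c, d)."""
    return a * d - b * c

def residual_eq5(x, y, nu1, nu2):
    """phi_1(x + y) minus the ratio of determinants in equation [5]."""
    lhs = Phi(x + y, nu1) / Phi(x + y, nu2)
    numerator = det2(Phi(x, nu1), Phi(y, nu1), dPhi(x, nu1), dPhi(y, nu1))
    denominator = det2(Phi(x, nu2), Phi(y, nu2), dPhi(x, nu2), dPhi(y, nu2))
    return lhs - numerator / denominator

print(residual_eq5(0.4, 0.9, 1.7, 2.3))
print(residual_eq5(-0.6, 1.5, 0.8, 3.1))
```

Both residuals should vanish up to rounding. In this degenerate case each determinant equals $\Phi(x+y;\nu)(x^{-2} - y^{-2})$, so the common factor cancels in the ratio; the elliptic case works the same way with $\wp(x) - \wp(y)$ in place of $x^{-2} - y^{-2}$.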
The proof of the general solution just stated
is in fact constructive (Braden and Buchstaber
1997). The parameters appearing in the solution
are determined as follows. Suppose $x_0$ is a generic
point for [5]. Then (for k = 1, 2), we have that

\[ \partial_y \ln \begin{vmatrix} \phi_{2k}(x + x_0) & \phi_{2k}(y + x_0) \\ \phi_{2k+1}(x + x_0) & \phi_{2k+1}(y + x_0) \end{vmatrix} \Bigg|_{y=0} = \zeta(\nu_k) - \zeta(x) - \zeta(\nu_k - x) - \lambda_k = -\frac{1}{x} - \lambda_k + \sum_{l\ge 0} F_l \frac{x^{l+1}}{(l+1)!} \]

The Laurent expansion determines the parameters
$g_2$, $g_3$ (which are the same for both k = 1, 2)
characterizing the elliptic functions of [6] by

\[ g_2 = \frac{5}{3}\big(F_2 + 6F_0^2\big), \qquad g_3 = 6F_0^3 - F_1^2 + \frac{5}{3} F_0 F_2 \]

and the parameters $\nu_k$ via $F_0 = -\wp(\nu_k)$. Here,
$\wp(x) = -\zeta'(x)$ is the Weierstrass elliptic $\wp$-function
with periods $2\omega$, $2\omega'$ that satisfies the differential
equation

\[ \wp'(x)^2 = 4\wp(x)^3 - g_2 \wp(x) - g_3 \]
The constructive nature of the solutions of [5] means
that it is straightforward to construct solutions to
various specializations of the equation such as

\[ \phi_1(x + y) = \phi_4(x)\phi_5(y) + \phi_4(y)\phi_5(x) \qquad [7] \]

(obtained by requiring $\phi_2(x) = \phi_4^2(x)$ and $\phi_3(x) = \phi_5^2(x)$).
More complicated functional equations such as

\[ \psi_1(x + y) = \psi_2(x + y)\phi_2(x)\phi_3(y) + \psi_3(x + y)\phi_4(x)\phi_5(y) \qquad [8] \]

may be solved using the solutions of [5].

Finally, let us note that the general system [4] may
lead to functional equations not just of the form [5],
for example,

\[ \phi_1(x + y)\big(\phi_4(x) - \phi_5(y)\big) = \begin{vmatrix} \phi_2(x) & \phi_2(y) \\ \phi_3(x) & \phi_3(y) \end{vmatrix} \qquad [9] \]

The general analytic solution to [9] has yet to be
determined although particular solutions are known.


As a final example of a functional equation 
coming from an ansatz for a Lax pair, consider 


$$L_{jk} = \sqrt{p_j p_k}\, A(q_j - q_k), \qquad M_{jk} = \sqrt{p_j p_k}\, C(q_j - q_k)$$


where we now assume A(0) and C(0) regular. Then 
the consistency of this Lax pair corresponds to the 
equations of motion for the Hamiltonian 


$$H = \sum_{j,k} p_j p_k\, f(q_j - q_k) \qquad [10]$$


provided f is even and the functional equation 
2A'(x + y) f(x) — fO] 


A 
— A(x + y)lf'(x) — fà») = | » 


Cx) 


A(y) 


mal M 


is satisfied. The Hamiltonian system [10] corre- 
sponds to geodesic motion. Nonanalytic solutions 
are known to the functional equation [11]. 


An Algebraic Ansatz: Conserved 
Quantities 


Another way in which functional equations may 
appear is by making an ansatz for an additional 
conserved quantity beyond the Hamiltonian. For two 
and three particles on the line, Hietarinta derived
functional equations by seeking a second quartic or 
cubic integral (respectively). Here, a key ingredient is 
the assumption of a further invariant polynomial in the 
momenta. Polynomial invariance, together with sym- 
metry, is quite constraining. Consider 


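Theorem 1 Let H and P be the (natural) Hamiltonian
and center of mass momentum

$$H = \frac{1}{2}\sum_{i=1}^{n} p_i^2 + V, \qquad P = \sum_{i=1}^{n} p_i$$

Denote by Q an independent third-order quantity

$$Q = \sum_{i=1}^{n} p_i^3 + \frac{1}{6}\sum_{i \neq j \neq k} d_{ijk}\, p_i p_j p_k + \sum_{i \neq j} d_{ij}\, p_i^2 p_j + \frac{1}{2}\sum_{i,j} a_{ij}\, p_i p_j + \sum_{i} b_i p_i + c$$

If these are $S_n$-invariant and Poisson-commute,

$$\{P, H\} = \{P, Q\} = \{Q, H\} = 0$$

then

$$V = \frac{1}{6}\sum_{i \neq j} \wp(q_i - q_j) + \text{const.}$$

and we have the Calogero-Moser system.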


Here, the symmetric group invariance means that
for any coefficient $a_{ij}(q_1, q_2, \ldots, q_n)$ in the
expansions above, we have
$a_{\sigma(i)\sigma(j)}(q_{\sigma(1)}, q_{\sigma(2)}, \ldots, q_{\sigma(n)}) = a_{ij}(q_1, q_2, \ldots, q_n)$
for all $\sigma \in S_n$. In particular,
$V(q_1, q_2, \ldots, q_n) = V(q_{\sigma(1)}, q_{\sigma(2)}, \ldots, q_{\sigma(n)})$
for all $\sigma \in S_n$. We remark that had we
begun with particles of possibly different
masses, $H = (1/2)\sum_i m_i p_i^2 + V$, the effect of
$S_n$-invariance is to require these masses to
be the same. Thus, we are assuming the $S_n$-invariant
Hamiltonian of the theorem. Finally, by "an
independent third-order quantity" Q, we mean one
functionally independent of H and P and for which
one cannot obtain an invariant of lower degree by
subtracting multiples of $P^3$ and $PH$. We are not
dealing with quadratic conserved quantities here.

The assumed polynomial behavior of the con- 
served quantities means that when calculating 
Poisson brackets, the coefficients of independent 
monomials must vanish. This, together with sym- 
metry, leads to the functional equation 


$$\begin{vmatrix} 1 & 1 & 1 \\ F(x) & F(y) & F(z) \\ F'(x) & F'(y) & F'(z) \end{vmatrix} = 0, \qquad x + y + z = 0 \qquad [12]$$

The result follows in light of


Theorem 2 Let F be a three-times differentiable
function satisfying the functional equation [12]. Up
to the manifest invariance

$$F(x) \to \alpha F(\delta x) + \beta$$

the solutions of [12] are one of $F(x) = \wp(x + d)$,
$F(x) = e^x$ or $F(x) = x$. Here, $\wp$ is the Weierstrass
$\wp$-function and 3d is a lattice point of the $\wp$-function.
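
As a quick numerical sanity check of Theorem 2, the following sketch evaluates the determinant of [12] on the constraint surface x + y + z = 0. The helper name `det12`, the sample points, and the use of 1/x^2 as the rational degeneration of the Weierstrass function are illustrative assumptions made here, not taken from the text; a cubic is included as a control that is expected to fail.

```python
import math

def det12(F, dF, x, y):
    """Determinant of [12] at (x, y, z) with z = -x - y.

    F is the candidate function, dF its derivative (both hypothetical inputs).
    """
    z = -x - y
    # Cofactor expansion along the first row (1, 1, 1)
    return (F(y) * dF(z) - F(z) * dF(y)
            - (F(x) * dF(z) - F(z) * dF(x))
            + F(x) * dF(y) - F(y) * dF(x))

# Candidate solutions named in Theorem 2 (1/x^2 stands in for the elliptic case)
candidates = {
    "exp": (math.exp, math.exp),
    "linear": (lambda t: t, lambda t: 1.0),
    "rational": (lambda t: 1.0 / t**2, lambda t: -2.0 / t**3),
}

points = [(0.3, 0.5), (1.0, 2.0), (-0.7, 1.9)]
residuals = {name: max(abs(det12(F, dF, x, y)) for x, y in points)
             for name, (F, dF) in candidates.items()}

# Control: F(x) = x^3 is not in the solution list, so the determinant is nonzero
control = det12(lambda t: t**3, lambda t: 3.0 * t**2, 1.0, 2.0)
```

The three candidates give residuals at rounding level, while the cubic control does not vanish, which is consistent with the rigidity expressed by the theorem.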


Again we note that the ansatz per se has not 
established complete integrability: the ansatz leads 
us to the Calogero-Moser model whose complete 
integrability must be established by other means. 
This result may be interpreted as a rigidity theorem 
for the $a_n$ Calogero-Moser system and in part
explains this model's ubiquity: demanding a cubic
invariant together with $S_n$-invariance necessitates
the model. A natural generalization is to replace the
$S_n$-invariance with the invariance of a general
Weyl group W and make connection with the 
Calogero-Moser models associated to other root 
systems (Perelomov 1990). 

We shall encounter the functional equation [12] 
again in this survey and now note that this may be 
generalized to 


$$\begin{vmatrix} 1 & 1 & 1 \\ F(x) & G(y) & H(z) \\ F'(x) & G'(y) & H'(z) \end{vmatrix} = 0, \qquad x + y + z = 0 \qquad [13]$$


If F, G, and H are three-times differentiable functions
satisfying the functional equation [13], then,
up to the manifest invariance,

$$F(x) \to \alpha F(\delta x + \gamma_1) + \beta$$
$$G(x) \to \alpha G(\delta x + \gamma_2) + \beta$$
$$H(x) \to \alpha H(\delta x + \gamma_3) + \beta$$

where $\gamma_1 + \gamma_2 + \gamma_3 = 0$, the nonconstant solutions of
[13] are given by $F(x) = G(x) = H(x) = e^x$, $x$, or $\wp(x)$.
If (say) H(z) is a constant then either

1. one of the functions F(x) or G(y) is the same
constant as H(z), in which case the remaining
function is arbitrary, or

2. $F(x) = G(x) = e^x$.


We remark that in fact the exponential and linear 
function solutions satisfy [12] and [13] without the 
constraint x+y+z=0. Further, the theorems 
immediately give the general analytic solutions to
the same functional equations viewed as functions of
a complex variable, showing that the solutions are 
in fact meromorphic. These theorems were estab- 
lished in Braden and Byatt-Smith (1999) where 
earlier results are described. 


Quantum Calogero-Moser Systems 


Quite a bit is known about the quantum general- 
izations of the Calogero-Moser system. The poly- 
nomial and Weyl group W-invariance of the 
classical conserved quantities is replaced by a 
commutative ring R of W-invariant, holomorphic, 
differential operators, whose highest-order terms 
generate W-invariant differential operators with 
constant coefficients. The Poisson bracket is then 
replaced by a commutator of operators. When this 
is done functional equations again ensue and one 
finds that the potential term for the Laplacian H 
(the quantum Hamiltonian) has Calogero—Moser 
potential appropriate to W (Oshima and Sekiguchi 
1995). In this setting, it is known that the 
commutativity of just a few low-order elements of
R dictates the form of the potential and the
commuting algebra (at least for the classical root
systems). In particular, Theorem 1 above is the classical
analog for the $a_n$ root system of a quantum result where
a functional equation equivalent to [12] was obtained
by requiring the commutativity of certain linear,
quadratic, and cubic holomorphic differential operators.
Taniguchi's results (Taniguchi 1997) are also
indicative of the rigidity of these quantum models: if H
is the quantum Hamiltonian just discussed, and $Q_1$, $Q_2$
are holomorphic (but not a priori W-invariant)
differential operators of appropriate degrees for
which $[Q_{1,2}, H] = 0$, then $Q_{1,2} \in R$ and consequently
$[Q_1, Q_2] = 0$.


An Algebraic Ansatz: The Poincaré 
Algebra 


We have earlier encountered the Ruijsenaars-Schneider
models when considering functional
equations ensuing from an ansatz for Lax pairs. These
models were however discovered by another route
(Ruijsenaars and Schneider 1986) in the course
of investigating mechanical models obeying the
Poincaré algebra

$$\{H, B\} = P, \qquad \{P, B\} = H, \qquad \{H, P\} = 0 \qquad [14]$$
Here, H will be the Hamiltonian of the system
generating time translations, P is a space-translation
generator, and B the generator of boosts. Ruijsenaars
and Schneider began with the ansatz


$$H = \sum_{j=1}^{n} \cosh p_j \prod_{k \neq j} f(x_j - x_k)$$

$$P = \sum_{j=1}^{n} \sinh p_j \prod_{k \neq j} f(x_j - x_k)$$

$$B = \sum_{j=1}^{n} x_j$$

With this ansatz and the canonical Poisson bracket
$\{p_i, x_j\} = \delta_{ij}$, the first two Poisson brackets of [14]
involving the boost operator B are automatically
satisfied. The remaining Poisson bracket is then
(with $\partial_j = \partial/\partial x_j$)

$$\{H, P\} = -\frac{1}{2}\sum_{j=1}^{n} \partial_j \prod_{k \neq j} f^2(x_j - x_k)
- \frac{1}{2}\sum_{j \neq k} \cosh(p_j - p_k) \prod_{l \neq j} f(x_j - x_l) \prod_{m \neq k} f(x_k - x_m)
\left( \partial_j \ln f(x_k - x_j) + \partial_k \ln f(x_j - x_k) \right)$$

and for the independent terms proportional to
$\cosh(p_j - p_k)$ to vanish we require that $f'(x)/f(x)$ be
odd. This entails that f(x) is either even or odd
(Ruijsenaars and Schneider assumed the function even)
and in either case $F(x) = f^2(x)$ is even. Supposing that
f(x) is so constrained, then the final Poisson bracket is
equivalent to the functional equation

$$\{H, P\} = 0 \iff \sum_{j=1}^{n} \partial_j \prod_{k \neq j} f^2(x_j - x_k) = 0 \qquad [15]$$


For n = 3, eqn [15] takes precisely the form [12]
with $F(x) = f^2(x)$. From Theorem 2, the even
solutions to this have the form $F(x) = \wp(x) + c$.
This was found by Ruijsenaars and Schneider who
further showed this function satisfies [15] for all n.
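
The statement that the Ruijsenaars-Schneider solution satisfies [15] for every n can be probed numerically. The sketch below assumes the rational degeneration F(x) = c + g^2/x^2 of the elliptic solution (the constants `c`, `g`, the sample positions, and the function names are hypothetical choices for illustration) and expands the derivative in [15] with the product rule.

```python
def lhs_15(F, dF, xs):
    """Left-hand side of [15]: sum_j d/dx_j prod_{k != j} F(x_j - x_k)."""
    n = len(xs)
    total = 0.0
    for j in range(n):
        for k in range(n):
            if k == j:
                continue
            term = dF(xs[j] - xs[k])          # derivative hits the k-th factor
            for l in range(n):
                if l != j and l != k:
                    term *= F(xs[j] - xs[l])  # remaining factors are untouched
            total += term
    return total

# Assumed rational degeneration of F = wp + c (illustrative constants)
c, g = 1.0, 0.7
F = lambda x: c + g * g / (x * x)
dF = lambda x: -2.0 * g * g / x**3

positions = [0.0, 1.0, 2.5, 4.0, 6.5]
residuals = {n: lhs_15(F, dF, positions[:n]) for n in (2, 3, 4, 5)}

# Control: the even function F(x) = x^4 is outside the solution class (n = 3)
control = lhs_15(lambda x: x**4, lambda x: 4.0 * x**3, [0.0, 1.0, 3.0])
```

For the assumed rational function the residuals stay at rounding level for n = 2, ..., 5, whereas the control does not vanish.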


The general solution to [15] has recently been 
established. 


Theorem 3 (Byatt-Smith and Braden 2003). The
general even solution of [15] amongst the class of
meromorphic functions whose only singularities on
the real axis are either a double pole at the origin, or
double poles at np (p real, $n \in \mathbb{Z}$) is:

(i) for all odd n given by the solution of Ruijsenaars
and Schneider while
(ii) for even $n \geq 4$, there are in addition to the
Ruijsenaars-Schneider solutions the following:

$$F_i(z) = \sqrt{(\wp(z) - e_j)(\wp(z) - e_k)} \qquad [16]$$

where i, j, k are a cyclic permutation of 1, 2, 3.


These functions have simple expressions in terms 
of Weierstrass elliptic functions, theta functions, and 
the Jacobi elliptic functions (Whittaker and Watson 
1927). For example, 


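$$F_1(z) = \sqrt{(\wp(z) - e_2)(\wp(z) - e_3)} = \frac{\sigma_2(z)\sigma_3(z)}{\sigma^2(z)}
= \frac{\theta_3(v)\theta_4(v)}{\theta_1^2(v)} \frac{\theta_1'(0)^2}{4\omega^2\theta_3(0)\theta_4(0)} = b\, \frac{\mathrm{dn}(u)}{\mathrm{sn}^2(u)}$$

where $u = \sqrt{e_1 - e_3}\, z$,
$v = z/2\omega$, $b = e_1 - e_3$ with $\omega_1 = \omega$, $\omega_2 = -\omega - \omega'$, and
$\omega_3 = \omega'$. For appropriate ranges of z the solutions
are real. Their degenerations yield all the even
solutions with only a double pole at x = 0 on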
the real axis. These degenerations may in fact 
coincide with the degenerations of the Ruijsenaars- 
Schneider solution. 

Thus far, complete integrability has not been 
mentioned. The models discovered by Ruijsenaars 
and Schneider not only exhibited an action of the 
Poincaré algebra but were completely integrable as 
well. In particular, Ruijsenaars and Schneider 
demonstrated the Poisson commutativity for their 
solutions of the light-cone quantities 


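$$S_{\pm k} = \sum_{\substack{I \subseteq \{1, 2, \ldots, n\} \\ |I| = k}} \exp\Big( \pm \sum_{i \in I} p_i \Big) \prod_{\substack{i \in I \\ j \notin I}} f(x_i - x_j) \qquad [17]$$

Then, $H = (S_1 + S_{-1})/2$ and $P = (S_1 - S_{-1})/2$. (Note
the even/oddness of the functions f(x) means that there
really are only n functionally independent quantities.)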
It is an open problem whether the new solutions [16] 
of Theorem 3 yield integrable systems. We know that 
these new solutions do not always yield Poisson 
commuting quantities using the ansatz of Ruijsenaars 
and Schneider, but as yet one cannot rule out other 
Poisson commuting conserved quantities. 


Quantum Ruijsenaars-Schneider Models 


Ruijsenaars later investigated the quantum version 
of the classical models he and Schneider introduced. 
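From the outset, he sought operator analogs of the
light-cone quantities [17]. He showed that (for
$k = 1, \ldots, n$)

$$\hat S_k = \sum_{\substack{I \subseteq \{1, \ldots, n\} \\ |I| = k}} \prod_{\substack{i \in I \\ j \notin I}} h(x_j - x_i)^{1/2} \exp\Big( -i\beta \sum_{i \in I} \partial_i \Big) \prod_{\substack{i \in I \\ j \notin I}} h(x_i - x_j)^{1/2}$$

pairwise commute if and only if

$$\sum_{\substack{I \subseteq \{1, 2, \ldots, n\} \\ |I| = k}} \Big( \prod_{\substack{i \in I \\ j \notin I}} h(x_j - x_i)\, h(x_i - x_j - i\beta) - \prod_{\substack{i \in I \\ j \notin I}} h(x_i - x_j)\, h(x_j - x_i - i\beta) \Big) = 0 \qquad [18]$$

holds for all k and n > 1. Here, $\beta$ is an arbitrary
positive number and the sum is over all subsets with
k elements. Observe that upon dividing [18] by
$\beta$ and letting $\beta \to 0$ this yields [15] with
$F(x) = h(x)h(-x)$ when k = 1.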

Ruijsenaars found a solution to [18] which has 
subsequently been shown to be unique. The general 
solution of the functional equation [18] analytic in a 
neighborhood of the real axis with either a simple
pole at the origin or an array of such poles at np on
the real axis ($n \in \mathbb{Z}$) is given by

$$h(x) = b\, \frac{\sigma(x + \nu)}{\sigma(x)\sigma(\nu)}\, e^{\alpha x} \qquad [19]$$

This solution is related to the earlier Ruijsenaars-Schneider
solution via

$$\frac{\sigma(x + \nu)\sigma(x - \nu)}{\sigma^2(x)\sigma^2(\nu)} = \wp(\nu) - \wp(x)$$


Geometric Ansatz 


We have already encountered the Hamiltonian 
system [10] corresponding to geodesic motion 
while discussing Lax pairs. We shall now consider 
various ansatze with a geometric flavor and their 
attendant functional equations. 

It is known that the Ruijsenaars-Schneider model 
has the Calogero-Moser system as a scaling limit. 
Other scaling limits also exist for the Ruijsenaars- 
Schneider model. In particular, we may consider one 
in which the Poincaré algebra scales to either the 
Galilean algebra or a central extension of the 
Galilean algebra. 


Similar to our analysis of the Poincaré algebra, we
find that the functions

$$H = \frac{1}{2}\sum_{j=1}^{n} p_j^2 \prod_{k \neq j} f(x_j - x_k), \qquad
P = \sum_{j=1}^{n} p_j \prod_{k \neq j} f(x_j - x_k), \qquad B = \sum_{j=1}^{n} x_j$$

obey the algebra

$$\{H, B\} = P, \qquad \{P, B\} = \lambda, \qquad \{H, P\} = 0 \qquad [20]$$

if and only if f(x) is either an even or odd function
satisfying

$$\sum_{j=1}^{n} \prod_{k \neq j} f(x_j - x_k) = \lambda \qquad [21]$$

where $\lambda$ is a constant. When $\lambda = 0$ this is the
Galilean algebra, while $\lambda \neq 0$ is a central extension
of the Galilean algebra. Again we are considering
models of the form $H = (1/2)\sum_{j=1}^{n} g^{jj} p_j^2$ and so
dealing with diagonal metrics. We note that if [21]
holds for n = 3 then it holds for all n; and if it
holds for n = 4 then it holds for all "even" n. This
type of behavior was already encountered in
Theorem 3.

Some particular solutions of [21] are known
although the general solution is not known as yet.
The odd functions $f(x) = 1/x$ ($\lambda = 0$), $\coth(x)$ ($\lambda = 1$
for n odd and $\lambda = 0$ for n even), $\sqrt{\wp(x) - e_\alpha}$ ($\lambda = 0$)
yield solutions for example. Interestingly, in the 
case of an even number of particles, particular cases 
of the elliptic Ruijsenaars-Schneider model are in 
this list. 
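
The particular solutions of [21] quoted above are easy to test numerically. The following sketch (function names and sample positions are illustrative assumptions) evaluates the left-hand side of [21] for f(x) = 1/x and f(x) = coth(x) with n = 2, ..., 5 particles.

```python
import math

def galilean_sum(f, xs):
    """Left-hand side of [21]: sum_j prod_{k != j} f(x_j - x_k)."""
    total = 0.0
    for j, xj in enumerate(xs):
        prod = 1.0
        for k, xk in enumerate(xs):
            if k != j:
                prod *= f(xj - xk)
        total += prod
    return total

def coth(x):
    return math.cosh(x) / math.sinh(x)

positions = [0.1, 0.9, 1.7, 2.2, 3.4]   # arbitrary distinct sample points

rational = {n: galilean_sum(lambda x: 1.0 / x, positions[:n]) for n in (2, 3, 4, 5)}
hyperbolic = {n: galilean_sum(coth, positions[:n]) for n in (2, 3, 4, 5)}
```

With these inputs the rational sums vanish for every n, while the hyperbolic sums alternate between 0 (n even) and 1 (n odd), matching the values of the constant stated in the text.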

Diagonal metrics arise in many settings in integr- 
able systems. By taking the ansatz 


$$ds^2 = \sum_{i=1}^{n} \Big( \prod_{j \neq i} \Psi(x^i - x^j) \Big) (dx^i)^2$$


we may construct and solve a functional equation to 

show that the potentially nonvanishing curvature 
i i A 1 

components Rigs Ri (R # i,j), and Ri; have 


d R^ = Ri, =O (k#i,j) if and only if 
W(x) — a(e?^* — 1)* or ax’. We may set a=1 by 
rescaling x. 

2. Ri, ;三 (~ 1)"b* when — 

3j. RE. = 0 when W(x) = 


jij 

Thus, V(x)—x yields a solution of the Lamé 
equations. These metrics are of Stäckel form. The 
rational degenerations of the Galilean models above 
are given by this theorem. They may be understood 


as a parabolic limit of Jacobi elliptic coordinates. 


(e?bx =I. 


Similar techniques may be applied to the more 
general metric 


d? = > c IT ve- 2 (dx 
]£i 

to show that Ri —R;,—0(kz ij) if and only 
if W(x) = o(e?"* — 1)* or ax’. 


Ground-State Factorization 


Some years ago, Sutherland and Calogero considered
the problem as to when the ground-state wave
function of a one-dimensional n-body Schrödinger
equation with pairwise interactions would factorize.
Thus, the problem is to determine those potentials 
v(x) for which 


and where 


$$\Psi(x_1, x_2, \ldots, x_n) = \prod_{i<j} \psi(x_i - x_j)$$

It is convenient to set

$$\psi(x_i - x_j) = \exp\Big( \int^{x_i - x_j} f(x)\, dx \Big)$$


Substitution now shows 


p? n P 
rp -bIr = yt 2. fes E xj 
+ X [f(e — x)f (xi — xk) 
i<j<k 


= flap fox) 
+ f (xj — xp) f (xi — x)| E 


Comparison with [22] shows that this may be 
expressed in terms of two-body potentials if and 
only if we have the functional equation 


$$f(a)f(-b) - f(a)f(c) + f(c)f(-b) = G_1(a) + G_1(c) + G_2(-b), \qquad a + b + c = 0 \qquad [23]$$


Now [23] is not quite the functional equation 
studied by Sutherland and Calogero. On physical 
grounds, Sutherland implicitly, and Calogero explicitly,
made the "assumption" that f is an odd
function. This ensured that the potential was even
and so bounded from below; equally it may be
imposed so that $\psi(x_i - x_j) = \psi(x_j - x_i)$ and the
ground state describes bosons. With this assumption,
one arrives at the functional equation of
Sutherland:

$$f(a)f(b) + f(b)f(c) + f(c)f(a) = G(a) + G(b) + G(c) \qquad [24]$$

Actually the assumption of f being odd is unnecessary.
One can show that there is a bijection between
analytic solutions of [23] and analytic solutions of
[24] for which f'(x) is even. Requiring a
potential of the stated form then necessitates f
being odd. Either way, we arrive at the functional
equation [24]. This is connected with [12] by


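Lemma 4 If a + b + c = 0, then

$$\begin{vmatrix} f'(a) & f'(b) & f'(c) \\ f''(a) & f''(b) & f''(c) \\ 1 & 1 & 1 \end{vmatrix} = 0 \qquad [25]$$

$$\iff \big( f(a) + f(b) + f(c) \big)^2 = F(a) + F(b) + F(c) \qquad [26]$$

$$\iff f(a)f(b) + f(b)f(c) + f(c)f(a) = G(a) + G(b) + G(c) \qquad [27]$$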


Now, we may use Theorem 2 to determine those 
potentials with factorizable ground-state wave functions.
We remark that the $\delta$-function potential $a\delta(x)$
of many-body quantum mechanics on the line, 
which also has a factorizable ground-state wave 
function, can be viewed as the a 一 0 limit of 
—b/osinh^(-x/a + zi/3) with maa=6b. Thus, all 
of the known quantum mechanical problems with 
factorizable ground-state wave function are included 
in [12]. 


Baker-Akhiezer Functions


Baker-Akhiezer functions are one of the foundations
of the algebro-geometric or finite-gap integration of 
integrable systems. These functions may be viewed 
as an extension of the exponential function to curves 
of arbitrary genus g. They have essential singula- 
rities at various points on the curve and a prescribed 
asymptotic expansion at these points. The functions 
may be described in terms of theta functions on the 
Jacobian of the curve, and suitable meromorphic 
differentials on the curve. The functions [6] and [19] 
may be viewed as the Baker-Akhiezer function for a 
genus-1 curve. Now, just as the exponential function 
satisfies Cauchy's functional equation one may ask 
what functional equations (if any) characterize the 
Baker-Akhiezer function. This is an area of research
still ongoing. Theta functions of a general abelian
variety are known to satisfy addition formulas with
$N = 2^g$ terms. It appears Baker-Akhiezer functions
satisfy a similar functional equation with far fewer
terms. Such a characterization of Baker-Akhiezer
functions, if found, will provide an analogous 
answer to that of the Riemann-Schottky problem 
which seeks to describe the Jacobians of curves 
amongst general abelian varieties. 

The functional equations [5], and after suitably
symmetrizing [9], are particular cases of the functional
equation

$$\sum_{i=0}^{N} \phi_{3i}(x+y) \begin{vmatrix} \phi_{3i+1}(x) & \phi_{3i+1}(y) \\ \phi_{3i+2}(x) & \phi_{3i+2}(y) \end{vmatrix} = 0 \qquad [28]$$

with N = 1 in the former case and N = 2 in the
latter. In the case $\phi_{3i+2} = \phi'_{3i+1}$, these may be viewed
as differentiated forms of

$$\sum_{i=0}^{N} \phi_{3i}(x+y)\,\phi_{3i+1}(x)\,\phi_{3i+2}(y) = 1 \qquad [29]$$


For N = 0, this is Cauchy's equation characterizing
the exponential function and for N = 2 it is
equivalent to [8]. For N = 1 and N = 2, Buchstaber
and Krichever have shown that “all” the solutions to 
this equation are the Baker-Akhiezer functions 
corresponding to algebraic curves of genus 1 and 
2, respectively. In general, the Baker-Akhiezer 
functions for a genus-g curve are known to satisfy 
[29] for N =g. Thus, many of the equations we have 
encountered are related to Baker-Akhiezer func- 
tions. Dubrovin, Fokas, and Santini have shown that 
Baker-Akhiezer functions for a genus-g curve are 
related to the functional equation 


& 


q(x, y)q(y. z) 
= r(x, y) — r( y) + SL Cy)p (x,z) 
q(x, z) à 2. _— 


Multivariable generalizations of [29] have been sought 
as a means of characterizing Baker-Akhiezer functions 
but such a characterization remains unproved as yet. 


Topology 


Several of the functional equations we have encoun- 
tered also arise in topology, where the German and 
Russian schools have powerfully applied functional 
equations to formal group laws and genera. It is still 
unclear whether these common threads form part of 
a greater fabric. A genus is a ring homomorphism 


$$\varphi : \Omega \otimes \mathbb{Q} \to R, \qquad \varphi(1) = 1$$

where $\Omega$ is the cobordism ring and R an integral
domain over $\mathbb{Q}$. To each even power series Q(x) with
Q(0) = 1, one can associate a genus $\varphi_Q$ and vice versa
(Hirzebruch et al. 1992). Defining the odd power
series f(x) = x/Q(x) with first term 1 and coefficients
in R, the inverse function $g = f^{-1}$ is such that

$$g'(y) = \sum_{n=0}^{\infty} \varphi_Q(\mathbb{CP}^n)\, y^n$$

The genus corresponding to Q(x) = x/tanh(x) is
known as the L-genus; it takes the value 1 on every
even complex projective space. The genus corresponding
to Q(x) = (x/2)/sinh(x/2) is known as the
$\hat{A}$-genus. The so-called (string-inspired) Witten or
elliptic genus corresponds to $Q(x) = x/\sigma(x)$. Certain
genera may be associated with the index of natural 
differential operators on the manifold. Thus, the 
signature of M, sign(M), is given in terms of the de 
Rham differential d and its adjoint d*, 


$$\operatorname{ind}(d + d^*) = \operatorname{sign}(M) = \Big( \prod_{i} \frac{x_i}{\tanh(x_i)} \Big)[M]$$

with variants for the $\hat{A}$-genus and elliptic genus.
Further, when a compact topological group acts on 
the manifold, Atiyah and Bott showed how these 
indices may be determined from the fixed point sets 
of the action. 

Now, functional equations arise naturally in this 
context when seeking genera with special properties. 
Novikov’s school has shown, for example, that the 
genera associated with the index theorems of known 
elliptic operators arise as solutions of functional 
equations which are particular examples of [5]. 
Similarly, one may seek the following property of
a genus $\varphi$: for the fiber bundle $p : E \to B$ with
smooth fiber F and base B, one has that

$$\varphi(E) = \varphi(F) \cdot \varphi(B)$$


Such a genus is said to be strictly multiplicative. It 
may be shown that a genus is strictly multiplicative 
in bundles with fiber $\mathbb{CP}^{n-1}$ if and only if


n 1 
ll re iti 


which is essentially [21]. Following the remarks of
that equation, a genus $\varphi$ is strictly multiplicative for
all fiber bundles with fibers $\mathbb{CP}^{n-1}$ if and only if it is
strictly multiplicative for all fiber bundles with fiber
$\mathbb{CP}^2$, in which case the genus is the L-genus. If, on the
other hand, we only demand strict multiplicativity for
all fiber bundles with fibers $\mathbb{CP}^{2n-1}$, then this is
equivalent to requiring it to hold for all fiber bundles
with fiber $\mathbb{CP}^3$, in which case the genus is an elliptic
genus. That the same functional equations arise in
both the integrable systems and topological settings 
may reflect something deeper. String theory physics, 
for example, allows some topology changes such as 
flops, and physical quantities such as the partition 
function should reflect this invariance; invariance 
under classical flops characterizes the elliptic genus. 
In addition, connections have been made between the 
complex cobordism ring and conformal field theory. 


Other Areas 


The constraints placed on this review have meant that 
several further applications of functional equations 
and integrable systems can only be noted. Using an 
ansatz together with functional equations, Wojciechowski
gives an analog of the Bäcklund transformation
for integrable many-body systems. Similarly,
Inozemtsev constructs generalizations of the 
Calogero—Moser models, while this route was used to 
construct new solutions to the Witten—Dijkgraaf- 
Verlinde-Verlinde (WDVV) equations by Braden, 
Marshakov, Mironov, and Morozov. In the quantum 
regime, Gutkin derived and solved several functional 
relations by requiring a nondiffractive potential, while 
functional equations have been used to construct 
R-operators, solutions of the quantum Yang-Baxter 
equation on a function space. 


See also: Calogero—Moser—Sutherland Systems of 
Nonrelativistic and Relativistic Type; Classical r-matrices, 
Lie Bialgebras, and Poisson Lie Groups; Cohomology 
Theories; Eigenfunctions of Quantum Completely 
Integrable Systems; Integrability and Quantum Field 
Theory; Integrable Systems and Algebraic Geometry; 
Integrable Systems: Overview; Lie Groups: General 
Theory; Quantum Calogero-Moser Systems; Toda 
Lattices; WDVV Equations and Frobenius Manifolds. 
The Domain of Integration


Functional integration is integration over function 
spaces, that is, the variable of integration is a 
function f with values in a D-dimensional manifold: 


$$f : U \to M^D \qquad [1]$$


Generically a space of functions is an infinite- 
dimensional space. Our understanding of infinite- 
dimensional spaces has progressed significantly 
during the twentieth century, and we can formulate 
functional integration in its proper setting. 

Let F be the domain of integration, and f € F 
the variable of integration. If the domain of f is 
a subset U of R, the functional integral is called 
a path integral; if U is of dimension higher than 1 
(e.g., spacetime), F is often called a space of 
histories. 

The information necessary for defining a domain 
of integration includes 


• the domain and the range of the variable of
integration f,

• the analytical properties of f, and

• possibly additional information, such as requirements
on the values of f on its boundary.


Examples of variables of integration f in [1] 


The domain U of f may be a time interval, a scale
range, or any parameter. The range $M^D$ of f may be
a group manifold, a Riemannian manifold, a
symplectic manifold, a multiply-connected space,
etc., or simply $\mathbb{R}^D$. The domain of integration F
may be a space of pointed paths, for example,

$$x : T \to M^D, \qquad T = [t_a, t_b] \qquad [2]$$
$$x(t_a) = x_a \in M^D \quad \text{for all } x \in F$$

The paths x may be continuous (e.g., Brownian
paths), or may have square integrable derivatives;
F is then an $L^{2,1}$ space (e.g., quantum physics):

$$\int_T dt\, |\dot x(t)|^2 < \infty, \qquad x \in F \qquad [3]$$
T 


Given a domain of integration F, one needs to 
select a volume element appropriate to F. This is a 
challenge which has been met in a number of cases 


(Cartier and DeWitt-Morette 2006). Examples are 
given below. Given a volume element, one can then 
characterize the functionals F on F integrable with 
respect to the chosen volume element. 


Two Basic Techniques 


The two most useful techniques for computing 
integrals are change of variable of integration and 
integration by parts. They follow from fundamental 
properties that apply to functional integrals as well 
as to ordinary integrals. Let us recall them in the 
context of ordinary integrals. 

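Let f and g be functions on $\mathbb{R}$ of compact support.
Let I stand for integration

$$I(f) = \int_{\mathbb{R}} dx\, f(x), \qquad x \in \mathbb{R}$$

and D for derivation of f with respect to x,

$$Df(x) = \frac{d}{dx} f(x)$$

The fundamental rule

$$DI = 0 \implies 0 = \frac{d}{dy} \int_{\mathbb{R}} dx\, f(x + y) \qquad [4]$$

expresses that the functional I(f) is invariant under a change of
variable of integration.
Another fundamental rule is ID = 0:

$$ID = 0 \implies 0 = \int d\big( f(x) g(x) \big) = \int df(x) \cdot g(x) + \int dg(x) \cdot f(x) \qquad [5]$$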
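
The two rules can be illustrated for ordinary integrals with a short numerical experiment. The sketch below is only an illustration under stated assumptions: the smooth bump functions, the trapezoid quadrature, and the central-difference derivative are choices made here, not part of the text.

```python
import math

def bump(x, center=0.0, width=1.0):
    """Smooth function of compact support on (center - width, center + width)."""
    u = (x - center) / width
    if abs(u) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - u * u))

def integrate(func, a=-4.0, b=4.0, steps=8000):
    """Trapezoid rule; the interval contains the supports used below."""
    h = (b - a) / steps
    s = 0.5 * (func(a) + func(b))
    for i in range(1, steps):
        s += func(a + i * h)
    return s * h

def derivative(func, x, eps=1e-5):
    """Central-difference approximation of the derivation D."""
    return (func(x + eps) - func(x - eps)) / (2.0 * eps)

f = lambda x: bump(x, 0.0, 1.0)
g = lambda x: bump(x, 0.4, 1.5)

# Rule DI = 0: the integral does not change under the shift x -> x + y
shift_invariance = integrate(lambda x: f(x + 0.7)) - integrate(f)

# Rule ID = 0: the integral of a derivative vanishes, hence integration by parts
boundary_term = integrate(lambda x: derivative(lambda t: f(t) * g(t), x))
by_parts_gap = (integrate(lambda x: derivative(f, x) * g(x))
                + integrate(lambda x: f(x) * derivative(g, x)))
```

All three quantities come out at the level of rounding error, which is the finite-dimensional content of [4] and [5].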


The fundamental rules [4] and [5] apply to 
functional integration. The derivation D can be 
either a functional derivative or a Lie derivative 
defined as follows. Let K be the reals $\mathbb{R}$ or the
complex numbers $\mathbb{C}$, and let f be a differentiable functional on a
Banach space X

$$f : U \subset X \to K \qquad [6]$$

The functional derivative $Df|_{x_0}$ of f at $x_0$ is defined
by the equation

$$f(x_0 + h) - f(x_0) = Df|_{x_0} \cdot h + R(h) \qquad [7]$$

with the norm $\|R(h)\|$ of order less than the
norm $\|h\|$.

The Lie derivative $L_V$ along the vector field V is
conceptually intuitive and of practical interest: an
infinite-dimensional space X of paths x is not an
intuitive concept, but a one-parameter family of
paths $\{x(\alpha)\} \in X$, with $\alpha \in [0, 1]$, is a convenient
tool for dealing with X:

$$x(\alpha) : T \to M^D, \qquad x(\alpha, t) := (x(\alpha))(t) \in M^D \qquad [8]$$

Set $x(0) = x_0$. A differentiable family $\{x(\alpha)\}$ defines a
vector field $V(x_0)$ along the path $x_0$ (see Figure 1):

$$V(x_0) := \frac{d x(\alpha)}{d\alpha} \bigg|_{\alpha = 0} \qquad [9]$$

Figure 1 A one-parameter family of paths with fixed endpoints.


The functional vector field V on the tangent bundle
of X defines a group of transformations on X, and a
Lie derivative $L_V$ of tensor fields on X.

The Lie derivative $L_V$ obeys the Cartan (Élie and
Henri) equation

$$L_V = d\, i_V + i_V\, d \qquad [10]$$

where d is the exterior differential and $i_V$ is the
interior product, defined as usual on Banach spaces.


Remark (Berezin integrals). To show the power of 
the rules [4] and [5], we can mention that they 
provide Berezin rules of integration over Grassmann 
variables (Cartier and DeWitt-Morette 2006). 


Path Integrals and Quantum Dynamics 


The history of path integrals in quantum physics did 
not begin with the definitions of domain of integra- 
tion, volume elements, etc. It began with the Ph.D. 
thesis of R P Feynman in 1942. Feynman expressed 
the time evolution of a system as the limit $N = \infty$ of
the following N-tuple integral:

$$\langle x_t | x_0 \rangle = \lim_{N = \infty} \int \langle x_t | x_N \rangle\, dx_N\, \langle x_N | x_{N-1} \rangle\, dx_{N-1} \cdots \langle x_2 | x_1 \rangle\, dx_1\, \langle x_1 | x_0 \rangle \qquad [11]$$

where the time interval $T = [t_0, t]$ has been replaced
by N of its points $\{t_i\}$, $1 \leq i \leq N$:

$$t_0 < t_1 < \cdots < t_N < t \qquad [12]$$

and the path $x : T \to \mathbb{R}^D$ is replaced by N of its values

$$x_i = x(t_i) \qquad [13]$$

Dirac (1933) had shown that $\langle x_t | x_0 \rangle$ defines the
exponential of a quantum function $S_Q$ by

$$\exp\big( i S_Q(x_t, x_0, t)/\hbar \big) := \langle x_t | x_0 \rangle \qquad [14]$$

such that the real part of $S_Q$ is the classical action
function (a.k.a. Hamilton's principal function;
further studies have shown that the correct statement
is: the real part is the classical action, up to
order $\hbar$), and the imaginary part of $S_Q$ is of order $\hbar$,
the normalized Planck constant

$$\hbar = h/2\pi \qquad [15]$$


Feynman remarked that for a system with
Lagrangian L the short-time probability amplitude
$\langle x_{t+\delta t} | x_t \rangle$ is "often equal to

$$A \exp\Big( \frac{i\, \delta t}{\hbar}\, L\Big( x_t, \frac{x_{t+\delta t} - x_t}{\delta t} \Big) \Big) \qquad [16]$$

within a normalizing constant A as the limit $\delta t$
approaches zero." The absolute value of A can be
obtained from a unitary requirement (Morette 1951).

Feynman expressed the finite probability amplitude
as a path integral, limit of the discretized
expression [11]

$$\langle x_t | x_0 \rangle = \int \mathcal{D}x\, \exp\big( i S(x)/\hbar \big) \qquad [17]$$

where S(x) is the action functional

$$S(x) = \int ds\, L\big( x(s), \dot x(s) \big) \qquad [18]$$

The undefined symbol $\mathcal{D}x$ is a "volume element" on
the space of paths, corresponding to the infinite
product of the normalization constants A.

The issues raised by the path integral [17] are

• the definition of the volume element $\mathcal{D}x$; and
• a method for computing [17] for a given action
functional S.

The explicit calculation of the limit [11] of an
N-tuple integral when $N = \infty$ is a Herculean task of
very limited use. But two other methods of wide
application, leaving the volume element $\mathcal{D}x$ as a
heuristic symbol, have vindicated the power of
functional integration: the diagram technique and
the semiclassical expansions.
Feynman devised practical rules for computing
asymptotic expansions of path integrals, order by
order in perturbation theory. The rules are depicted
by graphs, known as the Feynman diagrams
('t Hooft and Veltman 1973). Feynman's first
explicit nontrivial calculation was the Lamb shift. 
It earned him the Nobel prize in 1965 (Feynman 
1966). The diagram technique is widely used in 
quantum mechanics and quantum field theory. The 
time ordering provided by the time parameter in 
quantum mechanics becomes, in quantum field 
theory, a chronological ordering dictated by light 
cones. 

Another explicit calculation of a path integral [17] 
uses the Taylor expansion of the action functional 
S(x) around one of its values. It is known as the 
background method (DeWitt 2004). It is called a 
semiclassical WKB approximation when one expands
around an extremum $S(x_{cl})$ where $x_{cl}$ is a solution of
the Euler-Lagrange equation $S'(x_{cl}) = 0$ (Wentzel
1926, Kramers 1927, Brillouin 1926).

Introduced in 1951 (Morette 1951), semiclassical
approximations are now the subject of a rich
literature reviewed briefly below.


Gaussian Volume Elements 


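A lesson from Gaussians on $\mathbb{R}^D$ suggests a definition
of volume elements on infinite-dimensional Banach
spaces X. Let

$$I_D(a) := \int_{\mathbb{R}^D} dx\, \exp\Big( -\frac{\pi}{a} |x|^2 \Big) \quad \text{for } a > 0 \qquad [19]$$

with

$$dx = dx^1 \cdots dx^D \quad \text{and} \quad |x|^2 = \sum_{j=1}^{D} (x^j)^2 = \delta_{ij} x^i x^j$$

An elementary calculation gives $I_D(a) = a^{D/2}$. Therefore,
when $D = \infty$,

$$I_\infty(a) = \begin{cases} 0 & \text{if } 0 < a < 1 \\ 1 & \text{if } a = 1 \\ \infty & \text{if } 1 < a \end{cases} \qquad [20]$$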
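
The dimension dependence behind [20] can be seen numerically. The sketch below assumes a plain trapezoid quadrature for the one-dimensional integral and uses the factorization of the integrand over coordinates; the function names and cut-offs are illustrative choices.

```python
import math

def gaussian_1d(a, half_width=12.0, steps=24000):
    """Trapezoid estimate of the D = 1 integral of exp(-pi x^2 / a)."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(-math.pi * x * x / a)
    return total * h

def I(D, a):
    """I_D(a): the integrand factorizes, so I_D(a) = I_1(a) ** D."""
    return gaussian_1d(a) ** D

small = I(200, 0.5)    # tends to 0 as D grows when a < 1
unit = I(200, 1.0)     # stays at 1 when a = 1
large = I(200, 2.0)    # grows without bound when a > 1
```

The one-dimensional value reproduces the square root of a, and raising it to a large power D shows the three regimes of [20].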


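This is clearly an unsatisfactory situation, but it can
be corrected by introducing a dimensionless volume
element:

$$\mathcal{D}_a x = \frac{1}{a^{D/2}}\, dx^1 \cdots dx^D \qquad [21]$$

The volume element $\mathcal{D}_a x$ can be defined by the integral

$$\int_{\mathbb{R}^D} \mathcal{D}_a x\, \exp\Big( -\frac{\pi}{a} |x|^2 - 2\pi i \langle x', x \rangle \Big) = \exp\big( -a\pi |x'|^2 \big) \qquad [22]$$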
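
For D = 1 the defining integral [22] can be checked directly. The following sketch assumes a trapezoid quadrature with complex arithmetic; the function name and the sample values of a and x' are hypothetical.

```python
import cmath
import math

def gaussian_fourier_1d(a, xp, half_width=12.0, steps=24000):
    """D = 1 version of the left-hand side of [22], with D_a x = dx / sqrt(a)."""
    h = 2.0 * half_width / steps
    total = 0j
    for i in range(steps + 1):
        x = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * cmath.exp(-math.pi * x * x / a - 2j * math.pi * xp * x)
    return total * h / math.sqrt(a)

a, xp = 0.5, 0.8
lhs = gaussian_fourier_1d(a, xp)
rhs = math.exp(-a * math.pi * xp * xp)
```

The normalization by the square root of a is what makes the result independent of any overall factor, so the same definition survives the limit of infinitely many dimensions.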


where x’ is in the dual Rp of RP. Equation [22] 
suggests the following generalization of Gaussians 
on R? to Gaussians on a Banach space X: 


} Ds, ox exp(- ~ Q(x)) exp(—2ri(x',x)) 
:一 exp(—s7W(x’)) [23] 


where $s \in \{1, i\}$, $Q(x)$ is a quadratic form on $X$ (see
condition on $Q$ below). $W(x')$ is a quadratic form on
the dual $X'$ of $X$, inverse of $Q(x)$ in the following
sense. Set

$$Q(x) = \langle Dx, x\rangle \quad \text{and} \quad W(x') = \langle x', Gx'\rangle \qquad [24]$$

where $\langle\,,\rangle$ is a duality product, for example, the
product of $x \in X$ and $Dx \in X'$; then

$$DG = 1_{X'}, \quad GD = 1_X \qquad [25]$$

Equation [23] defines a Gaussian volume element $d\Gamma$
by its Fourier transform

$$\mathcal{F}\Gamma_{s,Q}(x') := \exp\left(-s\pi W(x')\right) \qquad [26]$$

where the Gaussian volume element is

$$d\Gamma_{s,Q}(x) \overset{\int}{=} \mathcal{D}_{s,Q}(x) \exp\left(-\frac{\pi}{s} Q(x)\right) \qquad [27]$$

This is a qualified equality valid upon integration.

The definition of the Gaussian volume element by
its Fourier transform $\mathcal{F}\Gamma$ is valid for $s = 1$ (Wiener
integral) when $Q(x) > 0$; it is valid for $s = i$
(Feynman integral) when $\mathrm{Re}\, Q(x) > 0$.


Remark Volume elements were introduced with
the notation such as $dx$; later they were identified
with forms such as $\omega = dx$. In [26] we omit $d$ on the
left-hand side (LHS) for visual clarity.


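Example (diagram expansion). The following integrals
follow readily (Cartier and DeWitt-Morette
2006) from the definition [26]. Let $x'$ be in the dual
$X'$ of $X$,

$$\int_X d\Gamma_{s,Q}(x)\, \langle x', x\rangle^{2n+1} = 0 \qquad [28]$$

$$\int_X d\Gamma_{s,Q}(x)\, \langle x', x\rangle^{2n} = \frac{(2n)!}{2^n\, n!} \left(\frac{s}{2\pi}\right)^n W(x')^n \qquad [29]$$

$$\int_X d\Gamma_{s,Q}(x)\, \langle x'_1, x\rangle \cdots \langle x'_{2n}, x\rangle = \left(\frac{s}{2\pi}\right)^n \sum W(x'_{i_1}, x'_{i_2}) \cdots W(x'_{i_{2n-1}}, x'_{i_{2n}}) \qquad [30]$$

where $\sum$ is a sum without repetitions of identical
terms.

For instance when $n = 1$, eqn [30] reads

$$\int_X d\Gamma_{s,Q}(x)\, \langle x'_1, x\rangle \langle x'_2, x\rangle = \frac{s}{2\pi}\, W(x'_1, x'_2) \qquad [31]$$

$W(x'_1, x'_2)$ is called the two-point function (a.k.a. the
propagator). In a diagram it stands for a line from
$x'_1$ to $x'_2$.

Feynman diagrams represent Gaussian integrals of
polynomials.

For instance when $n = 2$, the diagram representation
of [30] is the sum of three terms,

$$W(x'_1, x'_2)\, W(x'_3, x'_4) + W(x'_1, x'_3)\, W(x'_2, x'_4) + W(x'_1, x'_4)\, W(x'_2, x'_3)$$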
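The sum "without repetitions" in [30] runs over all ways of grouping the $2n$ labels into unordered pairs. A minimal sketch of that bookkeeping follows; the function names and the representation of $W$ as a nested list `W[i][j]` $= W(x'_i, x'_j)$ are assumptions made for illustration, not the article's notation.

```python
import math

def pairings(items):
    """Yield every way of splitting `items` into unordered pairs, each exactly once."""
    items = list(items)
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

def gaussian_moment(W, labels, s=1.0):
    """Right-hand side of [30]: (s / 2 pi)^n times the sum over pairings of products of W."""
    labels = list(labels)
    if len(labels) % 2:
        return 0.0  # odd moments vanish, cf. [28]
    n = len(labels) // 2
    total = 0.0
    for pairing in pairings(labels):
        term = 1.0
        for i, j in pairing:
            term *= W[i][j]
        total += term
    return (s / (2.0 * math.pi)) ** n * total

if __name__ == "__main__":
    # Four labels give the three diagrams listed for n = 2.
    for p in pairings([1, 2, 3, 4]):
        print(p)
```

Taking all labels equal reproduces the combinatorial factor $(2n)!/(2^n n!)$ of [29], which is simply the number of pairings.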


Example (Linear maps). Linear maps on $\mathbb{R}^D$ are
limited to $L: x \to Ax$, where $A$ is a $D \times D$ constant
matrix. Linear maps on a Banach space $X$ offer
many possibilities:

(i) Projections. For example, let $x: T \to \mathbb{R}$ and

$$L: x \in X \to \{x(t_1), x(t_2), \ldots, x(t_n)\} \in \mathbb{R}^n \qquad [32]$$

This projection is a discretization of the path,
useful in particular in numerical calculations of path
integrals. Equation [32] is unambiguous, whereas
the limit of the discretized expression [11] is ill-defined.

(ii) Liouville decomposition. For example, let $D$ be
a second-order differential operator on a space of
paths $x: [t_a, t_b] \to M^D$ vanishing on the boundary,
$x(t_a) = 0$, $x(t_b) = 0$. Let $\{\Psi_k\}$ be a complete, orthogonal
set of eigenfunctions of $D$; then the decomposition
of $x$ into the basis $\{\Psi_k\}$,

$$x^\alpha(t) = \sum_{k=1}^{\infty} u^k\, \Psi_k^\alpha(t) \qquad [33]$$

is a linear map

$$L: x \in X \to \{u^k\} \in \mathbb{R}^\infty$$

It is useful in particular for diagonalizing (see,
e.g., [107]) the Green function $G$ of $D$ [25] (a.k.a.
the covariance in a Gaussian integral [24], or the
two-point function in [31]).

(iii) Volterra maps. For example, let $L: X \to Y$ by

$$y(t) = \int_T ds\, \theta(t - s)\, x(s), \qquad \theta(t - s) = 1 \text{ for } s < t,\ 0 \text{ otherwise} \qquad [34]$$

Let $X$ be the space of square-integrable functions
on $T$ and $Y$ be an $L^{2,1}$ space (square-integrable
functions for which the first derivative is also square
integrable); then $L$ maps the canonical quadratic
form on $X$ into the canonical quadratic form on $Y$,
hence the canonical Gaussian on $X$ into the
canonical Gaussian on $Y$. The identity mapping $\mathbb{1}$
from $Y$ into the space $C$ of continuous functions
maps the canonical Gaussian on $Y$ into the Wiener
Gaussian on $C$ (DeWitt-Morette et al. 1979).

Figure 2 Linear maps. (Published with permission by Elsevier,
North Holland.)


The linear maps [32]-[34] and their obvious
generalizations have been used for computing
explicitly many functional integrals (see Figure 2).
The basic formula reads

$$\int_X d\Gamma_X(x)\, F(x) = \int_Y d\Gamma_Y(y)\, f(y), \quad F = f \circ L \qquad [35]$$

where the Fourier transform $\mathcal{F}\Gamma_Y$ is given by

$$\mathcal{F}\Gamma_Y = \mathcal{F}\Gamma_X \circ \tilde{L} \qquad [36]$$

and $\tilde{L}$ is the transpose of the linear map $L$, defined by

$$\langle \tilde{L} y', x\rangle = \langle y', Lx\rangle \qquad [37]$$

Computing $\mathcal{F}\Gamma_Y$ does not require any calculation. It
can be read off eqn [36]. Computing $d\Gamma_Y$ is easy in a
number of cases such as the following:

1. $Y$ is finite-dimensional. In other words, [35] is
a cylindrical integral. Then

$$d\Gamma_Y(y) \overset{\int}{=} dy^1 \cdots dy^D\, (\det Q_{ij})^{1/2} \exp\left(-\frac{\pi}{s} Q_Y(y)\right) \qquad [38]$$

where $Q_Y(y)$ is an abbreviation of

$$Q_Y(y) = Q_{ij}\, y^i y^j \qquad [39]$$

its inverse $W_Y(y')$ in the sense of [24]-[25] is

$$W_Y(y') = W_Y^{ij}\, y'_i y'_j \qquad [40]$$

that is given by [36]:

$$W_Y = W_X \circ \tilde{L} \qquad [41]$$

$W_X$ is the quadratic form defining $\Gamma_X$ by [26].
When $D$ is small, say less than 4, and this is not an
unusual situation, it is easy to compute [38].


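Example (Wiener Gaussians and Brownian
motion). The Wiener Gaussian on the space $\mathcal{P}_0\mathbb{R}$
of pointed paths $x: T \to \mathbb{R}$, $T = [t_a, t_b]$, $x(t_a) = 0$ is
defined by its variance [26]:

$$W(x') := \int_T dx'(t) \int_T dx'(s)\, \inf(t, s) \qquad [42]$$

Let $Y$ be the Wiener differential space consisting of
the differences of two consecutive values of $x$ on the
$n$-discretized time interval. The space $Y$ is finite
dimensional,

$$L: X \to Y \quad \text{by} \quad y^j = x(t_{j+1}) - x(t_j) = \langle \delta_{t_{j+1}} - \delta_{t_j}, x\rangle$$

It follows from [37] that

$$\tilde{L} y' = \sum_j y'_j \left(\delta_{t_{j+1}} - \delta_{t_j}\right)$$

and

$$d\Gamma_Y(\Delta x) \overset{\int}{=} dy^1 \cdots dy^n \prod_j (\Delta t_j)^{-1/2} \exp\left(-\frac{\pi}{s} \sum_j \frac{(\Delta x^j)^2}{\Delta t_j}\right) \qquad [43]$$

where $\Delta t_j := t_{j+1} - t_j$ and $\Delta x^j := x(t_{j+1}) - x(t_j)$.
When $s = 1$ the Gaussian $\Gamma_Y$ defines the distribution
of a Brownian path. The Gaussian $\Gamma_X$ of
covariance $\inf(t, s)$ is the Wiener measure.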
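Relation [41] can be made concrete for the Wiener differential space. The sketch below is illustrative only: the function name and the choice $t_a = 0$ are assumptions. It evaluates $W_X$ on the combinations $\delta_{t_{j+1}} - \delta_{t_j}$ produced by the transposed map and shows that the resulting matrix $W_Y^{jk}$ is diagonal with entries $\Delta t_j$, which is the statement that the increments are independent.

```python
def increment_variance(times):
    """Matrix W_Y^{jk} obtained from W_X(delta_t, delta_s) = inf(t, s) via [41].

    `times` is the discretization t_0 < t_1 < ... < t_n with t_0 = t_a = 0.
    """
    n = len(times) - 1
    W = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(n):
            # Bilinear expansion of W_X on (delta_{t_{j+1}} - delta_{t_j})
            # and (delta_{t_{k+1}} - delta_{t_k}).
            W[j][k] = (min(times[j + 1], times[k + 1])
                       - min(times[j + 1], times[k])
                       - min(times[j], times[k + 1])
                       + min(times[j], times[k]))
    return W

if __name__ == "__main__":
    for row in increment_variance([0.0, 0.5, 1.25, 2.0]):
        print(row)
```

Inverting this diagonal matrix gives the quadratic form $\sum_j (\Delta x^j)^2 / \Delta t_j$ that appears in the exponent of [43].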
2. In semiclassical approximations, $Q_X$ is the
Hessian (second variation) of an action functional $S$:

$$Q_X(h) = \frac{d^2}{d\alpha^2} S(x(\alpha))\Big|_{\alpha=0} \quad \text{with} \quad h = \frac{dx(\alpha)}{d\alpha}\Big|_{\alpha=0} \qquad [44]$$

where $\{x(\alpha)\}$ is a one-parameter family of paths [8].
The Jacobi field technology (the Jacobi operator is
defined by [103]; a Jacobi field is a solution of
[102]) yields the inverse of $W_X$ and its determinants
(Cartier and DeWitt-Morette 2006); they have been
worked out for a variety of boundary conditions on
classical paths.


Volume Elements Other than Gaussian 


The definition [26]-[27] of Gaussian volume elements
is a particular case of volume elements on a
Banach space $\Phi$ defined by

$$\int_\Phi \mathcal{D}_{\Theta,Z}\varphi\; \Theta(\varphi, J) = Z(J) \qquad [45]$$

for $\varphi$ in $\Phi$, and $J$ in the dual $\Phi'$ of $\Phi$. The volume
element $\mathcal{D}_{\Theta,Z}$ is defined by two continuous bounded
functionals

$$\Theta: \Phi \times \Phi' \to \mathbb{C} \quad \text{and} \quad Z: \Phi' \to \mathbb{C} \qquad [46]$$

In quantum field theory, $\varphi$ is a field and $J$ is a
source. The functional $Z(J)$ is then the Schwinger
generating functional for the $n$-point functions. An
axiomatic treatment and applications of functional integrals
on $\Phi$ with volume elements $\mathcal{D}_{\Theta,Z}$ can be found in
Cartier and DeWitt-Morette (1993).


Example (Poisson volume elements) (Cartier and
DeWitt-Morette (2006) and Collins (1997)). A
Poisson random variable is a random variable $N$
taking values in the set $\mathbb{N}$ of non-negative integers
such that the probability $p_n$ that $N = n$ is

$$p_n = \Pr(N = n) = \exp(-\lambda)\, \frac{\lambda^n}{n!}, \quad \lambda \geq 0 \qquad [47]$$

Thanks to the normalizing constant $\exp(-\lambda)$,
$\sum_{n=0}^{\infty} p_n = 1$. The parameter $\lambda$ is the mean value of $N$:

$$\langle N \rangle = \lambda \qquad [48]$$

A record of fortuitous events occurring at random
times $t_0 < T_1 < T_2 < \cdots$ can consist either of the
number $N(t)$ of events occurring at times less than
or equal to $t$, or of the waiting times

$$W_k = T_k - T_{k-1} \qquad [49]$$

between two consecutive events.

When the waiting times are stochastically independent
and when

$$\Pr(t < W_k < t + dt) = p_a(t)\, dt \qquad [50]$$

$$p_a(t) = a \exp(-at), \quad t \geq 0 \qquad [51]$$

the record is a Poisson random variable. It is related
to the number of events $N(t)$ as follows.
Let $T$ be a finite time interval $]t', t'']$, and

$$N_T = N(t'') - N(t') \qquad [52]$$

the number of events during $T$. The random variable
$N_T$ follows a Poisson law [47] with mean value

$$\lambda(T) = a(t'' - t') \qquad [53]$$

For mutually disjoint time intervals $T^{(1)}, T^{(2)}, \ldots$ the
random variables $N_{T^{(1)}}, N_{T^{(2)}}, \ldots$ are stochastically
independent.
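The Poisson law [47] and the mean value [53] are easy to check numerically. The sketch below is illustrative; the function names and the truncation of the infinite sum at 60 terms are assumptions chosen for the example.

```python
import math

def poisson_pmf(n, lam):
    """Probability p_n of [47] for a Poisson variable of mean lam."""
    return math.exp(-lam) * lam ** n / math.factorial(n)

def interval_mean(a, t_start, t_end):
    """Mean number of events [53] in the interval ]t_start, t_end] for decay constant a."""
    return a * (t_end - t_start)

if __name__ == "__main__":
    lam = interval_mean(2.0, 1.0, 2.5)
    probabilities = [poisson_pmf(n, lam) for n in range(60)]
    print("normalization:", sum(probabilities))
    print("mean:", sum(n * p for n, p in enumerate(probabilities)))
```

With $a = 2$ and an interval of length 1.5 the truncated sums reproduce the normalization and the mean $\lambda = 3$ to within rounding error.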

Whereas the parameter $\lambda$ must be real non-negative,
the parameter $a$ can be pure imaginary;
therefore, Poisson processes defined by waiting
times can be used in quantum physics as well as in
probability. When $a$ is real, it is called the decay
constant because its physical dimension is $[\text{time}]^{-1}$.

Figure 3 A Poisson path in $X_4$.

A Poisson path $x \in X_n$ is characterized by $n$ jumps
and the jump times during a given time interval
$T = [t_a, t_b]$ (Figure 3 illustrates a Poisson path
in $X_4$). The space $X$ of Poisson paths is the union
of all $X_n$:

$$X = \bigcup_n X_n \qquad [54]$$

One can define a volume element $\mathcal{D}_{a,T}x$ on $X$ by its
Fourier transform:

$$\int_X \mathcal{D}_{a,T}x \cdot \exp(i\langle x, f\rangle) := \exp\left(\int_T dv(t)\, e^{if(t)}\right) \qquad [55]$$

Here a path $x \in X_n$, characterized by $n$ jump times
$T_1, \ldots, T_n$, is represented by the sum

$$\delta_{T_1} + \delta_{T_2} + \cdots + \delta_{T_n}$$

Hence

$$\langle x, f\rangle = f(T_1) + \cdots + f(T_n) \qquad [56]$$

The dimensionless volume element on $T$ is

$$dv(t) = a\, dt$$

Therefore,

$$\mathrm{vol}(T) = aT, \quad T = t_b - t_a, \qquad \mathrm{vol}(X_n) = a^n T^n / n! \qquad [57]$$

$$\mathrm{vol}(X) = \exp(\mathrm{vol}(T)) \qquad [58]$$

and it makes sense to write formally

$$X = \exp T$$

It can be proved that the volume element $\mathcal{D}_{a,T}x$ is a
measure, in the technical sense of the word (Cartier
and DeWitt-Morette 2006).
Functional integration on spaces of Poisson paths
has been used extensively in solutions of Klein-Gordon
equations, the telegrapher equation and the
Dirac equation (Cartier and DeWitt-Morette 2006).

Other volume elements of interest in quantum
physics include (LaChapelle 2004):

• gamma volume elements, which are to gamma
probability distributions what Gaussian volume
elements are to Gaussian probability distributions; and

• Hermite volume elements, convenient for integrating
Wick-ordered polynomials.

A Dirac "δ-function" is formally the limit of a
Gaussian integral. Formally, one can introduce a Dirac
functional volume element as the limit of a Gaussian
volume element.


The Koszul Formula 


There are several roadblocks on the road from finite
to infinite-dimensional spaces. For instance, a
volume in a $D$-dimensional space is a top-differential
form, that is, a $D$-form. There is no top form in an
infinite-dimensional space, nor on Grassmann
manifolds, since Grassmann forms are totally symmetric
tensors. A $D$-form in $\mathbb{R}^D$ has only one strict
component and is equivalent to a scalar density of
weight 1, but scalar densities of weight 1 do not
form an algebra.

For these reasons, volume elements have so far
been defined by integrals [26], [27], [43], and [55].
Short of giving an explicit expression for their
differential forms, one can require them to satisfy
the Koszul formula

$$\mathcal{L}_X \omega = \mathrm{Div}(X)\, \omega \qquad [59]$$

where $\omega$ is a volume element on a Banach space $\mathbb{X}$,
$X$ a vector field generating a group of transformations
on $\mathbb{X}$, $\mathcal{L}_X$ the Lie derivative defined by $X$, and
$\mathrm{Div}(X)$ the standard generalization of div(ergence)
on finite-dimensional spaces (see, e.g., Cartier and
DeWitt-Morette (2006) for the explicit expression of
divergences on Riemannian, symplectic, Grassmann
manifolds). The Koszul formula dictates how a
volume element changes under a group of
transformations.

It often happens that an object cannot be defined 
per se, but that it is sufficient to define its variation. For 
example, one does not define potentials, but potential 
differences; the ratio of infinite-dimensional determi- 
nants can be defined without defining each determi- 
nant; the work of Wiener on “differential-spaces,” 
which is a landmark in functional integration, is based 
on differences between two consecutive values of a
function, etc. Similarly, the Koszul formula does not
define $\omega$ but gives its variation $\mathcal{L}_X\omega$.


The Operator Formalism of Quantum 
Physics 


Functional integrals can be used to represent
operator matrix elements, and solutions of the
Schrödinger equation.

1. Matrix elements of operators on Hilbert spaces.
Symbolically,

$$\langle \beta | \exp(-iHt/\hbar) | \alpha \rangle = \int_{X_{\alpha\beta}} \mathcal{D}x\, \exp(iS(x)/\hbar) \qquad [60]$$

The domain of integration $X_{\alpha\beta}$ is a space of paths
$x$ on $[t_a, t_b]$ satisfying initial conditions that
characterize the quantum state $\alpha$, and final
conditions that characterize the quantum state $\beta$.
The action functional $S$ yields the Hamiltonian $H$.
A key property of path integrals is their
representation of matrix elements of time-ordered
operators. The path parameter (time,
scale, or any other parameter) provides the
operator ordering [11]. A simple example is the
two-point function of the Wiener measure [42]:

$$\int_X d\Gamma(x)\, x(t)\, x(s) = \inf(t, s) \qquad [61]$$

The functional integral orders the time, that is, the
argument of the variable of integration. In
quantum field theory, time ordering becomes a
chronological ordering dictated by light cones.

2. Schrödinger equation and other parabolic equations
(Cartier and DeWitt-Morette 2006).


The following theorem provides the mathematical 
underpinning for a great variety of functional inte- 
grals. It also provides a construction of functional 
integrals, which begins with the symmetries of a given 
physical system rather than its action functional. The 
theorem consists of two parts: the definition of a 
functional integral, and the partial differential equa- 
tion satisfied by the value of the functional integral, as 
a function of a set of parameters. 

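Given a manifold $M^D$, consider the contractible
space $\mathcal{P}_0 M^D$ of pointed $L^{2,1}$ paths over $T = [t_a, t_b]$:

$$x: T \to M^D, \quad x(t_b) = x_b, \quad \text{i.e., } x \in \mathcal{P}_0 M^D \qquad [62]$$

Given $D + 1$ vector fields $Y$, $\{X_{(\alpha)}\}$, generators of a
group of transformations on $M^D$, define a map

$$P: \mathcal{P}_0 \mathbb{R}^D \to \mathcal{P}_0 M^D \quad \text{by } z \mapsto x \qquad [63]$$

explicitly

$$dx(t, z) = X_{(\alpha)}(x(t, z))\, dz^\alpha(t) + Y(x(t, z))\, dt \qquad [64]$$

$$x(t_b, z) = x_b, \quad z(t_b) = 0 \qquad [65]$$

In general, the vector fields do not commute and the
solution of [64]-[65] is of the form

$$x(t, z) = x_b \cdot \Sigma(t, z) \qquad [66]$$

where $\Sigma(t, z)$ is an element of a group of right
actions on $M^D$, defined by the $D + 1$ generators $Y$,
$\{X_{(\alpha)}\}$:

$$x_b \cdot \Sigma(t + t', z \times z') = x_b \cdot \Sigma(t, z) \cdot \Sigma(t', z')$$

The path $z$ defined on $[t_a, t]$ is followed by the path
$z'$ on $[t, t']$.

Consider the following functional integral over
$\mathcal{P}_0 \mathbb{R}^D$ of a functional of paths on $\mathcal{P}_0 M^D$:

$$(U_T \phi)(x_b) := \int_{\mathcal{P}_0 \mathbb{R}^D} \mathcal{D}_{s,Q_0}(z)\, \exp\left(-\frac{\pi}{s} Q_0(z)\right) \phi\left(x_b \cdot \Sigma(t_a, z)\right) \qquad [67]$$

where

$$Q_0(z) = \int_T dt\, h_{\alpha\beta}\, \dot{z}^\alpha(t)\, \dot{z}^\beta(t) \qquad [68]$$

The functional $(U_T \phi)$ at $x_b$ is a function $\Psi(T, x_b)$. It
is a solution of the generalized Schrödinger equation,

$$\frac{\partial \Psi}{\partial T} = \frac{s}{4\pi}\, h^{\alpha\beta}\, \mathcal{L}_{X_{(\alpha)}} \mathcal{L}_{X_{(\beta)}} \Psi + \mathcal{L}_Y \Psi \qquad [69]$$

This equation is valid on manifolds $M^D$ (e.g., frame
bundles, U(N) bundles, multiply connected spaces,
symplectic manifold phase space) in arbitrary systems
of coordinates.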


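Example (Polar coordinates on $\mathbb{R}^2$). Let us
abbreviate $z^\alpha(t)$ to $z^\alpha$, $x^1(t)$ to $r$, and $x^2(t)$ to $\theta$. It
follows from

$$z^1 = r\cos\theta, \quad z^2 = r\sin\theta \qquad [70]$$

that

$$dr = \cos\theta\, dz^1 + \sin\theta\, dz^2 =: X^1_{(1)}\, dz^1 + X^1_{(2)}\, dz^2$$
$$d\theta = -\frac{\sin\theta}{r}\, dz^1 + \frac{\cos\theta}{r}\, dz^2 =: X^2_{(1)}\, dz^1 + X^2_{(2)}\, dz^2 \qquad [71]$$

The dynamical vector fields are, therefore,

$$X_{(1)} = \cos\theta\, \frac{\partial}{\partial r} - \frac{\sin\theta}{r}\, \frac{\partial}{\partial\theta} \qquad [72]$$

$$X_{(2)} = \sin\theta\, \frac{\partial}{\partial r} + \frac{\cos\theta}{r}\, \frac{\partial}{\partial\theta} \qquad [73]$$

Here $h^{\alpha\beta} = \delta^{\alpha\beta}$ and eqn [69] reads

$$\frac{\partial \Psi}{\partial T} = \frac{s}{4\pi}\left(\frac{\partial^2}{\partial r^2} + \frac{1}{r^2}\frac{\partial^2}{\partial\theta^2} + \frac{1}{r}\frac{\partial}{\partial r}\right)\Psi \qquad [74]$$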

This example is trivial because x(t,z) is not a 
functional of z but a function of z(t) given by [70]. 
In the following example, x(t,z) is a functional of z. 


Example (Paths with values on a Riemannian
manifold $(M^D, g)$). Consider the frame bundle over
$M^D$ and a connection $\sigma$ defining the horizontal lift
$\dot\rho(t)$ of a vector $\dot{x}(t)$,

$$\dot\rho(t) = \sigma(\rho(t)) \cdot \dot{x}(t) \qquad [75]$$

In order to bring eqn [75] in the form [64], we think
of a frame $u(t)$ as a linear map from $\mathbb{R}^D$ into the
tangent space $T_{x(t)}M^D$:

$$u(t): \mathbb{R}^D \to T_{x(t)}M^D \qquad [76]$$

Let

$$\dot{z}(t) := u(t)^{-1}\, \dot{x}(t) \qquad [77]$$

Choose a basis $\{e_{(\alpha)}\}$ in $\mathbb{R}^D$ and $\{e_{(A)}\}$ in $T_{x(t)}M^D$
such that

$$\dot{z}(t) = \dot{z}^\alpha(t)\, e_{(\alpha)} = u(t)^{-1}\left(\dot{x}^A(t)\, e_{(A)}\right) \qquad [78]$$

Insert $u(t) \circ u(t)^{-1}$ into [75], then

$$\dot\rho(t) = X_{(\alpha)}(\rho(t))\, \dot{z}^\alpha(t) \qquad [79]$$

where the dynamical vector fields are

$$X_{(\alpha)}(\rho(t)) = \left(\sigma(\rho(t)) \circ u(t)\right) \cdot e_{(\alpha)} \qquad [80]$$

The construction [64]-[69] gives a parabolic equation
on the bundle. If the connection $\sigma$ is the metric
connection, then the parabolic equation on the bundle
gives, by projection on the base space, the parabolic
equation with the Laplace-Beltrami operator. Explicitly,
the projection on the base space of [67] is

$$\Psi(t_b, x_b) := \int_{\mathcal{P}_0 \mathbb{R}^D} \mathcal{D}_{s,Q_0}(z)\, \exp\left(-\frac{\pi}{s} Q_0(z)\right) \phi\left((\mathrm{Dev}\, z)(t_a)\right) \qquad [81]$$

where Dev is the Cartan development map, namely
the bijection, defined by [82], from the space of
pointed paths $z$ on $T_{x_b}M^D$ (identified to $\mathbb{R}^D$ via the
frame $u_b$) into the space of pointed paths $x$ on $M^D$
(paths such that $x(t_b) = x_b$):

$$(\Pi \circ \rho)(t) =: (\mathrm{Dev}\, z)(t) \qquad [82]$$

$\Pi$ is the projection on the base space. The path
integral [81] is the solution of the equations

$$\frac{\partial}{\partial t_b} \Psi(t_b, x_b) = \frac{s}{4\pi}\, \Delta \Psi(t_b, x_b) \qquad [83]$$

$$\Psi(t_a, x) = \phi(x) \qquad [84]$$

where $\Delta$ is the Laplace-Beltrami operator on $(M^D, g)$,

$$\Delta = g^{ij} D_i D_j \qquad [85]$$

and $D_j$ is the covariant derivative defined by the
Riemann connection $\sigma$.


Semiclassical Expansions 


Classical mechanics is a limit of quantum
mechanics; therefore, it is natural to expand the
action functional $S$ of a given system around, or
near, its classical value, namely its minimum $S(q)$,
where $q$ is a solution of the Euler-Lagrange
equation,

$$S'(q) = 0 \qquad [86]$$

Set

$$S(x) = S(q) + S'(q) \cdot \xi + \frac{1}{2!}\, S''(q) \cdot \xi\xi + \frac{1}{3!}\, S'''(q) \cdot \xi\xi\xi + \cdots \qquad [87]$$

where $x \in X$ is a path

$$x: T \to M^D$$

and $\xi, \eta \in T_qX$ is a vector field at $q \in X$. The second
variation of $S$ is called its Hessian

$$S''(q) \cdot \xi\eta =: \mathrm{Hess}(q; \xi, \eta) \qquad [88]$$

The arena of semiclassical expansions of a functional
integral schematically written as

$$I := \int_{X_{a,b}} \mathcal{D}x\, \exp(iS(x)/\hbar) \cdot \phi(x(t_a)) \qquad [89]$$

consists of the intersection $U_{a,b}$ of two spaces:
$X_{a,b} \subset X$, the space of paths satisfying $D$ initial
conditions (a) and $D$ final conditions (b), and
$U^{2D}(S)$, the space of critical points of $S$

$$q \in U^{2D}(S), \quad S'(q) = 0 \qquad [90]$$

$$U_{a,b} := X_{a,b} \cap U^{2D}(S) \qquad [91]$$
Figure 4 Intersection of the space $\mathcal{P}_{a,b}M^D$ (abbreviated to
$X_{a,b}$) of paths on $M^D$ with fixed points, and the 2D-dimensional
space $U^{2D}(S)$ of critical points of the system $S$. (Adapted from a
Plenum Press publication with permission by Springer-Verlag.)


The nature of the intersection $U_{a,b}$ determines the
behavior of the system $S$. Figure 4 shows the
intersection of the space $X_{a,b}$ of paths on $M^D$ with
fixed points. It also shows the space $U^{2D}(S)$ of
critical points of $S$.

We consider first the case in which $U_{a,b}$ consists
of a single point $q$, or several isolated points $q_{(i)}$. The
semiclassical expansion consists in dropping the
terms beyond the Hessian:

$$I_{WKB} := \int_{X_b} \mathcal{D}\xi\, \exp\left(\frac{2\pi i}{h}\left(S(q) + \frac{1}{2}\, S''(q) \cdot \xi\xi\right)\right) \phi(x(t_a)) \qquad [92]$$

where the initial wave function $\phi$ accounts for the $D$
initial conditions of the system, and $X_b$ is the space
of pointed paths

$$x(t_b) = x_b, \quad \text{and} \quad \xi(t_b) = 0 \quad \text{for every } x \in X_b \qquad [93]$$


WKB Approximations 


The integral $I_{WKB}$ is the Gaussian defined by the
Hessian. Explicit calculations of $I_{WKB}$ exploit the
power of Jacobi fields of $S$ at $q$.

Example (Momentum-to-position transitions)
(e.g., Cartier and DeWitt-Morette 2006). We have

$$I_{WKB}(x_b, t_b; p_a, t_a) = \exp\left(\frac{2\pi i}{h}\, \mathcal{S}(q(t_b), p(t_a))\right) \left(\det \frac{\partial^2 \mathcal{S}}{\partial q^i(t_b)\, \partial p_j(t_a)}\right)^{1/2} \qquad [94]$$

where $\mathcal{S}$ is the action function (a.k.a. Hamilton's
principal function)

$$\mathcal{S}(q(t_b), p(t_a)) = S(q) + \langle p_a, x(t_a)\rangle \qquad [95]$$

where the classical path $q$ is characterized by its
initial momentum $p_a$ and its final position $x_b$. The
proof of [94] rests on the following property of
quadratic forms $Q$. Let $L: X \to Y$ linearly and

$$Q_X = Q_Y \circ L \qquad [96]$$


According to the notations used in [26], [27],

$$\int_X \mathcal{D}_X(x)\, \exp\left(-\frac{\pi}{s} Q_X(x)\right) = 1 \qquad [97]$$

According to [35], [27],

$$1 = \int_X \mathcal{D}_X(x)\, \exp\left(-\frac{\pi}{s} Q_X(x)\right) = \int_X \mathcal{D}_Y(Lx)\, \exp\left(-\frac{\pi}{s} Q_Y(Lx)\right) \qquad [98]$$

$$= |\det L| \int_X \mathcal{D}_Y(x)\, \exp\left(-\frac{\pi}{s} Q_X(x)\right) \qquad [99]$$

If $s = 1$, that is, if $Q_X$ and $Q_Y$ are positive definite,
then

$$\int_X \mathcal{D}_Y(x)\, \exp(-\pi Q_X(x)) = \det(Q_X/Q_Y)^{-1/2} \qquad [100]$$

If $s = i$, that is, for Feynman integrals

$$\int_X \mathcal{D}_Y(x)\, \exp(\pi i\, Q_X(x)) = |\det(Q_X/Q_Y)|^{-1/2}\; i^{\,\mathrm{Ind}(Q_X/Q_Y)} \qquad [101]$$

where "$\mathrm{Ind}(Q_X/Q_Y)$" is the ratio of the numbers of
negative eigenvalues of $Q_X$ and $Q_Y$ respectively, and
$i = \sqrt{-1} = e^{i\pi/2}$.

Equation [100] is a key equation for semiclassical
expansions where it is convenient to break up the
second variation $S''(q) \cdot \xi\xi$ into two quadratic forms:

$$S''(q) \cdot \xi\xi = Q_0(\xi) + Q(\xi) \qquad [102]$$

where $Q_0$ is the kinetic energy. The quadratic form
$Q_0$ defines a convenient Gaussian volume element for
computing [92]. Moreover, splitting the Hessian
into $Q_0 + Q$ corresponds to splitting the system into
a "free" system and a perturbation.

In eqns [100] and [101] the determinant of the
ratios of the infinite-dimensional quadratic forms
$Q_X/Q_Y$ has been shown (Cartier and DeWitt-Morette
2006) to be a finite-dimensional determinant,
thanks to Jacobi field technology.
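In one dimension the determinant formula [100] reduces to an ordinary Gaussian integral that can be checked directly. The sketch below is illustrative: it assumes $Q_X(x) = q_x x^2$, $Q_Y(x) = q_y x^2$, the volume element $\mathcal{D}_Y(x) = q_y^{1/2}\,dx$ suggested by [38], and a trapezoidal rule on a truncated interval; the names and numerical parameters are not from the article.

```python
import math

def integral_DY_exp_QX(q_x, q_y, half_width=10.0, steps=20000):
    """One-dimensional left-hand side of [100]: integral of sqrt(q_y) exp(-pi q_x x^2) dx."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoidal end-point weights
        total += weight * math.exp(-math.pi * q_x * x * x)
    return math.sqrt(q_y) * total * h

def determinant_ratio(q_x, q_y):
    """Right-hand side of [100] for 1 x 1 'matrices': det(Q_X / Q_Y)^(-1/2)."""
    return (q_x / q_y) ** -0.5

if __name__ == "__main__":
    print(integral_DY_exp_QX(3.0, 0.75), determinant_ratio(3.0, 0.75))
```

Only the ratio $q_x/q_y$ enters the answer, which is the finite-dimensional shadow of the statement that ratios of infinite-dimensional determinants can be meaningful even when each determinant is not.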


Degenerate Hessians; Beyond WKB 


When $U_{a,b}$ consists of isolated points, the Hessian is
not degenerate, and the semiclassical expansion is
usually called the (strict) WKB approximation.
When the Hessian is degenerate,

$$S''(q) \cdot \xi\xi = 0 \quad \text{for } \xi \neq 0 \qquad [103]$$

there is at least one nonzero Jacobi field $h$ along $q$,

$$S''(q) \cdot h = 0, \quad h \in T_qU^{2D}(S) \qquad [104]$$

with $D$ vanishing initial conditions (a) and $D$
vanishing final conditions (b). Equation [104] is
the defining equation of Jacobi fields. The vanishing
boundary conditions imply that $h \in T_qX_{a,b}$ as well
as being a Jacobi field.

For understanding the intersections $U_{a,b}$ when the
Hessian is degenerate, one can construct the following
basis for the intersecting tangent spaces
$T_qU^{2D}(S)$ and $T_qX_{a,b}$:

• Basis for $T_qU^{2D}(S)$: a complete set (if it exists) of
linearly independent Jacobi fields. It can be
constructed by varying the $2D$ conditions (a), (b)
satisfied by $q \in X_{a,b}$.

• Basis for $T_qX_{a,b}$: a complete set of orthonormal
eigenvectors $\{\Psi_k\}$ of the Jacobi operator $\mathcal{J}(q)$
defined by the Hessian

$$S''(q) \cdot \xi\xi =: \langle \mathcal{J}(q) \cdot \xi, \xi\rangle \qquad [105]$$

$$\mathcal{J}(q) \cdot \Psi_k = \alpha_k \Psi_k, \quad k \in \{0, 1, \ldots\} \qquad [106]$$

The basis $\{\Psi_k\}$ diagonalizes the Hessian. When
the Hessian is degenerate, there is at least one
eigenvector of $\mathcal{J}(q)$ with zero eigenvalue.


1. The intersection $U_{a,b}$ is of dimension $l > 0$. Let
$\{u^k\}$ be the coordinates of $\xi$ in the $\{\Psi_k\}$ basis of
$T_qX_{a,b}$. Then the diagonalized Hessian is

$$S''(q) \cdot \xi\xi = \sum_{k=0}^{\infty} \alpha_k (u^k)^2 \qquad [107]$$

There are $l$ zero eigenvalues $\{\alpha_k\}$ when the system of
Euler-Lagrange equations decouples (possibly after
a change of variable in $X_{a,b}$) into two sets: $l$
constraint equations, and $D - l$ equations determining
$D - l$ coordinates $\{q^A\}$ of $q$. Say $l = 1$, for
simplicity. Then

$$S(x) = S(q) + c_0 u^0 + \frac{1}{2} \sum_{k=1}^{\infty} \alpha_k (u^k)^2 + O(|u|^3) \qquad [108]$$

where

$$c_0 = \int_T dt\, \frac{\delta S}{\delta q^j(t)}\, \Psi_0^j(t) \qquad [109]$$

The change of variable $\xi \mapsto \{u^k\}$ is a linear change
of variable of type [33]. The integral [92]
decomposes into the product of an ordinary integral
over $u^0$ and a Gaussian functional integral defined
by a nondegenerate quadratic form. The integral
over $u^0$ yields a Dirac δ-function, $\delta(c_0/h)$. The
propagator vanishes unless the conservation law
$c_0 = 0$ is satisfied.

Figure 5 A flow of particles scattered by a repulsive Coulomb
potential. (Reprinted from Physical Review D with permission by
the American Physical Society.)

Conservation laws appear in the classical limit of 
quantum physics. The quantum system may have 
less symmetry than its classical limit. 

2. The intersection $U_{a,b}$ is a multiple root of the
Euler-Lagrange equation. The flow of classical
solutions has an envelope, known as a caustic.
Caustics abound in physics: the soap bubble
problem, scattering of particles by a repulsive
Coulomb potential (see Figure 5), rainbow scattering
from a source at infinity, glory scattering, etc.
(Cartier and DeWitt-Morette 2006).

Let us consider a specific example for simplicity.
For instance, the scattering of particles of given
momenta $p_a$ by a repulsive Coulomb potential.
Let $q$ and $q^\Delta$ be two solutions of the Euler-Lagrange
equation with slightly different boundary
conditions at $t_b$. Compute $I(x_b^\Delta, t_b; p_a, t_a)$ by expanding
the action functional not around $q^\Delta$ but around $q$.
The path $q^\Delta$ is not in $X_{a,b}$ and the expansion of the
action functional has to be carried up to and including
the third variation. As before, let $\{u^k\}$ be the
coordinates of $\xi$ in the basis $\{\Psi_k\}$, $k \in \{0, 1, \ldots\}$. The
integral over $u^0$ is an Airy integral

$$2\pi\, \nu^{-1/3}\, \mathrm{Ai}\left(-\nu^{-1/3} c\right) = \int_{\mathbb{R}} du^0\, \exp\left(i\left(c\, u^0 - \frac{\nu}{3}(u^0)^3\right)\right) \qquad [110]$$

where

$$\nu = \frac{\pi}{h} \int_T dr \int_T ds \int_T dt\, \frac{\delta^3 S}{\delta q^\alpha(r)\, \delta q^\beta(s)\, \delta q^\gamma(t)}\, \Psi_0^\alpha(r)\, \Psi_0^\beta(s)\, \Psi_0^\gamma(t) \qquad [111]$$

$$c = -\frac{2\pi}{h} \int_T dt\, \frac{\delta S}{\delta q(t)} \cdot \Psi_0(t)\, (x_b^\Delta - x_b) \qquad [112]$$
The leading contribution of the Airy function when
$h$ tends to zero can be computed by the stationary
phase method. When $x_b^\Delta$ is in the "illuminated"
region, the probability amplitude $I(x_b^\Delta, t_b; p_a, t_a)$
oscillates rapidly as $h$ tends to zero. When $x_b^\Delta$ is in
the "dark" region, the probability amplitude decays
exponentially. Quantum mechanics softens up the
caustics.
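The contrast between the two regions can be seen directly in the Airy function, which oscillates for negative arguments and decays for positive ones. The following is a minimal sketch that evaluates $\mathrm{Ai}$ from its Maclaurin series; it is an illustration under stated assumptions (the function name, the number of terms, and the restriction to moderate $|x|$, where the series is numerically well behaved, are choices made here, not part of the article).

```python
def airy_ai(x, terms=60):
    """Airy function Ai(x) from its Maclaurin series, adequate for moderate |x|.

    Ai(x) = c1 * f(x) - c2 * g(x) with
      f(x) = 1 + x^3/(2*3) + x^6/(2*3*5*6) + ...
      g(x) = x + x^4/(3*4) + x^7/(3*4*6*7) + ...
    """
    c1 = 0.355028053887817  # Ai(0)
    c2 = 0.258819403792807  # -Ai'(0)
    f_term, g_term = 1.0, x
    f_sum, g_sum = f_term, g_term
    for k in range(terms):
        f_term *= x ** 3 / ((3 * k + 2) * (3 * k + 3))
        g_term *= x ** 3 / ((3 * k + 3) * (3 * k + 4))
        f_sum += f_term
        g_sum += g_term
    return c1 * f_sum - c2 * g_sum

if __name__ == "__main__":
    for x in (-4.0, -3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0):
        print(f"{x:5.1f}  {airy_ai(x): .6f}")
```

The printed table changes sign on the negative axis (the oscillatory side) and falls off monotonically on the positive axis (the exponentially damped side), with a smooth transition through zero rather than the sharp edge of the classical caustic.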

The two kinds of degeneracies described in 
sections (1) and (2) may occur simultaneously. This 
happens, for instance, in glory scattering for which 
the cross section, to leading terms in the semiclassi- 
cal expansions, has been obtained by functional 
integration in closed form in terms of Bessel 
functions (Cartier and DeWitt-Morette 2005). 

3. The intersection $U_{a,b}$ is the empty set. There is
no classical solution corresponding to the quantum
transition. This phenomenon, called "tunneling" or
"barrier penetration," is a rich chapter of quantum
physics which can be found in most of the books
listed under "Further reading."


A Multipurpose Tool 


Functional integration provides insight and techniques
to quantum physics not available from the
operator formalism. Just as an example, one can
quote the section “Beyond WKB” which has often
been dismissed in the operator formalism by stating
that "WKB breaks down" in such cases.

The power of functional integration stems from
the power of infinite-dimensional spaces. For
instance, compare the Lagrangian of a system with
its action functional

$$S(x) := \int_T dt\, L(x(t), \dot{x}(t)), \quad x \in X_{a,b}$$

$$L: TM^D \to \mathbb{R}, \qquad S: X_{a,b} \to \mathbb{R} \qquad [113]$$

A classical solution $q$ of the system can be defined
either by a solution of the Euler-Lagrange equation,
together with the boundary conditions dictated by
$q \in X_{a,b}$, or by an extremum of the action functional,
$S'(q) = 0$. The path $q$ is a significant point in
$X_{a,b}$ but it is not isolated and the Hessian $S''(q)$
gives much information on $q$, such as conservation
laws, caustics, tunneling.

A list of applications is beyond the scope of this
article. We treat only two applications, then give in
the "Further reading" section a short list of books
that develop such applications as polarons, phase
transitions, properties of quantum gases, scattering
processes, many-body theory of bosons and fermions,
knot invariants, quantum crystals, quantum
field theory, anomalies, etc.


The Homotopy Theorem for Paths Taking Their 
Values in a Multiply-Connected Space 


The space $X_{a,b}$ of paths $x$

$$x: T \to M^D, \quad x \in X_{a,b}$$

probes the global properties of their ranges $M^D$.
When $M^D$ is multiply connected, $X_{a,b}$ is the sum of
distinct homotopy classes of paths. The integral over
$X_{a,b}$ is a linear combination of integrals over each
homotopy class of paths. The coefficients of this
linear combination are provided by the homotopy
theorem.

The principle of superposition of quantum states
requires the probability amplitude for a given
transition to be a linear combination of probability
amplitudes. It follows that the absolute value of the
probability amplitude for a transition from the state
$a$ at $t_a$ to the state $b$ at $t_b$ has the form

$$|K(b, t_b; a, t_a)| = \left|\sum_\alpha \chi(\alpha)\, K^\alpha(b, t_b; a, t_a)\right| \qquad [114]$$

where $K^\alpha$ is the integral over paths in the same
homotopy class. The homotopy theorem (Laidlaw
and Morette-DeWitt (1971) and Schulman (1971), in
Cartier and DeWitt-Morette (2006)) states that the
set $\{\chi(\alpha)\}$ forms a representation of the fundamental
group of the multiply connected space $M^D$. One
cannot label a homotopy class by an element of the
fundamental group unless one has chosen a point
$c \in M^D$ and a homotopy class for paths going from
$c$ to $a$ and for paths going from $c$ to $b$; in brief,
unless one has chosen a homotopy mesh on $M^D$.
The fundamental group based at $c$ is isomorphic to
the fundamental group based at any other point of
$M^D$ but not canonically so. Therefore, eqn [114] is
only an equality between absolute values of probability
amplitudes. The proof of the homotopy
theorem consists in requiring [114] to be independent
of the chosen homotopy mesh.


Application: Systems of n Indistinguishable
Particles in $\mathbb{R}^D$


In order that there be a one-to-one correspondence
between the system and its configuration space,

$$x: T \to \mathbb{R}^{D,n}/S_n =: \mathbb{R}_N^{D,n}$$

where $S_n$ is the symmetric group of permutations of $n$ objects;
the coincidence points in $\mathbb{R}^{D,n}$ are excluded so that
$S_n$ acts effectively on $\mathbb{R}^{D,n}$. Note that $\mathbb{R}^{1,n}$ is not
connected, but $\mathbb{R}^{2,n}$ is multiply connected. When
$D \geq 3$, $\mathbb{R}^{D,n}$ is simply connected and the fundamental
group of $\mathbb{R}_N^{D,n}$ is isomorphic to $S_n$.

There are only two scalar unitary representations
of $S_n$:

$$\chi^{\mathrm{bose}}: \alpha \in S_n \to 1 \quad \text{for all permutations } \alpha$$

$$\chi^{\mathrm{fermi}}: \alpha \in S_n \to \begin{cases} 1 & \text{for even permutations} \\ -1 & \text{for odd permutations} \end{cases}$$

Therefore, in $\mathbb{R}^3$ there are two different propagators
of indistinguishable particles:

$$K^{\mathrm{bose}} = \sum_\alpha \chi^{\mathrm{bose}}(\alpha)\, K^\alpha \qquad [115]$$

is a symmetric propagator;

$$K^{\mathrm{fermi}} = \sum_\alpha \chi^{\mathrm{fermi}}(\alpha)\, K^\alpha \qquad [116]$$

is an antisymmetric propagator.

The argument leading to the existence of (scalar)
bosons and fermions in $\mathbb{R}^3$ fails in $\mathbb{R}^2$. Statistics
cannot be assigned to particles in $\mathbb{R}^2$; particles
“without” statistics have been called anyons.
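The two scalar representations of $S_n$ and the sums [115]-[116] can be illustrated with a few lines of code. This is a hedged sketch: the function names are hypothetical, permutations are represented as tuples of indices, and the per-class amplitudes $K^\alpha$ are simply numbers supplied by the caller rather than path integrals.

```python
from itertools import permutations

def chi_bose(perm):
    """Trivial character: every permutation is sent to +1."""
    return 1

def chi_fermi(perm):
    """Parity character: +1 for even permutations, -1 for odd ones (counted by inversions)."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def compose(p, q):
    """Composition of permutations: (p o q)(i) = p[q[i]]."""
    return tuple(p[q[i]] for i in range(len(q)))

def propagator(amplitudes, character):
    """Weighted sum of the per-class amplitudes K^alpha, as in [115] and [116]."""
    return sum(character(perm) * amp for perm, amp in amplitudes.items())

if __name__ == "__main__":
    group = list(permutations(range(3)))
    # Both maps are group homomorphisms into {+1, -1}.
    print(all(chi_fermi(compose(p, q)) == chi_fermi(p) * chi_fermi(q)
              for p in group for q in group))
    equal_amplitudes = {perm: 1.0 for perm in group}
    print(propagator(equal_amplitudes, chi_bose), propagator(equal_amplitudes, chi_fermi))
```

With equal amplitudes in every class the symmetric sum adds up while the antisymmetric sum cancels, a toy version of the exclusion principle.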


Application: a Spinning Top 


Schulman's analysis of the Schrödinger equation for
a spinning top (Schulman 1968) motivated the
formulation of the homotopy theorem. Therefore,
Schulman’s results can easily be formulated as an
application of [114].


Application: Instantons (DeWitt 2004)

The homotopy theorem reformulated for functional
integrals applies to the total $\langle \mathrm{out} | \mathrm{in} \rangle$ amplitude of
instantons in Minkowski spacetime.

Scaling Properties of Gaussians 

We rewrite the definition [26] of Gaussian volume
elements as

$$\int_X d\Gamma_{s,G}(x)\, \exp(-2\pi i \langle x', x\rangle) := \exp(-\pi s\, W(x')) \qquad [117]$$

where the covariance $G$ is defined by the variance $W$,

$$W(x') =: \langle x', Gx'\rangle$$

In quantum field theory the definition [26] reads

$$\int_\Phi d\Gamma_G(\varphi)\, \exp(-2\pi i \langle J, \varphi\rangle) := \exp(-\pi s\, W(J)) \qquad [118]$$

where $\varphi$ is a field on spacetime (Minkowski, or
Euclidean) and $J$ is called the source. A Gaussian $\Gamma_G$
can be decomposed into the convolution of any
number of Gaussians. For example, if

$$W = W_1 + W_2, \qquad G = G_1 + G_2 \qquad [119]$$
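then

$$\Gamma_G = \Gamma_{G_1} * \Gamma_{G_2} \qquad [120]$$

Explicitly, in QFT

$$\int_\Phi d\Gamma_G(\varphi)\, \exp(-2\pi i \langle J, \varphi\rangle) = \int_\Phi d\Gamma_{G_2}(\varphi_2) \int_\Phi d\Gamma_{G_1}(\varphi_1)\, \exp(-2\pi i \langle J, \varphi_1 + \varphi_2\rangle) \qquad [121]$$

where

$$\varphi = \varphi_1 + \varphi_2 \qquad [122]$$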

The additive property [119] makes it possible to 
express a covariance G as an integral over an 
independent scale variable. 
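The additive property [119]-[120] has a simple one-dimensional counterpart that can be verified numerically. In the sketch below, which is an illustration and not the article's construction, the density $W^{-1/2}\exp(-\pi x^2/W)$ stands in for a Gaussian of variance $W$ with $s = 1$ (its Fourier transform is $\exp(-\pi W k^2)$); the function names, grid, and integration window are assumptions.

```python
import math

def gaussian_density(x, W):
    """Normalized density whose Fourier transform is exp(-pi W k^2), cf. [117] with s = 1."""
    return math.exp(-math.pi * x * x / W) / math.sqrt(W)

def convolve_at(x, W1, W2, half_width=12.0, steps=4000):
    """Trapezoidal estimate of the convolution (g_W1 * g_W2)(x)."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        y = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * gaussian_density(y, W1) * gaussian_density(x - y, W2)
    return total * h

if __name__ == "__main__":
    W1, W2 = 0.8, 1.7
    for x in (0.0, 0.7, 1.5):
        print(x, convolve_at(x, W1, W2), gaussian_density(x, W1 + W2))
```

The two printed columns agree: convolving Gaussians of variances $W_1$ and $W_2$ gives the Gaussian of variance $W_1 + W_2$, which is what allows a covariance to be built up additively from scale-by-scale contributions.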

Let $\lambda \in [0, \infty[$ be an independent scale variable
(some authors use $\lambda \in [1, \infty[$ and $\lambda^{-1} \in [0, 1[$). A
scale variable has no physical dimension:

$$[\lambda] = 0 \qquad [123]$$

The scaling operator $S_\lambda$ acting on a function $f$ of
length dimension $[f]$ is by definition

$$S_\lambda f(x) := \lambda^{[f]} f(x/\lambda) \qquad [124]$$

the scaling of an interval $[a, b[$ is given by
$S_\lambda [a, b[ = \{s/\lambda \mid s \in [a, b[\}$, that is,

$$S_\lambda [a, b[ = [a/\lambda, b/\lambda[ \qquad [125]$$

The scaling of a functional $F$ is

$$(S_\lambda F)(\varphi) = F(S_\lambda \varphi) \qquad [126]$$
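The definitions [124]-[125] translate directly into code. The sketch below is illustrative only: the function names are hypothetical, the length dimension $[f]$ is passed explicitly as a number, and it is assumed that a scaled function keeps the same length dimension as the original.

```python
def scale_function(f, lam, dim):
    """S_lambda f of [124]: x -> lam**dim * f(x / lam), with dim the length dimension [f]."""
    return lambda x: lam ** dim * f(x / lam)

def scale_interval(a, b, lam):
    """S_lambda [a, b[ of [125], returned as the pair of end points (a / lam, b / lam)."""
    return (a / lam, b / lam)

if __name__ == "__main__":
    f = lambda x: x ** 3 + 1.0
    twice = scale_function(scale_function(f, 2.0, -1), 3.0, -1)  # S_3 applied after S_2
    once = scale_function(f, 6.0, -1)                            # S_6
    print(twice(1.7), once(1.7))
    print(scale_interval(2.0, 8.0, 4.0))
```

Applying $S_2$ and then $S_3$ gives the same function as applying $S_6$, so the scaling operators compose multiplicatively in $\lambda$.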


In order to decompose a covariance into an integral
of scale-dependent contributions we note that a
covariance $G$ is a two-point function [31]. In
quantum field theory [118], the engineering length
dimension of $G$ is twice the field dimension

$$[G] = 2[\varphi] \qquad [127]$$

Let $x, y \in$ spacetime and $G$ be a Laplacian Green
function. One can introduce a scaled (truncated)
Green function

$$G_{[l_0, l[}(x, y) := \int_{l_0}^{l} d^\times s\; S_{s/l_0}\, u(|x - y|) \qquad [128]$$

where

$$[l] = 1, \quad [s] = 1, \quad d^\times s = ds/s \qquad [129]$$

such that

$$\lim_{l_0 = 0,\ l = \infty} G_{[l_0, l[}(x, y) = G(x, y) \qquad [130]$$
Example $G(x, y) = c_D/|x - y|^{D-2}$; then the only
requirement on the function $u$ in [128] is

$$\int_0^\infty d^\times r\; r^{-2[\varphi]}\, u(r) = c_D, \quad [r] = 1 \qquad [131]$$

All objects defined by the scaled covariance [128]
are labeled with the interval $[l_0, l[$. For instance,
a Gaussian volume element $\Gamma_{G_{[l_0, l[}}$ is abbreviated
to $\Gamma_{[l_0, l[}$.


A Coarse-Graining Operator 


The following coarse-graining operator has been
used for constructing a parabolic semigroup equation
in the scaling variable (Brydges et al. 1998):

\[ P_l F := S_{l/l_0} \cdot \left( \Gamma_{[l_0, l[} * F \right) \]  [132]

where the convolution product is by definition

\[ (\Gamma_{[l_0, l[} * F)(\varphi) = \int d\Gamma_{[l_0, l[}(\psi)\, F(\varphi + \psi) \]

The coarse-graining operator P_l rescales the convolution
of a Gaussian volume element Γ_{[l_0,l[} so that
all volume elements entering the construction of the
semigroup renormalization equation are scale
independent.

Some properties of the coarse-graining operator:

• P_{l_2} P_{l_1} = P_{l_2 l_1 / l_0}.

• The scaled eigenfunctions of the coarse-graining
operator are Wick-ordered monomials (Wurm
and Berg 2002)

\[ P_l\, {:}\varphi^n(x){:}_{[l_0,\infty[} = \left( \frac{l}{l_0} \right)^{n[\varphi]} {:}\varphi^n\!\left( \frac{l_0}{l}\, x \right){:}_{[l_0,\infty[} \]  [133]

Note that P_l preserves the scale range.

• Let H be the generator of the coarse-graining
operator

\[ H := \frac{\partial^\times}{\partial l} \bigg|_{l = l_0} P_l \]  [134]


The semigroup renormalization equation (a.k.a.
the flow equation) is

\[ \frac{\partial^\times}{\partial l} P_l F(\varphi) = H P_l F(\varphi), \qquad P_{l_0} F(\varphi) = F(\varphi) \]  [135]

Brydges et al. have applied the coarse-graining
operator to the quantum field theory known as
“λφ⁴” (more precisely the Wick-ordered Lagrangian
of λφ⁴). The flow equation [135] plays the role of
the “β-function” equation in perturbative quantum
field theory.


Functional Integrals in Quantum Field Theory 


Functional integrals in quantum field theory have
been modeled to some extent on path integrals in
quantum mechanics: mutatis mutandis, the definition
[23] of Gaussian volume elements, the diagram
expansion [30], the property [36] of linear maps,
semiclassical expansions [87], the homotopy theorem
[114], and the scaling equations [135] apply to
functional integrals in quantum field theory. The
time ordering encoded in a path integral becomes a
chronological ordering dictated by light cones in
functional integrals of fields on Minkowski spacetime.

The fundamental difference between quantum 
mechanics (systems with a finite number of degrees 
of freedom) and quantum field theory (systems with 
an infinite number of degrees of freedom) can be 
said to be “radiative corrections.” In quantum field 
theory, the concept of “particle” is intrinsically 
associated to the concept of “field.” A particle is 
affected by its field. Its mass and charge are 
modified by the surrounding fields, namely its own 
and other fields interacting with it. One speaks of
“bare mass” and “renormalized mass” when the
bare mass is renormalized by surrounding fields.
Computing radiative corrections is a delicate proce- 
dure because the Green functions G defined by [25] 
are singular. Regularization techniques have been 
developed for handling singular Green functions. 

Particles in quantum mechanics are simply particles, 
and bosons and fermions can be treated separately. 
Not so in quantum field theory. Therefore, the 
configuration space in quantum field theory is a 
supermanifold. For functional integrals in this theory, 
we refer the reader to the “Further reading” section, in 
particular to the book of A Das for an introduction, to 
the book of B DeWitt for an in-depth study, and to the 
book of K Fujikawa and H Suzuki for applications to 
quantum anomalies. 


Concluding Remarks 


The key issue in functional integration is the domain
of integration, that is, a function space. This infinite-
dimensional space, say X, cannot be considered as
the limit n = ∞ of ℝ^n.

Concepts of ℝ^D stated without reference to D are
likely to be meaningful on X. Other approaches
which have been used for exploring X are

• projective systems of finite-dimensional spaces
coherently defined on X (DeWitt-Morette et al.
1979),

• one-parameter curves on X (Figure 1), and

• projecting X on finite-dimensional spaces (cylindrical
integrals).


Functional integration has advanced our under- 
standing of infinite-dimensional spaces, and like all 
good mathematical tools, it improves with usage. 


See also: BRST Quantization; Euclidean Field Theory; 
Feynman Path Integrals; Infinite-Dimensional 
Hamiltonian Systems; Knot Theory and Physics; 
Malliavin Calculus; Path Integrals in Noncommutative 
Geometry; Quantum Mechanics: Foundations; Stationary 
Phase Approximation; Topological Sigma Models.


Introduction


Several asymptotic problems in the calculus of
variations lead to the following question: given a
sequence F_k of functionals, defined on a suitable
function space, does there exist a functional F such
that the solutions of the minimum problems for F_k
converge to the solutions of the corresponding
minimum problems for F? Γ-convergence, introduced
by Ennio De Giorgi and his collaborators in 1975, and
developed as a powerful tool to attack a wide range of
applied problems, provides a unified answer to this
kind of question.


Definition and Main Properties 


Let U be a topological space with a countable
base and let F_k be a sequence of functions defined
on U with values in the extended real line
ℝ̄ := ℝ ∪ {−∞, +∞}. We say that F_k Γ-converges
to a function F: U → ℝ̄, or that F is the Γ-limit of
F_k, if for every u ∈ U the following conditions are
satisfied:

1. For every sequence u_k converging to u in U we
have

\[ F(u) \le \liminf_{k \to \infty} F_k(u_k) \]

2. There exists a sequence u_k converging to u in U
such that

\[ F(u) = \lim_{k \to \infty} F_k(u_k) \]

Property (1) appears to be a variant of the usual
definition of lower semicontinuity. Property (2)
requires the existence, for every u ∈ U, of a “recovery
sequence,” which provides an approximation of the
value of F at u by means of values attained by F_k
near u.
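
A standard illustration, not taken from the text above, is the sequence F_k(u) = sin(ku) on U = ℝ. It has no pointwise limit, yet it Γ-converges to the constant function −1, because near every point u there are points where F_k is close to −1 once k is large. The sketch below approximates the infimum of F_k over a small neighbourhood of u; the function names and the sampling parameters are hypothetical choices.

```python
import math

def local_infimum(F, u, delta, samples=4001):
    """Approximate the infimum of F over [u - delta, u + delta] by uniform sampling."""
    step = 2.0 * delta / (samples - 1)
    return min(F(u - delta + i * step) for i in range(samples))

def F_k(k):
    """Return the k-th function of the illustrative sequence F_k(u) = sin(k u)."""
    return lambda u: math.sin(k * u)

u = 0.3
delta = 0.01
# For small k the neighbourhood is too short to see a full oscillation;
# for large k the local infimum approaches -1, the value of the Gamma-limit at u.
local_infs = {k: local_infimum(F_k(k), u, delta) for k in (10, 100, 1000)}
```

The local infima play the role of the values F_k(u_k) along a recovery sequence: the points where they are attained converge to u as the neighbourhood shrinks.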


Γ-Convergence and Homogenization


It follows immediately from the definition that,
if F_k Γ-converges to F, then F_k + G Γ-converges to
F + G for every continuous function G: U → ℝ.

The first general property of Γ-limits is lower
semicontinuity: if F_k Γ-converges to F, then F is
lower semicontinuous on U; that is,

\[ F(u) \le \liminf_{k \to \infty} F(u_k) \]

for every u ∈ U and for every sequence u_k converging
to u in U.

Another important property of Γ-convergence is
compactness: every sequence F_k has a Γ-convergent
subsequence.

For every k assume that the function F_k has a
minimum point u_k. The following property is the
link between Γ-convergence and convergence of
minimizers: if F_k Γ-converges to F and u_k converges
to u, then u is a minimum point of F and
F_k(u_k) converges to F(u), hence

\[ \min_{v \in U} F(v) = \lim_{k \to \infty} \min_{v \in U} F_k(v) \]  [1]

Under suitable coerciveness assumptions, the
convergence of u_k is obtained by a compactness
argument. We recall that a sequence of functions F_k
is said to be equicoercive if for every t ∈ ℝ there
exists a compact set K_t (independent of k) such that

\[ \{ u \in U : F_k(u) \le t \} \subset K_t \]  [2]

for every k.

If F_k is equicoercive and Γ-converges to F, the
previous result implies that [1] holds. If, in addition,
F is not identically +∞, then the sequence u_k of
minimizers considered above has a subsequence u_{k_j}
which converges to a minimizer u of F. The whole
sequence u_k converges to u whenever F has a unique
minimizer u.
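
The convergence of minimum values [1] can be observed on a simple example that is not part of the original text: F_k(u) = u² + sin(ku) on U = ℝ. Since u² is continuous and sin(ku) Γ-converges to −1, the Γ-limit is F(u) = u² − 1, whose minimum value is −1. The grid search below is only a rough numerical sketch with hypothetical helper names.

```python
import math

def F_k(k, u):
    """Illustrative sequence F_k(u) = u**2 + sin(k u) on U = R."""
    return u * u + math.sin(k * u)

def gamma_limit(u):
    """Gamma-limit of the sequence above: F(u) = u**2 - 1."""
    return u * u - 1.0

def grid_min(F, lo, hi, samples):
    """Minimum value of F over a uniform grid on [lo, hi]."""
    step = (hi - lo) / (samples - 1)
    return min(F(lo + i * step) for i in range(samples))

# The sublevel sets {F_k <= t} lie in [-sqrt(t + 1), sqrt(t + 1)] for every k,
# so the sequence is equicoercive and a bounded search interval suffices.
minima = {k: grid_min(lambda u, k=k: F_k(k, u), -2.0, 2.0, 200001) for k in (1, 10, 100)}
limit_min = grid_min(gamma_limit, -2.0, 2.0, 200001)
```

The computed minima approach the minimum of the Γ-limit as k grows, while the minimizers, located near −π/(2k), converge to the unique minimizer u = 0 of F.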

In many applications to the calculus of variations,
U is the Lebesgue space L^p(Ω; ℝ^m), with Ω a
bounded open subset of ℝ^n and 1 < p < +∞, but
the effective domains of the functionals F_k, defined
as {u ∈ U : F_k(u) ∈ ℝ}, are often contained in the
Sobolev space W^{1,p}(Ω; ℝ^m), composed of all functions
u ∈ L^p(Ω; ℝ^m) whose distributional gradient
∇u belongs to L^p(Ω; ℝ^{m×n}). When one considers
homogeneous Dirichlet boundary conditions, the
effective domains of the functionals F_k are often
contained in the smaller Sobolev space W_0^{1,p}(Ω; ℝ^m),
composed of all functions of W^{1,p}(Ω; ℝ^m) which
vanish on the boundary ∂Ω, technically defined as
the closure of C_c^∞(Ω; ℝ^m) in W^{1,p}(Ω; ℝ^m).

In this case, the equicoerciveness condition [2] can
be obtained by using Rellich's theorem, which
asserts that the natural embedding of W_0^{1,p}(Ω; ℝ^m)
into L^p(Ω; ℝ^m) is compact. Therefore, a sequence of
functionals F_k defined on L^p(Ω; ℝ^m) is equicoercive
if there exists a constant α > 0 such that

\[ F_k(u) \ge \alpha \int_\Omega |\nabla u|^p \, dx \]

for every u ∈ W_0^{1,p}(Ω; ℝ^m), while F_k(u) = +∞ for
every u ∉ W_0^{1,p}(Ω; ℝ^m).


Homogenization Problems 


Many problems for composite materials (fibered or 
stratified materials, porous media, materials with 
many small holes or fissures, etc.) lead to the study 
of mathematical models with many interacting scales, 
which may differ by several orders of magnitude. 
From a microscopic viewpoint, the systems considered 
are highly inhomogeneous. Typically, in such com- 
posite materials, the physical parameters (such as 
electric and thermal conductivity, elasticity coeffi- 
cients, etc.) are discontinuous and oscillate between 
the different values characterizing each component. 

When these components are intimately mixed, 
these parameters oscillate very rapidly and the 
microscopic structure becomes more and more com- 
plex. On the other hand, the material becomes quite 
simple from a macroscopic point of view, and it tends 
to behave like an ideal homogeneous material, called 
“homogenized material.” The purpose of the mathe- 
matical theory of homogenization is to describe this 
limit process when the parameters which describe the 
fineness of the microscopic structure tend to zero. 

Homogenization problems are often treated by 
studying the partial differential equations that 
govern the physical properties under investigation. 
Due to the small scale of the microscopic structure, 
these equations contain some small parameters. The 
mathematical problem consists then in the study of 
the limit of the solutions of these equations when the 
parameters tend to zero. Γ-convergence is a very
useful tool to obtain homogenization results for 
systems governed by variational principles, which 
are the only ones described in this article. 


Let Q := (−1/2, 1/2)^n be the open unit cube in ℝ^n
centered at 0. We say that a function u defined on ℝ^n is
Q-periodic if, for every z ∈ ℝ^n with integer coordinates,
we have u(x + z) = u(x) for every x ∈ ℝ^n.

Let f: ℝ^n × ℝ^{m×n} → [0,+∞) be a function such
that x ↦ f(x,ξ) is measurable and Q-periodic on ℝ^n
for every ξ ∈ ℝ^{m×n} and ξ ↦ f(x,ξ) is convex on ℝ^{m×n}
for every x ∈ ℝ^n. Given a bounded open set Ω ⊂ ℝ^n
and a constant p > 1, let F_ε: L^p(Ω; ℝ^m) → [0,+∞]
be the family of functionals defined by

\[ F_\varepsilon(u) := \begin{cases} \int_\Omega f\!\left( \frac{x}{\varepsilon}, \nabla u \right) dx & \text{if } u \in W_0^{1,p}(\Omega; \mathbb{R}^m) \\ +\infty & \text{otherwise} \end{cases} \]

In the applications to composite materials, the functional
F_ε represents the energy of the portion of the
material occupying the domain Ω. The fact that the
energy density depends on x/ε reflects the ε-periodic
structure of the material, which implies that the energy
density oscillates faster and faster as ε → 0.

Assume that there exist two constants β ≥ α > 0
such that

\[ \alpha |\xi|^p \le f(x, \xi) \le \beta (1 + |\xi|^p) \]  [3]

for every x ∈ ℝ^n and every ξ ∈ ℝ^{m×n}. Then for every
sequence ε_k → 0 the functionals F_{ε_k} Γ-converge to
the functional F_hom: L^p(Ω; ℝ^m) → [0,+∞] defined by

\[ F_{\mathrm{hom}}(u) := \begin{cases} \int_\Omega f_{\mathrm{hom}}(\nabla u)\, dx & \text{if } u \in W_0^{1,p}(\Omega; \mathbb{R}^m) \\ +\infty & \text{otherwise} \end{cases} \]  [4]

The integrand f_hom: ℝ^{m×n} → [0,+∞) is obtained by
solving the cell problem

\[ f_{\mathrm{hom}}(\xi) := \min_{w \in W^{1,p}_{\mathrm{per}}(Q; \mathbb{R}^m)} \int_Q f(x, \xi + \nabla w)\, dx \]  [5]

where W^{1,p}_per(Q; ℝ^m) denotes the space of functions
w ∈ W^{1,p}_loc(ℝ^n; ℝ^m) which are Q-periodic.

The integrand f_hom is always convex and satisfies
[3]. If it is strictly convex, the basic properties of
Γ-convergence imply that for every g ∈ L^q(Ω; ℝ^m),
with 1/p + 1/q = 1, the solutions u_ε of the minimum
problems

\[ \min_{v \in W_0^{1,p}(\Omega; \mathbb{R}^m)} \int_\Omega \left[ f\!\left( \frac{x}{\varepsilon}, \nabla v \right) - g(x) v \right] dx \]  [6]

converge in L^p(Ω; ℝ^m), as ε → 0, to the solution u of
the minimum problem

\[ \min_{v \in W_0^{1,p}(\Omega; \mathbb{R}^m)} \int_\Omega \left[ f_{\mathrm{hom}}(\nabla v) - g(x) v \right] dx \]  [7]

Similar results can be proved for nonhomogeneous
Dirichlet boundary conditions, as well as for
Neumann boundary conditions.

In the special case m = 1, p = 2, and

\[ f(x, \xi) = \frac{1}{2} \sum_{i,j=1}^{n} a_{ij}(x)\, \xi_j \xi_i \]  [8]

with a_ij(x) Q-periodic, the function f_hom takes the
form

\[ f_{\mathrm{hom}}(\xi) = \frac{1}{2} \sum_{i,j=1}^{n} a^{\mathrm{hom}}_{ij}\, \xi_j \xi_i \]

for suitable constant coefficients a^hom_ij.

By considering the Euler equations of the problems
[6] and [7] in this special case, from the
previous result we obtain the homogenization
theorem for symmetric elliptic operators in divergence
form, which asserts that for every g ∈ L²(Ω)
the solutions u_ε of the Dirichlet problems

\[ -\sum_{i,j=1}^{n} D_i \left( a_{ij}\!\left( \frac{x}{\varepsilon} \right) D_j u_\varepsilon(x) \right) = g(x) \ \text{on } \Omega, \qquad u_\varepsilon(x) = 0 \ \text{on } \partial\Omega \]

converge in L²(Ω) to the solution u of the Dirichlet
problem

\[ -\sum_{i,j=1}^{n} a^{\mathrm{hom}}_{ij} D_i D_j u(x) = g(x) \ \text{on } \Omega, \qquad u(x) = 0 \ \text{on } \partial\Omega \]

An extensive literature is devoted to precise
estimates of the homogenized coefficients a^hom_ij,
depending on various structure conditions on the
periodic coefficients a_ij(x). Some of these estimates
are based on a clever use of the variational
formula [5].

Explicit formulas for a^hom_ij are known in the case
of layered materials, which correspond to the case
where ℝ^n is periodically partitioned into parallel
layers on which the coefficients a_ij(x) take constant
values.
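
In the direction normal to the layers, the cell problem [5] reduces to a one-dimensional minimization, and the homogenized coefficient is the harmonic mean of the layer values. The sketch below illustrates this well-known special case for a scalar coefficient and layers of equal thickness; the function names and the numerical values are hypothetical.

```python
def homogenized_coefficient_1d(a_values):
    """Harmonic mean of the layer coefficients (layers of equal thickness assumed)."""
    return len(a_values) / sum(1.0 / a for a in a_values)

def cell_energy(a_values, xi, corrector_slopes):
    """Discrete cell energy: average over the cell of (1/2) a (xi + w')**2."""
    n = len(a_values)
    return sum(0.5 * a * (xi + d) ** 2 for a, d in zip(a_values, corrector_slopes)) / n

def optimal_corrector_slopes(a_values, xi):
    """Slopes w' with zero mean (periodicity of w) that make the flux a (xi + w') constant."""
    flux = homogenized_coefficient_1d(a_values) * xi
    return [flux / a - xi for a in a_values]

layers = [1.0, 3.0]   # two layers of equal thickness (hypothetical values)
xi = 1.0
a_hom = homogenized_coefficient_1d(layers)
slopes = optimal_corrector_slopes(layers, xi)
energy_optimal = cell_energy(layers, xi, slopes)
energy_no_corrector = cell_energy(layers, xi, [0.0, 0.0])
```

With the optimal periodic corrector the cell energy equals a_hom ξ²/2, whereas the trial corrector w = 0 only yields the arithmetic mean, which is an upper bound.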

Easy examples show that, even if the composite
material is isotropic at a microscopic level (i.e.,
a_ij(x) = a(x)δ_ij for some scalar function a(x)), the
homogenized material can be anisotropic (i.e.,
a^hom_ij ≠ a δ_ij), due to the anisotropy of the periodic
function a(x) which describes the microscopic
distribution of the different components of the
composite material.

In the vector case m > 1, the convexity hypothesis
on ξ ↦ f(x,ξ) is not satisfied by the most interesting
functionals related to nonlinear elasticity. If ξ ↦ f(x,ξ)
is not convex, one can still prove that F_{ε_k} Γ-converges
to a functional F_hom: L^p(Ω; ℝ^m) → [0,+∞] of the
form [4], but this time f_hom: ℝ^{m×n} → [0,+∞) cannot
be obtained by solving a problem in the unit cell.
Instead, it is given by the asymptotic formula

\[ f_{\mathrm{hom}}(\xi) := \lim_{R \to +\infty} \frac{1}{R^n} \min_{w \in W_0^{1,p}(Q_R; \mathbb{R}^m)} \int_{Q_R} f(x, \xi + \nabla w)\, dx \]

where Q_R := (−R/2, R/2)^n is the open cube of side
R centered at 0. Similar formulas can be obtained
for quasiperiodic integrands f and for stochastic
homogenization problems.

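In the nonperiodic case one can prove that, if
g_ε: ℝ^n × ℝ^{m×n} → [0,+∞) are arbitrary Borel functions
satisfying [3], with constants independent of ε,
and G_ε: L^p(Ω; ℝ^m) → [0,+∞] are defined by

\[ G_\varepsilon(u) := \begin{cases} \int_\Omega g_\varepsilon(x, \nabla u)\, dx & \text{if } u \in W_0^{1,p}(\Omega; \mathbb{R}^m) \\ +\infty & \text{otherwise} \end{cases} \]

then there exists a sequence ε_k → 0 such that the
functionals G_{ε_k} Γ-converge to a functional G of the
form

\[ G(u) := \begin{cases} \int_\Omega g(x, \nabla u)\, dx & \text{if } u \in W_0^{1,p}(\Omega; \mathbb{R}^m) \\ +\infty & \text{otherwise} \end{cases} \]

with g satisfying [3].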

In this case, no easy formula provides the integrand
g(x,ξ) in terms of simple operations on the integrands
g_{ε_k}(x,ξ). The indirect connection between these
integrands can be obtained by introducing the
functions M_ε(x,ξ,ρ) defined, for x ∈ Ω, ξ ∈ ℝ^{m×n},
and 0 < ρ < dist(x, ∂Ω), by

\[ M_\varepsilon(x, \xi, \rho) := \min_{w \in W_0^{1,p}(B(x,\rho); \mathbb{R}^m)} \int_{B(x,\rho)} g_\varepsilon(y, \xi + \nabla w)\, dy \]

where B(x,ρ) is the open ball with center x and radius
ρ. These functions describe the local behavior of the
integrands g_ε in some special minimum problems. The
sequence G_{ε_k} Γ-converges to G if and only if

\[ g(x, \xi) = \liminf_{\rho \to 0} \liminf_{k \to \infty} \frac{M_{\varepsilon_k}(x, \xi, \rho)}{|B(x, \rho)|} = \limsup_{\rho \to 0} \limsup_{k \to \infty} \frac{M_{\varepsilon_k}(x, \xi, \rho)}{|B(x, \rho)|} \]

for almost every x ∈ Ω and every ξ ∈ ℝ^{m×n}.
Similar results have also been proved for integral 
functionals of the form 


6,0 = norm if u € Wi? (Q; R”) 
+00 otherwise 


under suitable structure conditions for the integrands
g_ε.


Perforated Domains


In some homogenization problems, the integrand is
fixed, but the domain depends on a small parameter ε
and its boundary becomes more and more fragmented
as ε → 0. A typical example is given by periodically
perforated domains with small holes. Given a
bounded open set Ω ⊂ ℝ^n and a compact set K ⊂ Q,
both with smooth boundaries, for every ε > 0 we
consider the perforated sets

\[ \Omega_\varepsilon := \Omega \setminus \bigcup_{z \in Z_\varepsilon} (\varepsilon z + \varepsilon K) \]  [9]

where Z_ε is the set of vectors z ∈ ℝ^n with integer
coordinates such that εz + εQ ⊂ Ω.

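Given g ∈ L²(Ω), let F_ε: L²(Ω) → [0,+∞] be the
functionals defined by

\[ F_\varepsilon(u) := \begin{cases} \int_{\Omega_\varepsilon} \left[ \frac{1}{2} |\nabla u|^2 - g u \right] dx & \text{if } u \in W_0^{1,2}(\Omega) \\ +\infty & \text{otherwise} \end{cases} \]  [10]

Minimizing [10] is equivalent to solving the mixed
problems

\[ -\Delta u_\varepsilon = g \ \text{on } \Omega_\varepsilon, \qquad u_\varepsilon = 0 \ \text{on } \partial\Omega, \qquad \frac{\partial u_\varepsilon}{\partial \nu} = 0 \ \text{on } \partial\Omega_\varepsilon \setminus \partial\Omega \]  [11]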


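The homogenization formula [5] is still valid, with
minor modifications. It leads to a matrix of
coefficients a^hom_ij such that

\[ \sum_{i,j=1}^{n} a^{\mathrm{hom}}_{ij}\, \xi_j \xi_i = \min_{w \in W^{1,2}_{\mathrm{per}}(Q)} \int_{Q \setminus K} |\xi + \nabla w|^2 \, dx \]

for every ξ ∈ ℝ^n. For every sequence ε_k → 0 the Γ-limit
of the functionals F_{ε_k} is the functional F: L²(Ω) →
[0,+∞] defined by

\[ F(u) := \begin{cases} \int_\Omega \left[ \frac{1}{2} \sum_{i,j=1}^{n} a^{\mathrm{hom}}_{ij} D_j u \, D_i u - m g u \right] dx & \text{if } u \in W_0^{1,2}(\Omega) \\ +\infty & \text{otherwise} \end{cases} \]

where m := |Q \ K| is the volume fraction of the
sets Ω_ε.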

Since a slight modification of the functionals F_ε
satisfies an equicoerciveness condition, it follows
from the basic properties of Γ-convergence that the
solutions u_ε of the mixed problems [11] in the
perforated domains [9], extended to the holes so
that u_ε are harmonic on Ω \ Ω_ε and u_ε ∈ W_0^{1,2}(Ω),
converge in L²(Ω) to the solution u of the Dirichlet
problem

\[ -\sum_{i,j=1}^{n} a^{\mathrm{hom}}_{ij} D_i D_j u = m g \ \text{on } \Omega, \qquad u = 0 \ \text{on } \partial\Omega \]

Therefore, the asymptotic effect of the small holes
with Neumann boundary condition is a change in
the coefficients of the elliptic equation.

In the case of Dirichlet boundary conditions, it is
interesting to consider perforated domains with
holes of a different size, namely

\[ \Omega_\varepsilon := \Omega \setminus \bigcup_{z \in Z_\varepsilon} \left( \varepsilon z + \varepsilon^{n/(n-2)} K \right) \]  [12]

with ε^{n/(n−2)} replaced by exp(−1/ε²) if n = 2, while
the case n = 1 gives only trivial results.

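Given g ∈ L²(Ω), let G_ε: L²(Ω) → [0,+∞] be the
functionals defined by

\[ G_\varepsilon(u) := \begin{cases} \int_{\Omega_\varepsilon} \left[ \frac{1}{2} |\nabla u|^2 - g u \right] dx & \text{if } u \in W_0^{1,2}(\Omega_\varepsilon) \\ +\infty & \text{otherwise} \end{cases} \]  [13]

Minimizing [13] is equivalent to solving the Dirichlet
problems

\[ -\Delta u_\varepsilon = g \ \text{on } \Omega_\varepsilon, \qquad u_\varepsilon = 0 \ \text{on } \partial\Omega_\varepsilon \]  [14]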


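For every sequence ε_k → 0 the Γ-limit of the
functionals G_{ε_k} is the functional G: L²(Ω) → [0,+∞]
defined by

\[ G(u) := \begin{cases} \int_\Omega \left[ \frac{1}{2} |\nabla u|^2 + \frac{c}{2} u^2 - g u \right] dx & \text{if } u \in W_0^{1,2}(\Omega) \\ +\infty & \text{otherwise} \end{cases} \]

where, for n ≥ 3,

\[ c := \mathrm{cap}(K) := \min_{w = 1 \text{ on } K} \int_{\mathbb{R}^n} |\nabla w|^2 \, dx \]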


Since a slight modification of the functionals G_ε
satisfies an equicoerciveness condition, it follows
from the basic properties of Γ-convergence that the
solutions u_ε of the Dirichlet problems [14] in the
perforated domains [12], extended as zero on
Ω \ Ω_ε, converge in L²(Ω) to the solution u of the
Dirichlet problem

\[ -\Delta u + c u = g \ \text{on } \Omega, \qquad u = 0 \ \text{on } \partial\Omega \]  [15]

In the electrostatic interpretation of these problems,
the boundary ∂Ω_ε is a conductor kept at potential
zero. The extra term cu in [15] is due to the electric
charges induced on ∂Ω_ε by the charge distribution g.
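
A heuristic bookkeeping argument, offered here only as a sketch, explains why the hole size ε^{n/(n−2)} is the critical one for n ≥ 3. There are about ε^{−n} holes per unit volume, and the capacity scales as cap(rK) = r^{n−2} cap(K), so the total capacity per unit volume stays equal to cap(K) for every ε. The function names are hypothetical; the value 4π is the capacity of the unit ball in ℝ³ for the energy ∫|∇w|² dx.

```python
import math

def scaled_capacity(cap_K, r, n):
    """Capacity of the rescaled set r*K in R^n (n >= 3): cap(r K) = r**(n - 2) * cap(K)."""
    return r ** (n - 2) * cap_K

def capacity_density(eps, cap_K, n):
    """Total capacity of the holes per unit volume for the hole size eps**(n/(n-2))."""
    hole_size = eps ** (n / (n - 2))
    holes_per_unit_volume = eps ** (-n)
    return holes_per_unit_volume * scaled_capacity(cap_K, hole_size, n)

cap_unit_ball = 4.0 * math.pi   # capacity of the unit ball in R^3
densities = [capacity_density(eps, cap_unit_ball, 3) for eps in (0.1, 0.01, 0.001)]
```

Smaller holes would give a vanishing capacity density and no extra term in the limit, while larger holes would force the limit solution to vanish; this is consistent with the coefficient c = cap(K) in [15].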

These results on Dirichlet and Neumann boundary 
conditions have been extended to more general 
functionals and also to a wide class of nonperiodic 
distributions of small holes. 


Dimension Reduction Problems 


In the study of thin elastic structures, like plates, 
membranes, rods, and strings, it is customary to 
approximate the mechanical behavior of a thin three- 
dimensional body by an effective theory for two- or 
one-dimensional elastic bodies. Γ-convergence provides
a useful tool for a rigorous deduction of the lower- 
dimensional theory. 

Let us focus on the derivation of plate theory from
three-dimensional finite elasticity. The reference
configuration of the thin three-dimensional elastic
body is a cylinder of the form

\[ \Omega_\varepsilon := S \times \left( -\frac{\varepsilon}{2}, \frac{\varepsilon}{2} \right) \]

where ε > 0 and S is a bounded open subset of ℝ²
with smooth boundary. We assume that the body is
hyperelastic, with stored elastic energy

\[ \int_{\Omega_\varepsilon} W(\nabla u)\, dx \]

where u: Ω_ε → ℝ³ is the deformation. The energy
density W: ℝ^{3×3} → [0,+∞], depending on the
material, is continuous and frame indifferent; that
is, W(RF) = W(F) for every rotation R and every
F ∈ ℝ^{3×3}, where RF denotes the usual product of
3×3 matrices. We assume that W vanishes on the
set SO(3) of rotations, is of class C² in a
neighborhood of SO(3), and satisfies the inequality

\[ W(F) \ge \alpha \, \mathrm{dist}^2(F, SO(3)) \quad \text{for every } F \in \mathbb{R}^{3 \times 3} \]  [16]

with a constant α > 0.

Plate theory is obtained in the limit as ε → 0 when
the densities of the volume forces applied to the
body have the form ε² f(x₁, x₂), with f ∈ L²(S; ℝ³).
We assume that f is balanced; that is,

\[ \int_{\Omega_\varepsilon} f \, dx = 0, \qquad \int_{\Omega_\varepsilon} x \wedge f \, dx = 0 \]

Stable equilibria are then obtained by minimizing
the functionals

\[ \int_{\Omega_\varepsilon} \left[ W(\nabla u) - \varepsilon^2 f \cdot u \right] dx \]  [17]

on W^{1,2}(Ω_ε; ℝ³).

To study the behavior of [17] as ε → 0, it is
convenient to change variables, so that the scaled
deformations v(x₁, x₂, x₃) := u(x₁, x₂, εx₃) are
defined on the same domain

\[ \Omega := S \times \left( -\frac{1}{2}, \frac{1}{2} \right) \]

The scaled energy density W_ε: ℝ^{3×3} → [0,+∞] is
then defined as

\[ W_\varepsilon(F_1 | F_2 | F_3) := W\!\left( F_1 \,\Big|\, F_2 \,\Big|\, \frac{1}{\varepsilon} F_3 \right) \]

where (F₁|F₂|F₃) denotes the 3×3 matrix with
columns F₁, F₂, and F₃. This implies that

\[ \int_{\Omega_\varepsilon} \left[ W(\nabla u) - \varepsilon^2 f \cdot u \right] dx = \varepsilon \int_\Omega \left[ W_\varepsilon(\nabla v) - \varepsilon^2 f \cdot v \right] dx \]

The asymptotic behavior of the minimizers of
these functionals can be obtained from the knowledge
of the Γ-limit of the functionals
F_ε: L²(Ω; ℝ³) → [0,+∞] defined by

\[ F_\varepsilon(v) := \begin{cases} \frac{1}{\varepsilon^2} \int_\Omega W_\varepsilon(\nabla v)\, dx & \text{if } v \in W^{1,2}(\Omega; \mathbb{R}^3) \\ +\infty & \text{otherwise} \end{cases} \]

Let us fix a sequence ε_k → 0. The Γ-limit of F_{ε_k}
turns out to be finite on the set X(S; ℝ³) of all
isometric embeddings of S into ℝ³ of class W^{2,2}; that
is, v ∈ X(S; ℝ³) if and only if v ∈ W^{2,2}(S; ℝ³) and
(∇v)^T ∇v = I a.e. on S. The elements of X(S; ℝ³) will
often be regarded as maps from Ω into ℝ³,
independent of x₃.

To describe the Γ-limit, we introduce the quadratic
form Q₃ defined on ℝ^{3×3} by

\[ Q_3(F) := \frac{1}{2} D^2 W(I)[F, F] \]

which is the density of the linearized energy for the
three-dimensional problem, and the quadratic form Q₂
defined on the space of symmetric 2×2 matrices by

\[ Q_2 \begin{pmatrix} a_{11} & a_{12} \\ a_{12} & a_{22} \end{pmatrix} := \min_{(b_1, b_2, b_3) \in \mathbb{R}^3} Q_3 \begin{pmatrix} a_{11} & a_{12} & b_1 \\ a_{12} & a_{22} & b_2 \\ b_1 & b_2 & b_3 \end{pmatrix} \]

The Γ-limit of F_{ε_k} is the functional
F: L²(Ω; ℝ³) → [0,+∞] defined by

\[ F(v) := \begin{cases} \frac{1}{12} \int_S Q_2(A)\, dx & \text{if } v \in X(S; \mathbb{R}^3) \\ +\infty & \text{otherwise} \end{cases} \]

where A(x₁, x₂) denotes the second fundamental
form of v; that is,

\[ A_{ij} := -D_i D_j v \cdot \nu \]  [18]

with normal vector ν := D₁v ∧ D₂v.
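
The passage from Q₃ to Q₂ can be made concrete for an isotropic material. The sketch below assumes, purely for illustration, that Q₃(F) = μ|sym F|² + (λ/2)(tr F)² with Lamé-type constants μ and λ; in that case the minimum over (b₁, b₂, b₃) is attained at b₁ = b₂ = 0 and b₃ = −λ tr A/(2μ + λ). The function names are hypothetical, and the grid search serves only as a sanity check.

```python
def q3_isotropic(F, mu, lam):
    """Assumed isotropic quadratic form Q3(F) = mu*|sym F|^2 + (lam/2)*(tr F)^2."""
    sym = [[0.5 * (F[i][j] + F[j][i]) for j in range(3)] for i in range(3)]
    norm_sq = sum(sym[i][j] ** 2 for i in range(3) for j in range(3))
    trace = F[0][0] + F[1][1] + F[2][2]
    return mu * norm_sq + 0.5 * lam * trace ** 2

def bordered(A, b1, b2, b3):
    """Symmetric 3x3 matrix obtained by bordering the 2x2 matrix A with (b1, b2, b3)."""
    return [[A[0][0], A[0][1], b1],
            [A[1][0], A[1][1], b2],
            [b1, b2, b3]]

def q2_isotropic(A, mu, lam):
    """Q2(A) computed from the stationarity conditions in (b1, b2, b3)."""
    b3 = -lam * (A[0][0] + A[1][1]) / (2.0 * mu + lam)
    return q3_isotropic(bordered(A, 0.0, 0.0, b3), mu, lam)

def q2_grid_search(A, mu, lam, points=21, half_width=2.0):
    """Crude minimization of Q3 over a grid of (b1, b2, b3) values."""
    step = 2.0 * half_width / (points - 1)
    grid = [-half_width + i * step for i in range(points)]
    return min(q3_isotropic(bordered(A, b1, b2, b3), mu, lam)
               for b1 in grid for b2 in grid for b3 in grid)

A = [[1.0, 0.0], [0.0, 1.0]]
mu, lam = 1.0, 2.0
q2_exact = q2_isotropic(A, mu, lam)
q2_search = q2_grid_search(A, mu, lam)
```

Under the same assumption the closed form is Q₂(A) = μ|A|² + (λμ/(2μ + λ))(tr A)², which gives the value 4 for the data above.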

The equicoerciveness of the functionals F_ε in
L²(Ω; ℝ³) is not trivial for this problem: it follows
from [16] through a very deep geometric rigidity
estimate which generalizes Korn's inequality
(see Friesecke et al. (2002)). The basic properties of
Γ-convergence imply that

\[ \min_{u \in W^{1,2}(\Omega_\varepsilon; \mathbb{R}^3)} \int_{\Omega_\varepsilon} \left[ W(\nabla u) - \varepsilon^2 f \cdot u \right] dx = \varepsilon^3 \min_{v \in X(S; \mathbb{R}^3)} \int_S \left[ \frac{1}{12} Q_2(A) - f \cdot v \right] dx' + o(\varepsilon^3) \]

with x' := (x₁, x₂) and A defined by [18].

For every ε > 0 let u_ε be a minimizer of [17] and let
v_ε(x₁, x₂, x₃) := u_ε(x₁, x₂, εx₃). Then the basic properties
of Γ-convergence imply that there exists a sequence
ε_k → 0 such that v_{ε_k}(x₁, x₂, x₃) converges in L²(Ω; ℝ³)
to a solution v(x₁, x₂) of the minimum problem

\[ \min_{v \in X(S; \mathbb{R}^3)} \int_S \left[ \frac{1}{12} Q_2(A) - f \cdot v \right] dx' \]  [19]

These results provide a sound mathematical justification
of the reduced two-dimensional theory of plates based
on the minimum problem [19].

Similar results have been proved for shells,
membranes, rods, and strings.


Phase Transition Problems 


The Cahn-Hilliard gradient theory of phase transitions
deals with a fluid with mass m, under
isothermal conditions, confined in a bounded open
subset Ω of ℝ^n with smooth boundary, whose Gibbs
free energy, per unit volume, is a prescribed function
W of the density distribution u. Given a small
parameter ε > 0, the energy functional F_ε: L¹(Ω) →
[0,+∞] has the form

\[ F_\varepsilon(u) := \begin{cases} \int_\Omega \left[ W(u) + \varepsilon^2 |\nabla u|^2 \right] dx & \text{if } u \in A(m) \\ +\infty & \text{otherwise} \end{cases} \]  [20]

where A(m) is the set of all functions u ∈ W^{1,2}(Ω)
with ∫_Ω u = m.

We assume that W: ℝ → [0,+∞) is continuous
and that there exist α, β ∈ ℝ, with α|Ω| < m < β|Ω|,
such that W(t) = 0 if and only if t = α or t = β.
Moreover, we assume that W(t) → +∞ as t → ±∞.
In the minimization of F_ε, the Gibbs free energy
W(u) favors the functions whose values are close to α
and β, which represent the pure phases, while the
gradient term penalizes the transitions between
different phases.

It is easy to see that for every sequence ε_k → 0
the sequence F_{ε_k} Γ-converges to the functional
F: L¹(Ω) → [0,+∞] defined by

\[ F(u) := \begin{cases} \int_\Omega W(u)\, dx & \text{if } \int_\Omega u = m \\ +\infty & \text{otherwise} \end{cases} \]

The set M(α,β,m) of minimum points of F is
composed of all measurable functions u on Ω which
take only the values α and β (on E_α and E_β,
respectively), and satisfy the mass constraint α|E_α| +
β|E_β| = m, which is equivalent to

\[ |E_\alpha| = \frac{\beta |\Omega| - m}{\beta - \alpha} \]  [21]

From the basic properties of Γ-convergence, we
deduce that

\[ \min_{u \in A(m)} \int_\Omega \left[ W(u) + \varepsilon^2 |\nabla u|^2 \right] dx \to 0 \]  [22]

and that there exists a sequence ε_k → 0 such that the
minimizers u_{ε_k} of F_{ε_k} converge in L¹(Ω) to a
function u which takes only the values α and β
and satisfies [21].

This result can be improved by considering the
rescaled functionals

\[ G_\varepsilon(u) := \frac{1}{\varepsilon} F_\varepsilon(u) \]  [23]

where F_ε is defined by [20]. Then for every
sequence ε_k → 0 the sequence G_{ε_k} Γ-converges to
the functional G: L¹(Ω) → [0,+∞] defined by

\[ G(u) := \begin{cases} 2c\, P(E_\alpha, \Omega) & \text{if } u \in M(\alpha, \beta, m) \\ +\infty & \text{otherwise} \end{cases} \]

where

\[ c := \int_\alpha^\beta \sqrt{W(t)}\, dt \]

and

\[ P(E, \Omega) := \sup \left\{ \int_E \operatorname{div} \varphi \, dx : \varphi \in C_c^1(\Omega; \mathbb{R}^n),\ |\varphi| \le 1 \right\} \]

is the Caccioppoli-De Giorgi perimeter of E in Ω,
which coincides with the (n − 1)-dimensional measure
of Ω ∩ ∂E when E is smooth enough.

Note that the effective domain A(m) of the
functionals G_ε is disjoint from the effective domain
of the limit functional G, which is the set of all
functions u ∈ M(α,β,m) with P(E_α, Ω) < +∞.
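
The constant 2c can be checked in one dimension, where a single interface has perimeter 1. The sketch below uses the double-well potential W(t) = (1 − t²)², with α = −1 and β = 1, and the profile u(x) = tanh(x/ε); both are illustrative choices that do not come from the text, and the function names are hypothetical. For this profile the rescaled energy (1/ε)∫[W(u) + ε²|u'|²] dx equals 2c for every ε.

```python
import math

def trapezoid(f, a, b, steps):
    """Composite trapezoidal rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

def W(t):
    """Double-well potential with zeros at alpha = -1 and beta = 1 (illustrative choice)."""
    return (1.0 - t * t) ** 2

def rescaled_energy_of_profile(eps, half_width=20.0, steps=40000):
    """(1/eps) * integral of W(u) + eps^2 |u'|^2 for the 1-D profile u(x) = tanh(x/eps)."""
    def integrand(x):
        u = math.tanh(x / eps)
        du = (1.0 - u * u) / eps          # derivative of tanh(x/eps)
        return W(u) + eps * eps * du * du
    return trapezoid(integrand, -half_width * eps, half_width * eps, steps) / eps

c = trapezoid(lambda t: math.sqrt(W(t)), -1.0, 1.0, 20000)
energies = [rescaled_energy_of_profile(eps) for eps in (0.5, 0.1, 0.01)]
```

Here c = 4/3, and each computed energy is close to 2c = 8/3 independently of ε, in agreement with the form of the limit functional G.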


As the functionals [20] and [23] have the same
minimizers, we deduce that there exists a sequence
ε_k → 0 such that the minimizers u_{ε_k} of F_{ε_k} converge in
L¹(Ω) to a function u which takes only the values
α and β, satisfies [21], and fulfills the minimal
interface criterion

\[ P(E_\alpha, \Omega) \le P(E, \Omega) \]

for every measurable set E ⊂ Ω with |E| = |E_α|.
Moreover, [22] can be improved, and we obtain

\[ \min_{u \in W^{1,2}(\Omega)} F_\varepsilon(u) = \varepsilon \, 2c \, P(E_\alpha, \Omega) + o(\varepsilon) \]

Similar results have been proved when the term
|∇u|² in [20] is replaced by a general quadratic form
like [8], which leads to an anisotropic notion of
perimeter.


Free-Discontinuity Problems 


Free-discontinuity problems are minimum problems
for functionals composed of two terms of different
nature: a bulk energy, typically given by a volume
integral depending on the gradient of an unknown
function u; and a surface energy, given by an
integral on the unknown discontinuity surface of u.
These problems arise in many different fields of
science and technology, such as liquid crystals,
fracture mechanics, and computer vision.

The prototype of free-discontinuity problems is
the minimum problem proposed by David Mumford
and Jayant Shah:

\[ \min_{(u,K) \in \mathcal{A}} \left\{ \int_{\Omega \setminus K} |\nabla u|^2 \, dx + \mathcal{H}^{n-1}(K \cap \Omega) + \int_{\Omega \setminus K} |u - g|^2 \, dx \right\} \]  [24]

where Ω is a bounded open subset of ℝ^n, H^{n−1}
denotes the (n − 1)-dimensional Hausdorff measure,
g ∈ L^∞(Ω), and A is the set of all pairs (u, K) with K
compact, K ⊂ ℝ^n, and u ∈ C¹(Ω \ K).

In the applications to image segmentation problems
the dimension n is 2 and the function g represents the
grey level of an image. Given a solution (u, K) of the
minimum problem [24], the set K is interpreted as
the set of the relevant boundaries of the objects in
the image, while u provides a smoothed version of the
image. The first term in [24] has a regularizing effect,
the purpose of the second term is to avoid oversegmentation,
while the last term, called the “fidelity term,”
forces u to be close to g. Of course, in the applications
these terms are multiplied by different coefficients,
whose relative values are very important for image
segmentation problems, since they determine the
strength of the effect of each term. However, the
mathematical analysis of the problem can be easily
reduced to the case where all coefficients are equal to 1.

To solve [24], it is convenient to introduce a weak
formulation of the problem based on the space
GSBV(Ω) of generalized special functions with
bounded variation (see Ambrosio et al. (2000)).
Without entering into details, here it is enough to
say that every u ∈ GSBV(Ω) has, at almost every
point, an approximate gradient ∇u in the sense of
geometric measure theory. This is a measurable map
from Ω into ℝ^n which coincides with the usual
gradient in the sense of distributions on every open
subset U of Ω such that u ∈ W^{1,1}(U).

The functional F: L¹(Ω) → [0,+∞] used for the
weak formulation of [24] is defined by

\[ F(u) := \begin{cases} \int_\Omega |\nabla u|^2 \, dx + \mathcal{H}^{n-1}(J_u) & \text{if } u \in GSBV(\Omega) \\ +\infty & \text{otherwise} \end{cases} \]  [25]

where J_u is the jump set of u, defined in a measure-
theoretical way as the set of points x ∈ Ω such that

\[ \limsup_{\rho \to 0} \frac{1}{|B(x, \rho)|} \int_{B(x, \rho)} |u(y) - a| \, dy > 0 \]

for every a ∈ ℝ.

For every g ∈ L^∞(Ω), the functional

\[ F(u) + \int_\Omega |u - g|^2 \, dx \]

is lower semicontinuous and coercive on L¹(Ω);
therefore, the minimum problem

\[ \min_{u \in L^1(\Omega)} \left\{ F(u) + \int_\Omega |u - g|^2 \, dx \right\} \]  [26]

has a solution. The connection with the Mumford-
Shah problem is given by the following regularity
result, proved by Ennio De Giorgi and his collaborators:
if u is a solution of [26] and J̄_u is the closure
of its jump set J_u, then H^{n−1}(Ω ∩ (J̄_u \ J_u)) = 0,
u ∈ C¹(Ω \ J̄_u), and (u, J̄_u) is a solution of [24].

Since the numerical treatment of [24] and [26] is
quite difficult, Γ-convergence has been used to
approximate [26] by means of minimum problems
for integral functionals, whose minimizers can be
obtained by standard numerical techniques.


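Let us consider the nonlocal functionals
F_ε: L¹(Ω) → [0,+∞] defined by

\[ F_\varepsilon(u) := \begin{cases} \frac{1}{\varepsilon} \int_\Omega f\!\left( \varepsilon \, \mathrm{Av}(|\nabla u|^2, x, \varepsilon) \right) dx & \text{if } u \in W^{1,2}(\Omega) \\ +\infty & \text{otherwise} \end{cases} \]

where

\[ \mathrm{Av}(|\nabla u|^2, x, \varepsilon) := \frac{1}{|B(x, \varepsilon) \cap \Omega|} \int_{B(x, \varepsilon) \cap \Omega} |\nabla u(y)|^2 \, dy \]

and f: [0,+∞) → [0,+∞) is any increasing continuous
function with f(0) = 0, f'(0) = 1, and f(t) → 1/2
as t → +∞. Then for every sequence ε_k → 0 the
sequence F_{ε_k} Γ-converges to the functional F defined by [25].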

Given g ∈ L^∞(Ω), for every ε > 0 let u_ε be a
solution of the minimum problem

\[ \min_{u \in W^{1,2}(\Omega)} \left\{ \frac{1}{\varepsilon} \int_\Omega f\!\left( \varepsilon \, \mathrm{Av}(|\nabla u|^2, x, \varepsilon) \right) dx + \int_\Omega |u - g|^2 \, dx \right\} \]

From the basic properties of Γ-convergence it
follows that there exists a sequence ε_k → 0 such
that u_{ε_k} converges in L¹(Ω) to a solution u of [26],
so that (u, J̄_u) is a solution of [24].

Other approximations by nonlocal functionals use
finite differences instead of averages of gradients.

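A different approximation can be obtained by
using the local functionals G_ε: (L¹(Ω))² → [0,+∞]
defined by

\[ G_\varepsilon(u, v) := \begin{cases} \int_\Omega \left[ \psi_\varepsilon(v) |\nabla u|^2 + \frac{\varepsilon}{2} |\nabla v|^2 + \frac{1}{2\varepsilon} h(v) \right] dx & \text{if } (u, v) \in (W^{1,2}(\Omega))^2 \\ +\infty & \text{otherwise} \end{cases} \]

where ψ_ε(t) := η_ε + t², 0 < η_ε ≪ ε, and h(t) :=
(1 − t)² for 0 ≤ t ≤ 1, while h(t) := +∞ otherwise. Let
G: (L¹(Ω))² → [0,+∞] be the functional defined by

\[ G(u, v) := \begin{cases} F(u) & \text{if } v = 1 \text{ a.e. on } \Omega \\ +\infty & \text{otherwise} \end{cases} \]

where F is defined by [25]. Then for every sequence
ε_k → 0 the sequence G_{ε_k} Γ-converges to G.

Given g ∈ L^∞(Ω), for every ε > 0 let (u_ε, v_ε) be a
solution of the minimum problem

\[ \min_{(u, v) \in (W^{1,2}(\Omega))^2} \int_\Omega \left[ \psi_\varepsilon(v) |\nabla u|^2 + \frac{\varepsilon}{2} |\nabla v|^2 + \frac{1}{2\varepsilon} h(v) + |u - g|^2 \right] dx \]  [27]

From the basic properties of Γ-convergence it
follows that there exists a sequence ε_k → 0 such
that u_{ε_k} converges in L¹(Ω) to a solution u of [26],
so that (u, J̄_u) is a solution of [24].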

The approximation of the solutions of [24] based 
on [27] has been used to construct numerical 
algorithms for image segmentation. 
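
A one-dimensional computation suggests why the terms in the auxiliary variable v in [27] reproduce the surface term of [24]. The sketch below assumes that these terms are (ε/2)|v'|² + (1/(2ε))(1 − v)² and evaluates them on the profile v(x) = 1 − exp(−|x|/ε), which vanishes at a jump point of u and tends to 1 away from it. The profile and the function name are illustrative assumptions, not part of the original text.

```python
import math

def transition_cost(eps, half_width=30.0, steps=60000):
    """Integral over R of (eps/2)|v'|^2 + (1/(2 eps))(1 - v)^2 for v(x) = 1 - exp(-|x|/eps)."""
    h = half_width * eps / steps

    def integrand(x):
        decay = math.exp(-x / eps)            # equals 1 - v(x) for x >= 0
        dv = decay / eps                      # v'(x) for x > 0
        return 0.5 * eps * dv * dv + decay * decay / (2.0 * eps)

    # Trapezoidal rule on [0, half_width * eps]; the profile is even in x.
    total = 0.5 * (integrand(0.0) + integrand(half_width * eps))
    for i in range(1, steps):
        total += integrand(i * h)
    return 2.0 * total * h

costs = [transition_cost(eps) for eps in (0.5, 0.1, 0.01)]
```

The cost is close to 1 for every ε, that is, each jump point is charged its zero-dimensional Hausdorff measure, which is the one-dimensional counterpart of the term H^{n−1}(J_u) in [25].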


Free discontinuity problems similar to [24] appear 
in the mathematical treatment of Griffith's model in 
fracture mechanics. In this case, u is a vector-valued 
function, which represents the deformation of an 
elastic body, the first term in [24] is replaced by a 
more general integral functional which represents 
the energy stored in the elastic region Q\K, while the 
second term is interpreted as the energy dissipated to 
produce the crack K. An approximation based on 
minimum problems similar to [27] has been used to 
construct numerical algorithms to study the process 
of crack growth in brittle materials. 

An important research line, connected with these 
problems, has been developed in the last years to 
derive the macroscopic theories of fracture 
mechanics from the microscopic theories of interatomic
interactions. Using Γ-convergence, some
theories expressed in the language of continuum 
mechanics can be obtained as limits of discrete 
variational models on lattices, as the distance 
between neighboring points tends to zero. 


See also: Convex Analysis and Duality Methods; Elliptic 
Differential Equations: Linear Theory; Free Interfaces 
and Free Discontinuities: Variational Problems; 
Geometric Measure Theory; Image Processing: 
Mathematics; Variational Techniques for Ginzburg- 
Landau Energies; Variational Techniques for 
Microstructures.


Introduction


Poincaré duality is fundamental in the study of manifolds. In the case of an orientable closed manifold $X$, this duality appears as an isomorphism

\[ \psi : H^k(X;\mathbb{Z}) \to H_{n-k}(X;\mathbb{Z}) \]

between integral cohomology and homology. The map $\psi$ is defined by cap product with a chosen orientation class. This article focuses on dimension $n=4$, where Poincaré duality induces a bilinear form $Q$ on $H_2(X;\mathbb{Z})$ by use of the Kronecker pairing

\[ Q(\xi, \xi') = \langle \psi^{-1}(\xi), \xi' \rangle \in \mathbb{Z} \]

One of the outstanding achievements of modern topology, the classification of simply connected topological 4-manifolds by Freedman (1982), can be phrased in terms of the intersection pairing $Q$. Indeed, two simply connected differentiable 4-manifolds $X$ and $X'$ are homeomorphic by an orientation-preserving homeomorphism if and only if the associated pairings $Q$ and $Q'$ are equivalent. Freedman's classification scheme has been extended to also cover a wide range of fundamental groups, resulting in a fair understanding of topological 4-manifolds (Freedman and Quinn 1990).

When it comes to differentiable 4-manifolds, the 
situation changes drastically. On the one hand, there is 
an abundance of topological 4-manifolds which do not 
admit a differentiable structure at all. On the other 
hand, there also are topological 4-manifolds support- 
ing infinitely many distinct differentiable structures. 
A classification of differentiable 4-manifolds up to 
differentiable equivalence seems out of reach of 
current technology, even in the most simple cases. 

The discrepancy between topological and differentiable 4-manifolds was uncovered by gauge-theoretic methods, applying the concepts of instantons and of monopoles. In order to study these, one has to equip a 4-manifold both with a Riemannian metric and some additional structure: a Hermitian rank-2 bundle in the case of instantons and a spin$^c$-structure in the case of monopoles. Given such data, instantons and monopoles arise as solutions to partial differential equations the gauge equivalence classes of which form finite-dimensional moduli spaces. As it turns out, these moduli spaces encode significant information about the differentiable structures of the underlying 4-manifolds.

A decoding of such information contained in the 
instanton moduli and in the monopole moduli is 
achieved through Donaldson invariants and Seiberg- 
Witten invariants, respectively. This article outlines 
these theories from a mathematical point of view. 


Instantons and Donaldson Invariants 


Let $X$ denote a closed, connected, oriented differentiable Riemannian 4-manifold. We will consider a principal bundle $P$ over $X$ with fiber a compact Lie group $G$ with Lie algebra $\mathfrak{g}$. Connections on $P$ form an infinite-dimensional affine space $\mathcal{A}(P) = A_0 + \Omega^1(X;\mathfrak{g}_P)$ modeled on the vector space of 1-forms with values in the adjoint bundle

\[ \mathfrak{g}_P = P \times_{\mathrm{Ad}(G)} \mathfrak{g} \]

The curvature $F_A \in \Omega^2(X;\mathfrak{g}_P)$ of a connection $A$ is a $\mathfrak{g}_P$-valued 2-form satisfying the Bianchi identity $D_A F_A = 0$. The group $\mathcal{G}$ of principal bundle automorphisms of $P$ acts in a natural way on the space of connections with quotient space

\[ \mathcal{B}(P) = \mathcal{A}(P)/\mathcal{G} \]

The Yang-Mills functional

\[ \mathrm{YM} : \mathcal{A}(P) \to \mathbb{R}_{\ge 0} \]

associates to a connection $A$ the norm square

\[ \|F_A\|^2 = -\int_X \mathrm{tr}(F_A \wedge *F_A) \]

of its curvature. Here $*$ denotes the Hodge star operator defined by the metric on $X$ and the orientation. The metric $-\mathrm{tr} : \mathfrak{g} \otimes \mathfrak{g} \to \mathbb{R}$ is $\mathrm{Ad}(G)$-invariant and hence YM is invariant under the action of $\mathcal{G}$. In particular, the Yang-Mills functional descends to a function on the space $\mathcal{B}(P)$ of gauge equivalence classes of connections.

The Euler-Lagrange equations for the critical points of YM, called Yang-Mills equations, are of the form

\[ D_A(*F_A) = 0 \]

and can be derived easily from the formula

\[ F_{A+a} = F_A + D_A(a) + [a \wedge a] \]

Satisfying the equations

\[ D_A(*F_A) = 0 \quad \text{and} \quad D_A(F_A) = 0 \]

a Yang-Mills connection is characterized by the fact that its curvature is harmonic with respect to its own Laplacian.
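The derivation mentioned above can be spelled out as a first-variation computation. The following is a sketch, assuming the $L^2$ inner product $\langle\cdot,\cdot\rangle$ on bundle-valued forms induced by $-\mathrm{tr}$ and the metric, and writing $D_A^*$ for the formal adjoint of $D_A$ (notation introduced here only for illustration):

```latex
\begin{align*}
\mathrm{YM}(A + ta) &= \big\| F_A + t\,D_A(a) + t^2\,[a \wedge a] \big\|^2, \\
\frac{d}{dt}\Big|_{t=0} \mathrm{YM}(A + ta)
  &= 2\,\langle F_A,\, D_A(a) \rangle
   = 2\,\langle D_A^* F_A,\, a \rangle .
\end{align*}
```

This vanishes for every 1-form $a$ exactly when $D_A^* F_A = 0$. On 2-forms over an oriented Riemannian 4-manifold one has $D_A^* = -* D_A *$, so the critical points are the connections with $D_A(*F_A) = 0$.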

The bundle $\Lambda^2 T^*X$ of 2-forms on $X$ decomposes into $(\pm 1)$-eigenbundles of the Hodge operator. This orthogonal splitting leads to a decomposition of curvature forms

\[ F_A = F_A^+ + F_A^- \]

into self-dual and anti-self-dual components. The differential form $-(1/4\pi^2)\,\mathrm{tr}(F_A \wedge F_A)$ represents a characteristic class of the principal bundle $P$. In particular, the integral

\[ \kappa(P) = -\int_X \mathrm{tr}(F_A \wedge F_A) = \|F_A^+\|^2 - \|F_A^-\|^2 \]

is independent of the connection $A$. The Yang-Mills functional therefore is bounded

\[ \mathrm{YM}(A) \ge |\kappa(P)| \]

and attains this minimum at connections $A$ which satisfy the equation

\[ *F_A = \pm F_A \]

Such connections are self-dual, anti-self-dual, or both (that is, flat), depending on whether $\kappa(P)$ is positive, negative, or zero. The moduli space of instantons on $P$ is the subset of minima of the Yang-Mills functional

\[ M(P) = \mathrm{YM}^{-1}(|\kappa(P)|) \subset \mathcal{B}(P) \]

The moduli space thus consists of gauge equivalence classes of connections which are either self-dual or anti-self-dual. Donaldson theory indeed considers anti-self-dual connections on principal bundles with structure group $PU(2) = SO(3)$.
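The lower bound for the Yang-Mills functional follows in one line. A sketch, using only the orthogonality of the splitting of $F_A$ into its self-dual and anti-self-dual parts:

```latex
\[
\mathrm{YM}(A) = \|F_A^+\|^2 + \|F_A^-\|^2
\;\ge\; \Big|\, \|F_A^+\|^2 - \|F_A^-\|^2 \,\Big| = |\kappa(P)| .
\]
```

Equality holds exactly when one of the two components vanishes, which is the condition $*F_A = \pm F_A$.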

The Hodge $*$ operator induces a decomposition of the second cohomology

\[ H^2(X) = H^+(X) \oplus H^-(X) \]

into $(\pm 1)$-eigenspaces of dimension $b^+$ and $b^-$. Unless specified differently, cohomology groups are meant with real coefficients. In order to simplify the exposition, we will assume $X$ to be simply connected. The Donaldson invariants then are defined if $b^+$ is odd and greater than 1.

A "homology orientation" consists of an orientation of $H^+(X)$ and an integral homology class $c \in H_2(X;\mathbb{Z})$. The Donaldson invariant $D_{X,c} = D_c$ is defined after fixing such a homology orientation. It is a linear function

\[ D_c : A(X) \to \mathbb{R} \]

where $A(X)$ is the graded algebra

\[ A(X) = \mathrm{Sym}_*(H_0(X) \oplus H_2(X)) \]

in which $H_i(X)$ has degree $(1/2)(4 - i)$. The significance of $D_c$ is its functoriality

\[ D_{X', f_*(c)}(f_*(\alpha)) = D_{X,c}(\alpha) \]

under diffeomorphisms $f : X \to X'$ which preserve both orientation and homology orientation. Switching the orientation of $H^+(X;\mathbb{R})$ reverses the sign of $D_c$. Similarly,

\[ D_{c'} = (-1)^{((c - c')/2)^2}\, D_c \]

if $c - c' \in 2H_2(X;\mathbb{Z}) \subset H_2(X;\mathbb{Z})$.

The construction of this invariant makes use of the following facts:


1. An $SO(3)$ principal bundle $P$ over $X$ is determined by its first Pontrjagin number $p_1(P)$ and its Stiefel-Whitney class $w_2(P) \in H^2(X;\mathbb{Z}/2)$. As $X$ is simply connected, this Stiefel-Whitney class admits integer lifts. Let $c$ be such a lift and let $c^2$ be shorthand for the intersection pairing $Q(c,c)$. A pair $(p_1, w_2)$ is realized by a principal bundle provided it satisfies the relation $p_1 \equiv c^2$ modulo 4.

2. If $b^+$ is nonvanishing, then for generic metrics on $X$, the moduli space $M(P)$ is a manifold of dimension

\[ -2p_1(P) - 3(1 + b^+) \]

This follows from a transversality theorem whose main ingredient is the Sard-Smale theorem. The dimension is computed by use of the Atiyah-Singer index theorem: to an anti-self-dual connection $A$ on $P$ there is an associated elliptic complex

\[ 0 \to \Omega^0(X;\mathfrak{g}_P) \xrightarrow{\;D_A\;} \Omega^1(X;\mathfrak{g}_P) \xrightarrow{\;D_A^+\;} \Omega^+(X;\mathfrak{g}_P) \to 0 \]

where $\Omega^i(X;\mathfrak{g}_P)$ denotes $\mathfrak{g}_P$-valued $i$-forms on $X$. This complex describes the tangential structure of the moduli space at the equivalence class of $A$. The space $\Omega^1(X;\mathfrak{g}_P)$ is the tangent space of $\mathcal{A}(P)$ at $A$, $\Omega^0(X;\mathfrak{g}_P)$ is the tangent space of the group $\mathcal{G}$ at the identity, and $D_A$ is the differential of the orbit map. The differential operator $D_A^+$ is the linearization of the anti-self-duality map

\[ a \mapsto F^+_{A+a} = D_A^+(a) + [a \wedge a]^+ \]

3. The moduli space $M(P)$ can be oriented if it is a manifold. The orientation depends on an orientation of $H^+(X)$ and on a $U(2)$-principal bundle which has $P$ as its $PU(2)$-quotient bundle. It is determined by an integer lift of $w_2(P)$. The elliptic complex above then can be compared with a corresponding elliptic complex where the differentials are given by a complex Dirac operator. This leads to an almost-complex structure on the tangent space for each point in the moduli space and in particular to an orientation on the moduli space itself.

4. Over the product $M(P) \times X$ there is a universal $PU(2)$-bundle $\mathbb{P}$ with first Pontrjagin class $p_1(\mathbb{P})$. Taking slant product with the class $-(1/4)\,p_1(\mathbb{P})$ results in a homomorphism

\[ \mu : H_i(X) \to H^{4-i}(M(P)) \]


5. The moduli space $M(P)$ in general is noncompact. There is an Uhlenbeck compactification $\overline{M}(P)$ describing "ideal instantons." Such an ideal instanton consists of an element $(x_1,\dots,x_n) \in \mathrm{Sym}^n(X)$ and an anti-self-dual connection $A'$ on the principal bundle $P'$ on $X$ with $w_2(P') = w_2(P)$ for which the equality

\[ p_1(P') - p_1(P) = 4n \]

of Pontrjagin numbers holds. Uhlenbeck's compactness theorem describes what happens if a sequence of anti-self-dual connections has no convergent subsequence: after passing to a subsequence, the sequence converges to an anti-self-dual connection on the restriction of $P$ to $X \setminus \{x_1,\dots,x_n\}$. This limit connection extends to a connection $A'$ on the principal bundle $P'$. The curvature densities $|F_A|^2$ of the connections in the subsequence converge, as functions on $X$, to the measure

\[ |F_{A'}|^2 + \sum_{i=1}^{n} 8\pi^2\, \delta_{x_i} \]

The compactification $\overline{M}(P)$ is a stratified space and not usually a manifold. If $w_2(P) \neq 0$, then the singular set has codimension at least 2 and thus the space $\overline{M}(P)$ carries a fundamental class. In the case $w_2(P) = 0$, such a fundamental class in general can only be defined if $-p_1(P) > 4 + 3b^+$. In practice, this problem can be circumvented by blowing up $X$ and considering bundles with $w_2(P) \neq 0$ over the connected sum $X \# \overline{\mathbb{CP}}^2$. Note that the complex projective plane $\mathbb{CP}^2$ as a complex manifold carries a natural orientation. The notation $\overline{\mathbb{CP}}^2$ indicates a reversed orientation.

6. The classes $\mu(\alpha) \in H^2(M(P))$ for $\alpha \in H_2(X)$ extend over the compactification. The same holds for the class $\mu(x)$, where $x \in H_0(X;\mathbb{Z})$ is the generator corresponding to the orientation, as long as $w_2(P) \neq 0$. Otherwise, there are certain dimension restrictions. However, the same blow-up trick as mentioned above allows one to handle the case $w_2(P) = 0$ as well.


Now fix an element $c \in H_2(X;\mathbb{Z})$ and let

\[ M_c = \bigsqcup_{d} M(P_{c,d}) \]

denote the disjoint union of all moduli spaces of anti-self-dual connections on principal $PU(2)$-bundles $P_{c,d}$ whose second Stiefel-Whitney class is Poincaré-dual to $c$ modulo 2 and whose Pontrjagin number equals $-d - (3/2)(b^+ + 1)$.
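The bookkeeping between the Pontrjagin number, the dimension formula $-2p_1(P) - 3(1 + b^+)$ from fact 2, and the congruence $p_1 \equiv c^2$ modulo 4 from fact 1 can be checked mechanically. The following is a small sketch; the function names and the sample values of $b^+$, $c^2$, and $d$ are illustrative assumptions, not data from a particular manifold.

```python
from fractions import Fraction


def pontrjagin_number(d, b_plus):
    """Pontrjagin number of P_{c,d}: -d - (3/2)(b_plus + 1)."""
    return -Fraction(d) - Fraction(3, 2) * (b_plus + 1)


def moduli_dimension(p1, b_plus):
    """Dimension formula for the moduli space: -2 p1 - 3 (1 + b_plus)."""
    return -2 * p1 - 3 * (1 + b_plus)


def admissible_degrees(c_squared, b_plus, d_max):
    """Values 0 <= d <= d_max for which p1 is an integer with p1 = c^2 mod 4."""
    result = []
    for d in range(d_max + 1):
        p1 = pontrjagin_number(d, b_plus)
        if p1.denominator == 1 and (p1.numerator - c_squared) % 4 == 0:
            result.append(d)
    return result


# With the Pontrjagin number chosen as above, the moduli space has dimension 2d.
dimension_example = moduli_dimension(pontrjagin_number(5, 3), 3)

# Hypothetical case b_plus = 3, c^2 = 0: only d = 2 mod 4 can occur.
degrees_example = admissible_degrees(0, 3, 12)
```

For odd $b^+$ the Pontrjagin number is automatically an integer, and the admissible values of $d$ are those satisfying $d \equiv -c^2 + (1/2)(1 + b^+)$ modulo 4.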

Our assumption of $b^+$ being odd corresponds to the fact that the dimension $2d$ of the moduli space $M(P_{c,d})$ is even, with $d$ congruent to $-c^2 + (1/2)(1 + b^+)$ modulo 4. Neglecting the difficulties in the case $w_2(P) = 0$ mentioned above, we may use the cup product on $H^*(M_c)$ to extend $\mu$ to an algebra homomorphism

\[ \mu : A(X) \to H^*(M_c) \]


The Donaldson invariant $D_c$ is nonzero only on elements $z$ of $A(X)$ whose total degree $d$ is congruent to $-c^2 + (1/2)(1 + b^+)$ modulo 4. For such an element it is defined by

\[ D_c(z) = \langle \mu(z), [\overline{M}(P_{c,d})] \rangle = \int_{\overline{M}(P_{c,d})} \mu(z) \]


The Donaldson series $\mathbb{D}_c$ is defined as a formal
power series


for $\alpha \in H_2(X)$ and $\hat{\alpha} = (1 + (x/2))\alpha$.


Computations and Structure Theorems 


The first results about these invariants are due to 
S Donaldson. He proved both a vanishing and a 
nonvanishing theorem (Donaldson and Kronheimer 
1990): 


Theorem 1 If both $b^+(X) > 0$ and $b^+(Y) > 0$, then all Donaldson invariants vanish for the connected sum $X \# Y$.

Theorem 2 If $c$ represents a divisor on a complex algebraic surface $X$ and $\alpha$ represents an ample divisor, then

\[ D_c(\alpha^r) \neq 0 \quad \text{for } r \gg 0 \]


The second theorem is a consequence of the fact that in the case of an algebraic surface the instanton moduli can be described in algebraic geometric terms: the moduli space $M(P_{c,d})$ associated to the metric induced from the Fubini-Study metric on $\mathbb{CP}^n$ by an embedding $X \to \mathbb{CP}^n$ carries the structure of a projective variety. This variety is reduced and of complex dimension $d$, as soon as $d$ is large enough. Furthermore, $\mu(\alpha)$ is the first Chern class of an ample line bundle.

The translation of instanton moduli into algebraic geometry uses two steps: suppose the first Chern class of a $U(r)$-principal bundle $P$ on a Kähler surface is also the first Chern class of a holomorphic line bundle. Then the absolute minima of the Yang-Mills functional are achieved by Hermite-Einstein connections. These are connections for which the Ricci curvature is a constant multiple of the identity. The second step, the translation from differential geometry into algebraic geometry, is called the Kobayashi-Hitchin correspondence, which again was proved by Donaldson.

The Donaldson invariants have been computed for a number of 4-manifolds. A simply connected 4-manifold is said to have simple type if the relation

\[ D_c(x^2 z) = 4 D_c(z) \]

is satisfied by its Donaldson invariant for all $z \in A(X)$ and $c \in H_2(X;\mathbb{Z})$. It is known that this simple type condition holds for many 4-manifolds. Indeed, it is an open question whether there are 4-manifolds which are not of simple type. For manifolds of simple type the Donaldson series $\mathbb{D}_c$ completely determines the Donaldson invariant $D_c$. A main result is due to Kronheimer and Mrowka (1995):


Theorem 3 Let $X$ be a simply connected 4-manifold of simple type. Then, there exist finitely many basic classes $\kappa_1,\dots,\kappa_s \in H_2(X;\mathbb{Z})$ such that

\[ \mathbb{D}_c = \exp(Q/2) \sum_{i=1}^{s} (-1)^{(c^2 + Q(\kappa_i, c))/2}\, a_i \exp(\kappa_i) \]

as analytic functions on $H_2(X)$. The numbers $a_i$ are rational and each basic class $\kappa_i$ is characteristic, that is, it satisfies $\alpha^2 \equiv Q(\alpha, \kappa_i)$ modulo 2 for all $\alpha \in H_2(X;\mathbb{Z})$. The homology class $\kappa_i$ in this formula acts on an arbitrary homology class by intersection.


The geometric significance of the basic classes is 
underlined by the following theorem (Kronheimer 
and Mrowka 1995): 


Theorem 4 If $\alpha \in H_2(X;\mathbb{Z})$ is represented by an embedded surface of genus $g$ with self-intersection $\alpha^2 \ge 0$, then for each basic class $\kappa$ the following adjunction inequality is satisfied:

\[ 2g - 2 \ge \alpha^2 + |Q(\kappa, \alpha)| \]
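Read as a lower bound on the genus, the adjunction inequality is easy to evaluate once the pairings of a class with the basic classes are known. The following is a minimal sketch; the function name and the sample input are illustrative assumptions, and the bound is only meaningful for classes that satisfy the hypotheses of Theorem 4.

```python
def min_genus_bound(alpha_squared, basic_class_pairings):
    """Smallest integer g with 2g - 2 >= alpha^2 + |Q(kappa, alpha)| for all kappa.

    alpha_squared: self-intersection number of the class alpha.
    basic_class_pairings: the integers Q(kappa, alpha), one per basic class kappa.
    """
    rhs = alpha_squared + max(abs(k) for k in basic_class_pairings)
    # 2g - 2 >= rhs  is equivalent to  g >= (rhs + 2) / 2, rounded up.
    return -((-(rhs + 2)) // 2)


# Hypothetical input: a class of square 4 pairing to zero with the only basic class.
bound_example = min_genus_bound(4, [0])
```

For comparison, a smooth quartic surface in $\mathbb{CP}^3$ has trivial canonical divisor, its hyperplane class has square 4, and a smooth hyperplane section is a plane quartic curve of genus 3, which is the value the sketch returns for this input.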


There are many 4-manifolds for which the Donaldson series have been computed (Friedman and Morgan 1997). The basic classes for complete intersections, for example, are the canonical divisor and its negative. Another example is given by elliptic surfaces. Let $E(n;p,q)$ be a minimal elliptic surface, that is, a holomorphic surface admitting a holomorphic map to $\mathbb{CP}^1$ with generic fiber $f$ an elliptic curve. For any numbers $n$, $p$, and $q$ with $p < q$ coprime, there exists such a simply connected elliptic surface with Euler characteristic $12n$ and two multiple fibers of multiplicity $p$ and $q$, respectively. The Donaldson series of $E(n;p,q)$ for $c = 0$ then is given by

\[ \mathbb{D} = \exp\!\Big(\frac{Q}{2}\Big)\, \frac{\sinh^n(f)}{\sinh(f/p)\,\sinh(f/q)} \]

Another important formula relates the Donaldson series $\mathbb{D}$ of a manifold $X$ of simple type and the Donaldson series $\hat{\mathbb{D}}$ of the blow-up $X \# \overline{\mathbb{CP}}^2$:

\[ \hat{\mathbb{D}}_c = \mathbb{D}_c \cdot \exp(-e^2/2)\,\cosh(e) \]
\[ \hat{\mathbb{D}}_{c+e} = -\mathbb{D}_c \cdot \exp(-e^2/2)\,\sinh(e) \]

Here $e \in H_2(\overline{\mathbb{CP}}^2;\mathbb{Z})$ denotes a generator. Indeed, a more general blow-up formula is known which relates the Donaldson invariants for $X$ and its blow-up even in case $X$ is not of simple type. This formula, due to Fintushel and Stern (1996), involves Weierstraß sigma-functions.

The instanton moduli space carries nontrivial information about 4-manifolds even in the case $b^+(X) \le 1$. However, one has to deal with singularities in the moduli space. Let us first consider the case $b^+(X) = 0$. If the intersection form on $X$ is negative definite, the instanton moduli spaces in general are bound to have singularities. Indeed, Donaldson examined the case with the Pontrjagin number $p_1(P) = -4$ and $w_2(P) = 0$. In this case, the moduli space for a generic metric on $X$ will be an orientable smooth manifold except at isolated singular points. The singularities are cones over $\mathbb{CP}^2$ and they correspond to reducible connections, that is, reductions of the structure group of $P$ to $U(1)$. These reductions are in bijective correspondence to pairs $\pm\alpha \in H_2(X;\mathbb{Z})$ with $\alpha^2 = -1$. The Uhlenbeck compactification of the moduli space thus leads to an oriented cobordism between $X$ and the disjoint union $\bigsqcup \mathbb{CP}^2$ over all pairs $\pm\alpha$ in $H_2(X;\mathbb{Z})$ of square $-1$. As the signature of a manifold is an invariant of oriented cobordism, there have to be $b^-$ many pairs $\pm\alpha$ of square $-1$ in $H_2(X;\mathbb{Z})$ and, in particular, the intersection form $Q$ is represented by the negative of the identity matrix (Donaldson 1983):


Theorem 5 The intersection form on a differenti- 
able manifold with negative-definite intersection 
form is diagonal. 


Indeed, from rank 8 on there are lots of definite 
unimodular forms which are not diagonal. By 
Freedman’s (1982) classification, any unimodular 
form is realized as the intersection form of a simply 
connected topological manifold. This theorem 
shows that most of these manifolds do not support 
differentiable structures. 
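The standard rank-8 example of a definite unimodular form that is not diagonal is the $E_8$ form. The following sketch checks its defining properties by exact arithmetic; the node labeling of the Dynkin diagram and all function names are choices made here for illustration. An even form cannot be equivalent over the integers to a diagonal form with entries $\pm 1$, since evenness is preserved under integral changes of basis.

```python
from fractions import Fraction


def e8_matrix():
    """Gram matrix of the E8 lattice: 2 on the diagonal, -1 along Dynkin edges."""
    n = 8
    m = [[0] * n for _ in range(n)]
    for i in range(n):
        m[i][i] = 2
    # A chain of seven nodes 0-1-...-6 with an eighth node attached to node 2.
    for i, j in [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (2, 7)]:
        m[i][j] = m[j][i] = -1
    return m


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for k in range(col, n):
                a[r][k] -= factor * a[col][k]
    return det


def leading_minors(matrix):
    """Leading principal minors; all positive means positive definite."""
    return [determinant([row[:k] for row in matrix[:k]])
            for k in range(1, len(matrix) + 1)]


E8 = e8_matrix()
is_unimodular = determinant(E8) == 1
is_definite = all(minor > 0 for minor in leading_minors(E8))
is_even = all(E8[i][i] % 2 == 0 for i in range(8))
```

The negative of this matrix is therefore a negative-definite, unimodular, non-diagonalizable form, so by Theorem 5 the simply connected topological 4-manifold realizing it admits no differentiable structure.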

The case $b^+(X) = 1$ is also interesting. Here, the moduli space is a smooth manifold for a generic metric, giving rise to Donaldson invariants. However, over a smooth path of metrics, there is in general no smooth cobordism of moduli spaces. So the invariants depend on the chosen metric. The singularities in the cobordisms again correspond to classes in $H_2(X;\mathbb{Z})$ with negative square. An analysis of these singularities leads to wall-crossing formulas describing how different choices of the metric affect Donaldson invariants. The case of $\mathbb{CP}^2$ is special, as there are no elements of negative square in $H_2(\mathbb{CP}^2;\mathbb{Z})$. The Donaldson invariants for $\mathbb{CP}^2$ as well as the wall-crossing formulas turn out to be closely related to modular forms (Göttsche 2000).


Monopoles and Seiberg-Witten Invariants 


A spin$^c$-structure on an oriented Riemannian 4-manifold is a $\mathrm{Spin}^c(4)$-principal bundle $\tilde{P}$ projecting to the orthonormal tangent frame bundle $P$ over $X$ through the group homomorphism $\mathrm{Spin}^c(4) \to SO(4)$ with kernel $U(1)$. The group $H^2(X;\mathbb{Z})$ acts freely and transitively on the set of all spin$^c$-structures. A spin$^c$-connection is a lift to $\tilde{P}$ of the Levi-Civita connection on $P$. Fixing a background spin$^c$-connection $A_0$, the monopole map

\[ \mu : (A, \phi) \mapsto (D_A\phi,\; F_A^+ - \phi\phi^*,\; d^*a) \]

is defined (Witten 1994) for spin$^c$-connections $A \in A_0 + \Omega^1(X; i\mathbb{R})$ and positive spinors $\phi$. Here, $D_A$ denotes the complex Dirac operator associated to $A$ and $d^*a$ for $a \in \Omega^1(X; i\mathbb{R})$ is the adjoint of the de Rham differential on forms. The section $\phi\phi^*$ of the traceless endomorphism bundle of positive spinors is viewed as a self-dual 2-form on $X$.

In case the first Betti number vanishes, this map, after suitable Sobolev completion, becomes a map between Hilbert spaces $\mu : \mathcal{A} \to \mathcal{C}$ which is a compact deformation of a linear Fredholm map. The Weitzenböck formula can be used to show that preimages under $\mu$ of bounded sets in $\mathcal{C}$ are bounded in $\mathcal{A}$. Furthermore, $\mu$ is $U(1)$-equivariant, where $U(1)$ acts by complex multiplication on spinors and trivially on forms. If $b_1(X) > 0$, the monopole map is a map between Hilbert space bundles over the torus $H^1(X)/H^1(X;\mathbb{Z})$. These properties of the monopole map allow for an interpretation in terms of stable homotopy (Bauer 2004):


Theorem 6 If the first Betti number of $X$ vanishes, then $\mu$ defines an element

\[ [\mu] \in \pi^{U(1)}_{i}(S^0) \]

in an equivariant stable homotopy group of spheres. The index $i = \mathrm{ind}\, D_A - H^+(X)$ as an element of the real representation ring $RO(U(1))$ is determined by the analytic index of the linearization of $\mu$.


In the case $b^+(X) > 1$, these equivariant stable homotopy groups can be identified with nonequivariant stable cohomotopy groups $\pi^{b^+ - 1}(\mathbb{CP}^{d-1})$. Here, $d$ denotes the index of the complex Dirac operator $\mathrm{ind}\, D_A$. Fixing an orientation of $H^+(X)$ results in a Hurewicz homomorphism

\[ h : \pi^{b^+ - 1}(\mathbb{CP}^{d-1}) \to H^{b^+ - 1}(\mathbb{CP}^{d-1};\mathbb{Z}) \]

If $b^+(X)$ is odd, the image

\[ h([\mu]) = SW(X)\, t^{(b^+ - 1)/2} \]

is an integer multiple of a power of the generator $t \in H^2(\mathbb{CP}^{d-1};\mathbb{Z})$. This integer $SW(X)$ is known as the Seiberg-Witten invariant (Witten 1994).

This invariant alternatively can be defined by considering the moduli space $M(a) = \mu^{-1}(a)$. Assuming $b^+ > 0$, this is a smooth oriented manifold with a free $U(1)$-action for generic $a \in \Omega^+(X; i\mathbb{R})$. The Seiberg-Witten invariant is the characteristic number obtainable by these data. In general, the stable homotopy invariant $[\mu]$ encodes global information about the monopole map, which cannot be recovered by only considering the moduli space. In case the spin$^c$-structure is associated to an almost-complex structure, however, there is a fortunate coincidence: the Hurewicz homomorphism in this case is an isomorphism. So for almost-complex spin$^c$-structures, the invariants $[\mu]$ and $SW$ carry the same information.

The Seiberg-Witten invariants turn out to be directly computable for Kähler manifolds and to some degree also for symplectic manifolds (Taubes 1994). Indeed, the following theorem follows from arguments of Witten and of Taubes:


Theorem 7 Let $X$ be a 4-manifold with $b^+ > 1$ and $b_1 = 0$ which can be equipped with a Kähler or a symplectic structure. If $[\mu]$ is nonvanishing for a spin$^c$-structure on $X$, then the spin$^c$-structure is associated to an almost-complex structure. For the canonical spin$^c$-structure on $X$ the Seiberg-Witten invariant is $\pm 1$.


Seiberg-Witten invariants and Donaldson invariants are closely related: Witten gave physical arguments that an equality of the form

\[ \mathbb{D} = 2^k \exp(Q/2) \cdot \sum_{a} SW(a) \exp(a) \]

should hold for the Donaldson series for $c = 0$ of a simply connected manifold of simple type. Here, $a \in H^2(X;\mathbb{Z})$ denotes the first Chern class of the complex determinant line bundle. This first Chern class characterizes spin$^c$-structures in the simply connected case. The number $k$ is related to the signature $\sigma$ and the Euler characteristic $\chi$ of the manifold $X$ by the formula

\[ 4k = 11\sigma + 7\chi + 8 \]

A mathematical proof of this formula is known in special cases (Feehan and Leness 2003).

As is the case for Donaldson invariants, the Seiberg-Witten invariants vanish for connected sums $X \# Y$ if both $b^+(X) > 0$ and $b^+(Y) > 0$ hold. This is not the case for the stable homotopy refinement, as follows from the following theorem (Bauer 2004).


Theorem 8 For a connected sum $X \# Y$ of 4-manifolds the stable equivariant homotopy invariants are related by smash product

\[ [\mu_{X \# Y}] = [\mu_X] \wedge [\mu_Y] \]


As an example application, consider connected sums of elliptic surfaces of the form $E(2;p,q)$. Now suppose $X$ and $X'$ are each connected sums of at most four copies of such elliptic surfaces. Then $X$ and $X'$ are diffeomorphic if and only if the summands were already diffeomorphic. This contrasts with the fact that the connected sum $E(2n;p,q) \# \mathbb{CP}^2$ is diffeomorphic to a connected sum of $4n$ copies of $\mathbb{CP}^2$ and $20n - 1$ copies of $\overline{\mathbb{CP}}^2$, independently of $p$ and $q$.

As a final application, we consider the case of spin manifolds. If the manifold $X$ is spin, then the intersection form $Q$ is even, that is, $Q(\alpha,\alpha) \equiv 0$ mod 2 for $\alpha \in H_2(X;\mathbb{Z})$. According to Rochlin's theorem, the signature of a spin 4-manifold is divisible by 16. The monopole map $\mu$ for the spin structure admits additional symmetry. It is $\mathrm{Pin}(2)$-equivariant. The nonabelian group $\mathrm{Pin}(2)$ appears as the normalizer of the maximal torus in $SU(2)$. Methods from equivariant K-theory lead to Furuta's (2001) theorem:


Theorem 9 Let X be a spin 4-manifold. Then 
x(X) > le(X)| 
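The numerical constraints on spin 4-manifolds can be tried out on candidate invariants. The sketch below checks Rochlin's divisibility condition stated above and, as an assumption not taken from this text, the commonly quoted form of Furuta's inequality for indefinite forms, $b_2 \ge (10/8)|\sigma| + 2$. The function names are illustrative, and the values $b_2 = 22$, $\sigma = -16$ for the K3 surface are standard values supplied here.

```python
def satisfies_rochlin(signature):
    """Rochlin's theorem: the signature of a spin 4-manifold is divisible by 16."""
    return signature % 16 == 0


def satisfies_ten_eighths(b2, signature):
    """Assumed form of Furuta's bound: b2 >= (10/8)|signature| + 2.

    Written with integers only: 8 * b2 >= 10 * |signature| + 16.
    """
    return 8 * b2 >= 10 * abs(signature) + 16


# K3 surface (b2 = 22, signature = -16): both conditions hold, the second with equality.
k3_ok = satisfies_rochlin(-16) and satisfies_ten_eighths(22, -16)

# An even form of rank 8 and signature 8 already fails Rochlin's condition.
e8_rochlin = satisfies_rochlin(8)

# A hypothetical even indefinite form with b2 = 20 and signature -16 passes
# Rochlin's test but violates the assumed inequality.
candidate_ok = satisfies_ten_eighths(20, -16)
```

These are necessary conditions only; passing both checks does not by itself show that a form is realized by a differentiable spin 4-manifold.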


Manifolds with Boundary 


Both Donaldson invariants and Seiberg-Witten invariants to some extent satisfy formal properties which fit into a general conceptual framework known as "topological quantum field theories (TQFTs)." Such a TQFT in 3+1 dimensions is a functor on the cobordism category of oriented 3-manifolds to the category of, say, vector spaces over a ground field: it assigns to an oriented 3-manifold $Y$ a vector space $h(Y)$. To a disjoint union it assigns

\[ h(Y_1 \sqcup Y_2) = h(Y_1) \otimes h(Y_2) \]


Reversing orientation corresponds to dualizing

\[ h(\overline{Y}) = h(Y)^* \]

Viewing a four-dimensional manifold $X$ with boundary $\partial X = \overline{Y}_1 \sqcup Y_2$ formally as a morphism from $Y_1$ to $Y_2$, this functor associates to $X$ a homomorphism

\[ H(X) : h(Y_1) \to h(Y_2) \]

that is, an element $H(X) \in h(\overline{Y}_1 \sqcup Y_2)$. The most important feature is the composition law

\[ H(X_1 \cup_Y X_2) = H(X_2) \circ H(X_1) \]


So if a cobordism $X$ from $Y_1$ to $Y_2$ can be decomposed as a cobordism $X_1$ from $Y_1$ to an intermediate submanifold $Y$ and a cobordism $X_2$ from $Y$ to $Y_2$, then the homomorphism $H(X)$ can be computed from $H(X_1)$ and $H(X_2)$ as their composition.
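The formal content of these axioms is linear algebra: gluing cobordisms corresponds to composing linear maps and disjoint union to taking tensor products. The following toy sketch models homomorphisms as integer matrices; the matrices are arbitrary placeholders, not invariants of actual manifolds, and the function names are introduced here for illustration.

```python
def compose(h_x2, h_x1):
    """Matrix of H(X2) o H(X1): first apply H(X1), then H(X2)."""
    rows, inner, cols = len(h_x2), len(h_x1), len(h_x1[0])
    return [[sum(h_x2[i][k] * h_x1[k][j] for k in range(inner))
             for j in range(cols)] for i in range(rows)]


def tensor(a, b):
    """Kronecker product, modelling the map assigned to a disjoint union."""
    p, q = len(b), len(b[0])
    return [[a[i // p][j // q] * b[i % p][j % q]
             for j in range(len(a[0]) * q)] for i in range(len(a) * p)]


# Placeholder maps: H(X1): h(Y1) -> h(Y) and H(X2): h(Y) -> h(Y2),
# with dim h(Y1) = 2, dim h(Y) = 3, dim h(Y2) = 2.
H_X1 = [[1, 0], [0, 1], [1, 1]]
H_X2 = [[1, 2, 0], [0, 1, 1]]
H_glued = compose(H_X2, H_X1)

# Gluing two disjoint pairs of cobordisms agrees with tensoring the glued maps.
lhs = compose(tensor(H_X2, H_X2), tensor(H_X1, H_X1))
rhs = tensor(H_glued, H_glued)
```

The equality of `lhs` and `rhs` is the compatibility of composition with disjoint union that a functor of this kind must satisfy.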

Donaldson invariants and Seiberg-Witten invariants fit neatly into the framework of a TQFT if one restricts to 3-manifolds which are disjoint unions of homology 3-spheres. In both the instanton and the monopole case, the vector spaces $h(Y)$ are Floer homology groups. The construction of Floer homology carries the Morse theory description of the homology of a finite-dimensional manifold over to an infinite-dimensional setting. In the instanton case, one considers the Chern-Simons function

\[ CS(a) = -\frac{1}{8\pi^2} \int_Y \mathrm{tr}\Big( a \wedge da + \frac{2}{3}\, a \wedge a \wedge a \Big) \]

This function is defined on the space of gauge equivalence classes of $SU(2)$-connections on $Y$. Note that for a homology 3-sphere, any $SU(2)$ or $PU(2)$ principal bundle over $Y$ is trivial. Choosing a trivialization, a connection becomes identified with a Lie-algebra-valued 1-form $a$. Critical points for the Chern-Simons functional lead to generators in a chain complex the homology of which then gives the Floer groups. Such critical points correspond to flat connections on $Y$. The Floer homology groups $HF_*(Y)$ are $\mathbb{Z}/8$-graded in the $SU(2)$ case and $\mathbb{Z}/4$-graded in the $SO(3)$ case. If $X$ is a 4-manifold with $b_1(X) = 0$ and $b^+(X) > 1$ and such that the boundary $\partial X$ is a disjoint union of homology 3-spheres, then the Donaldson invariants are linear maps

\[ D_c : A(X) \to HF_*(\partial X) \]

These invariants satisfy a composition law on the subring of $A(X)$ generated by two-dimensional homology classes (Donaldson 2002).

In the monopole case, one considers a Chern-Simons-Dirac functional

\[ CSD(a, \psi) = \frac{1}{2} \Big( \int_Y \langle \psi, D_a \psi \rangle \, d\mathrm{vol} - \int_Y a \wedge da \Big) \]

and obtains integer graded Floer homology groups. Details and proofs of the relevant composition laws have been announced.


See also: Floer Homology; Four-Manifold Invariants and 
Physics; Gauge Theory: Mathematical Applications; 
Instantons: Topological Aspects; Moduli Spaces: An 
Introduction; Several Complex Variables: Basic 
Geometric Theory; Topological Quantum Field Theory: 
Overview.


Introduction


One of the most exciting properties of string theory, 
which led ten years ago to the formulation of the M 
theory as the unique theory unifying all interactions, 
has been the discovery that type II theories, besides a 
perturbative spectrum consisting of closed-string 
excitations, contain also a nonperturbative one 
consisting of “solitonic” p-dimensional objects 
called Dp branes. They are characterized by two 
important properties. They are coupled to closed-string states such as the graviton, the dilaton, and the R-R $(p+1)$-form potential, and are described by a classical solution of the low-energy string effective action. Their dynamics is, on the other hand, described by open strings having the endpoints
attached to their world volume and therefore 
satisfying Dirichlet boundary conditions in the 
directions transverse to their world volume. This is 
the reason why they are called D (Dirichlet) branes. 
Since the lightest open-string excitation corresponds 
to a gauge field, they have a gauge theory living on 
their world volume. This twofold description of 
D-branes has opened the way to study both the 
perturbative and nonperturbative properties of the 
gauge theory living on their world volume from 
their dynamics in terms of closed strings. With the 
addition of the decoupling limit, these two properties have led to the Maldacena (1998) conjecture of the equivalence between the maximally supersymmetric and conformal $\mathcal{N} = 4$ super Yang-Mills and type IIB string theory on $AdS_5 \times S^5$.

They have also been successfully applied to less
supersymmetric and nonconformal gauge theories
that live on the world volume of fractional and
wrapped branes. For general reviews of various 
approaches see Bertolini et al. (2000), Herzog et al. 
(2001), Bertolini (2003), Bigazzi et al. (2002), and 
Di Vecchia and Liccardo (2003). Also in these cases, 
one has constructed a classical solution of the 
supergravity equations of motion corresponding to 
these more sophisticated branes. These equations 
contain not only the supergravity fields present in 
the bulk ten-dimensional action but also boundary 
terms corresponding to the location of the branes. It 
turns out that in general the classical solution develops a naked singularity of the repulson type at short distances from the branes. This means that at short distances, it does not provide a reliable description of the branes. In the case of $\mathcal{N} = 2$ supersymmetry, this can be explicitly seen because of the appearance of an enhançon located at distances slightly larger than the naked singularity (Johnson et al. 2000). The enhançon radius corresponds, in supergravity, to the distance where a brane probe becomes tensionless, and, in the gauge theory living on the branes, to the dynamically generated scale $\Lambda_{QCD}$. Then, since short distances in supergravity correspond to large distances in the gauge theory, as implied by holography, the presence of the enhançon and of the naked singularity does not allow one to obtain any information on the nonperturbative large-distance behavior of the gauge theory living on the D-branes. Above the radius of the enhançon, instead, the classical solution provides a good description of the branes and therefore it can be used to get information on the perturbative behavior of the gauge theory. This shows that, if we want to use the D-branes for studying the nonperturbative properties of the gauge theory living on their world volume, we must construct a classical solution that has no naked singularity at short distances in supergravity. We will see in a specific example that it will be possible to deform the classical solution, eliminating the naked singularity, and use it to describe nonperturbative properties such as the gaugino condensate.

In this article, we review some of the results obtained by using fractional D3 branes of some orbifold and D5 branes wrapped on 2-cycles of some Calabi-Yau manifold. The analysis of the supersymmetric gauge theories living on the world volume of these D-branes will be based on the gauge/gravity relations that relate the gauge coupling constant and the $\theta$-angle to the supergravity fields (see, e.g., Di Vecchia et al. (2005) for a derivation of them):


4 > 二
and


1 
Gy™ = Sc xh (Cy + CoB3) [2] 


where $C_2$ is the 2-cycle on which the branes are wrapped.

In the next section, we will describe the case of the fractional D3 branes of the orbifold $\mathbb{C}^2/\mathbb{Z}_2$ and show that the classical solution corresponding to a system of $N$ D3 and $M$ D7 branes reproduces the perturbative behavior of $\mathcal{N} = 2$ super-QCD. Then, we will consider D5 branes wrapped on 2-cycles of a Calabi-Yau manifold described by the Maldacena-Núñez classical solution (Maldacena and Núñez 2001, Chamseddine and Volkov 1997) and show that in this case we are able to reproduce the phenomenon of gaugino condensate and to construct the complete $\beta$-function of $\mathcal{N} = 1$ super Yang-Mills.


Fractional D3 Branes of the Orbifold
$\mathbb{C}^2/\mathbb{Z}_2$ and $\mathcal{N}=2$ Super-QCD


In this section, we consider fractional D3 and D7
branes of the noncompact orbifold $\mathbb{C}^2/\mathbb{Z}_2$ in order
to study the properties of $\mathcal{N}=2$ super-QCD. We
group the coordinates of the directions $(x^4,\ldots,x^9)$
transverse to the world volume of the D3
brane, where the gauge theory lives, into three
complex quantities: $z_1 = x^4 + i x^5$, $z_2 = x^6 + i x^7$,
$z_3 = x^8 + i x^9$. The nontrivial generator $h$ of $\mathbb{Z}_2$
acts as $z_2 \to -z_2$, $z_3 \to -z_3$, leaving $z_1$ invariant.
This orbifold has one fixed point, located at
$z_2 = z_3 = 0$ and corresponding to a vanishing
2-cycle. Fractional D3 branes are D5 branes
wrapped on the vanishing 2-cycle and therefore
are, unlike bulk branes, stuck at the orbifold fixed
point. By considering N fractional D3 and M
fractional D7 branes of the orbifold $\mathbb{C}^2/\mathbb{Z}_2$, we are
able to study $\mathcal{N}=2$ super-QCD with M hypermultiplets.
In order to do that, we need to
determine the classical solution corresponding to
the previous brane configuration. For the case of
the orbifold $\mathbb{C}^2/\mathbb{Z}_2$, the complete classical solution
is found in Bertolini et al. (2002b); see also
references therein and Bertolini et al. (2000) for a
review on fractional branes. In the following, we
write it explicitly for a system of N fractional D3
branes with their world volume along the directions
$x^0, x^1, x^2$, and $x^3$ and M fractional D7 branes
containing the D3 branes in their world volume
and having the remaining four world-volume
directions along the orbifolded ones. The metric,
the 5-form field strength, the axion, and the
dilaton are given by


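$$ds^2 = H^{-1/2}\,\eta_{\alpha\beta}\,dx^\alpha dx^\beta + H^{1/2}\left(\delta_{\ell m}\,dx^\ell dx^m + e^{-\phi}\,\delta_{ij}\,dx^i dx^j\right) \qquad [3]$$

$$\tilde{F}_{(5)} = d\left(H^{-1}\,dx^0\wedge\cdots\wedge dx^3\right) + {}^*d\left(H^{-1}\,dx^0\wedge\cdots\wedge dx^3\right) \qquad [4]$$

$$\tau \equiv C_0 + i e^{-\phi} = i\left(1 - \frac{M g_s}{2\pi}\log\frac{z}{\epsilon}\right), \qquad z \equiv x^4 + i x^5 = y\,e^{i\theta} \qquad [5]$$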


where the self-dual field strength $\tilde{F}_{(5)}$ is given in
terms of the NS-NS and R-R 2-forms $B_2$ and $C_2$
and of the 4-form potential $C_4$ by $\tilde{F}_{(5)} = dC_4 + C_2 \wedge dB_2$.
The warp factor H is a function of the
coordinates $(x^4,\ldots,x^9)$ and $\epsilon$ is an infrared cutoff.
We denote by $\alpha$ and $\beta$ the four directions corresponding
to the world volume of the fractional D3
brane, by $\ell$ and $m$ those along the four orbifolded
directions $x^6, x^7, x^8$, and $x^9$, and by $i$ and $j$ the
directions $x^4$ and $x^5$ that are transverse to both the
D3 and the D7 branes. The twisted fields are instead
given by $B_2 = \omega_2 b$, $C_2 = \omega_2 c$, where $\omega_2$ is the volume
form of the vanishing 2-cycle and
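$$b\,e^{-\phi} = \frac{(2\pi\sqrt{\alpha'})^2}{2}\left[1 + \frac{2N-M}{\pi}\,g_s\log\frac{y}{\epsilon}\right], \qquad c + C_0 b = -2\pi\alpha'\,\theta\,g_s\,(2N-M) \qquad [6]$$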


The expression of H (Kirsch and Vaman 2005) shows
that the previous solution has a naked singularity of
the repulson type at short distances. On the other
hand, if we let a brane probe approach from
infinity the stack of branes described by the previous
classical solution, we find that the tension
of the probe vanishes at a distance that is larger than
that of the naked singularity. The point where the
probe brane becomes tensionless is called the “enhançon”
(Johnson et al. 2000), and at this point the classical
solution no longer describes the stack of
fractional branes.

Let us now use the gauge/gravity relations given in
the introduction to determine the coupling constants
of the world-volume theory from the supergravity
solution. In the case of fractional D3 branes
of the orbifold $\mathbb{C}^2/\mathbb{Z}_2$, which is characterized by a
single vanishing 2-cycle $\mathcal{C}_2$, the gauge coupling
constant given in eqn [1] reduces to
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By inserting the classical solution in eqns [7] and [2], 
we get the following expressions for the gauge coupling 
constant and the $\theta_{YM}$ angle (Bertolini et al. 2002b):


$$\frac{1}{g_{YM}^2} = \frac{1}{8\pi g_s} + \frac{2N-M}{8\pi^2}\log\frac{y}{\epsilon}, \qquad \theta_{YM} = -\theta\,(2N-M) \qquad [8]$$


Notice that the gauge coupling constant appearing
in the previous equation is the “bare” gauge
coupling constant computed at the scale $\mu \sim y/\alpha'$,
while the square of the bare gauge coupling constant
computed at the cutoff $\Lambda \sim \epsilon/\alpha'$ is equal to $8\pi g_s$.

In the case of an $\mathcal{N}=2$ supersymmetric gauge
theory, the gauge multiplet contains a complex
scalar field $\Psi$ that corresponds to the complex
coordinate $z$ transverse to both the world volume
of the D3 brane and the four orbifolded directions:
$\Psi \sim z/(2\pi\alpha')$. This is another example of holographic
identification between a quantity, $\Psi$, peculiar to the
gauge theory living on the fractional D3 branes and
another one, the coordinate $z$, peculiar to supergravity.
It allows one to obtain the gauge theory
anomalies from the supergravity background. In
fact, since we know how the scale and U(1)
transformations act on $\Psi$, from the previous gauge/gravity
relation we can deduce how they act on $z$,
namely


$$\Psi \to s\,e^{2i\alpha}\,\Psi \iff z \to s\,e^{2i\alpha}\,z \implies y \to s\,y, \quad \theta \to \theta + 2\alpha \qquad [9]$$

Those transformations do not leave invariant the 
supergravity background in eqn [6] and when 
we use them in eqns [7] and [2], they generate the 
anomalies of the gauge theory living on the 
fractional D3 branes. In fact, by acting with 
those transformations in eqns [8], we get 


$$\frac{1}{g_{YM}'^{\,2}} - \frac{1}{g_{YM}^2} = \frac{2N-M}{8\pi^2}\log s, \qquad \theta_{YM}' - \theta_{YM} = -2\alpha\,(2N-M) \qquad [10]$$


The first equation generates the $\beta$-function of $\mathcal{N}=2$
super-QCD with M hypermultiplets given by


$$\beta(g_{YM}) = -\frac{2N-M}{16\pi^2}\,g_{YM}^3 \qquad [11]$$

while the second one reproduces the chiral U(1)
anomaly (Klebanov et al. 2002, Bertolini et al. 2002a).
In particular, if we choose $\alpha = 2\pi/(2(2N-M))$,
then $\theta_{YM}$ is shifted by $2\pi$. But since $\theta_{YM}$ is
periodic with period $2\pi$, this means that the subgroup
$\mathbb{Z}_{2(2N-M)}$ is not anomalous, in perfect agreement with
the gauge theory results.
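
The step from the logarithmic running in eqn [8] to the $\beta$-function in eqn [11] can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes the renormalization scale is proportional to $y$, reads the coefficient of the logarithm as $(2N-M)/(8\pi^2)$ as in eqn [10], and uses purely illustrative values for N, M, $g_s$ and $\epsilon$. All function names are hypothetical.

```python
import math

def inv_g2(y, N, M, g_s, eps):
    """1/g_YM^2 as in eqn [8]: 1/(8 pi g_s) + (2N - M)/(8 pi^2) * log(y/eps)."""
    return 1.0 / (8 * math.pi * g_s) + (2 * N - M) / (8 * math.pi ** 2) * math.log(y / eps)

def g_ym(y, N, M, g_s, eps):
    """Gauge coupling g_YM at the scale set by y."""
    return inv_g2(y, N, M, g_s, eps) ** -0.5

def beta_numeric(y, N, M, g_s, eps, h=1e-5):
    """Central difference of g_YM with respect to log(mu), assuming mu is proportional to y."""
    up = g_ym(y * math.exp(h), N, M, g_s, eps)
    down = g_ym(y * math.exp(-h), N, M, g_s, eps)
    return (up - down) / (2 * h)

def beta_analytic(g, N, M):
    """Eqn [11]: beta(g) = -(2N - M) g^3 / (16 pi^2)."""
    return -(2 * N - M) * g ** 3 / (16 * math.pi ** 2)

def theta_shift(alpha, N, M):
    """Shift of theta_YM = -theta (2N - M) under theta -> theta + 2 alpha."""
    return -2 * alpha * (2 * N - M)

# Illustrative values only (assumptions, not taken from the text).
N, M, g_s, eps, y = 3, 2, 0.1, 1.0, 5.0
g = g_ym(y, N, M, g_s, eps)
numeric = beta_numeric(y, N, M, g_s, eps)
analytic = beta_analytic(g, N, M)
```

With these inputs the finite-difference derivative and the closed form agree to high accuracy, and `theta_shift(math.pi / (2 * N - M), N, M)` has magnitude $2\pi$, which is the statement that $\mathbb{Z}_{2(2N-M)}$ is non-anomalous.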
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Wrapped D5 Branes and X = 1 Super 
Yang-Mills 


In this section, we will consider the classical solution
corresponding to N D5 branes wrapped on a 2-cycle
of a noncompact Calabi-Yau space and we use it to
study the properties of the gauge theory living on
their world volume, which can be shown to be $\mathcal{N}=1$
super Yang-Mills.

We start by writing the classical solution found in
Maldacena and Núñez (2001) and Chamseddine and
Volkov (1997). It has a nontrivial metric:


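$$ds_{10}^2 = e^{\Phi}\,dx_{1,3}^2 + \frac{e^{\Phi}e^{2h}}{\lambda^2}\left(d\tilde\theta^2 + \sin^2\tilde\theta\,d\tilde\varphi^2\right) + \frac{e^{\Phi}}{\lambda^2}\left[d\rho^2 + \sum_{a=1}^{3}\left(\sigma^a - \lambda A^a\right)^2\right] \qquad [12]$$
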
a 2-form R-R potential 
1 uoo ano 
c?) = aa E + o) (sin dé’ ^ dọ — sin 0 dé ^ dg) 


— cos Ó' cos Gdp ^ dg 
LIU PEE T” 
+555 [dé ^c sin Fd ^e? [13] 


and a dilaton 


2h sinh 2p 
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where 
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A? = - 5, cos8dg 


with $\rho = \lambda r$ and $\lambda^{-2} = N g_s \alpha'$. The left-invariant
1-forms of $S^3$ are


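$$\sigma^1 = \tfrac{1}{2}\left[\cos\psi\,d\theta' + \sin\theta'\sin\psi\,d\varphi'\right]$$

$$\sigma^2 = -\tfrac{1}{2}\left[\sin\psi\,d\theta' - \sin\theta'\cos\psi\,d\varphi'\right] \qquad [17]$$

$$\sigma^3 = \tfrac{1}{2}\left[d\psi + \cos\theta'\,d\varphi'\right]$$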


with $0 \le \theta' \le \pi$, $0 \le \varphi' \le 2\pi$, and $0 \le \psi \le 4\pi$. The
variables $\tilde\theta$ and $\tilde\varphi$ describe a two-dimensional sphere
and vary in the range $0 \le \tilde\theta \le \pi$ and $0 \le \tilde\varphi \le 2\pi$.
Before proceeding, we want to stress the fact that
the presence of the function $a(\rho) \neq 0$ makes the
solution regular everywhere. This will allow us to use
it later on to describe the nonperturbative gaugino
condensate of $\mathcal{N}=1$ super Yang-Mills.

We can now use the previous solution for computing
the running coupling constant and the $\theta$ parameter
of $\mathcal{N}=1$ super Yang-Mills (see Di Vecchia et al.
(2002), Bertolini and Merlatti (2003), and Mück
(2003), reviewed in Bertolini (2003), Di Vecchia and
Liccardo (2003), and Imeroni (2003)). In order to do
that, we have to fix the cycle on which to perform the
integrals in eqns [1] and [2]. It turns out that this
2-cycle is specified by


$$\tilde\theta = \theta', \qquad \tilde\varphi = -\varphi', \qquad \psi = 0 \qquad [18]$$


keeping $\rho$ fixed. If we now compute the gauge couplings
on the previous cycle with $B_2 = C_0 = 0$, we get


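$$\frac{4\pi^2}{N g_{YM}^2} = \rho\coth 2\rho + \frac{1}{2}\,a(\rho)\cos\psi \qquad [19]$$

and

$$\theta_{YM} = \frac{1}{2\pi g_s\alpha'}\int_{\mathcal{C}_2} C_2 = -N\left(\psi + a(\rho)\sin\psi + \psi_0\right) \qquad [20]$$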


where we have kept $\psi \neq 0$ for reasons that will
become clear in a moment. Equation [19] shows
that the coupling constant runs as a function
of the distance $\rho$ from the branes. In order to obtain
the correct running of the gauge theory, we have to
find a relation between $\rho$ and the renormalization
group scale $\mu$. This can be obtained with the
following considerations. If we look at the previous
solution, it is easy to see that the metric in eqn [12]
is invariant under the following transformations:


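$$\psi \to \psi + 2\pi \quad \text{if } a \neq 0, \qquad \psi \to \psi + 2\epsilon \quad \text{if } a = 0 \qquad [21]$$
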
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where e is an arbitrary constant. On the other hand, 
$C_2$ is not invariant under the previous transformations,
but its flux, which is exactly equal to $\theta_{YM}$ in eqn
[20], changes by an integer multiple of $2\pi$:
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But since the physics does not change when 
$\theta_{YM} \to \theta_{YM} + 2\pi$, one gets that the transformation in
eqn [22] is an invariance. Notice also that eqn [19] for
the gauge coupling constant is invariant under the
transformation in eqn [21]. The previous considerations
show that the classical solution and also the
gauge couplings are invariant under the $\mathbb{Z}_2$ transformation
if $a \neq 0$, while this symmetry becomes $\mathbb{Z}_{2N}$ if $a$
is taken to be zero. As a consequence, since in the
ultraviolet $a(\rho)$ is exponentially small, we can neglect
it and we have a $\mathbb{Z}_{2N}$ symmetry, while in the infrared
we cannot neglect $a(\rho)$ anymore and we have only a
$\mathbb{Z}_2$ symmetry left. This fits very well with the fact that
$\mathcal{N}=1$ super Yang-Mills has a nonzero gaugino
condensate $\langle\lambda\lambda\rangle$ that is responsible for the breaking
of $\mathbb{Z}_{2N}$ into $\mathbb{Z}_2$. Therefore, it is natural to identify
the gaugino condensate precisely with the function
$a(\rho) \neq 0$ that makes the classical solution regular also
at short distances in supergravity (Di Vecchia et al.
2002, Apreda et al. 2002):


$$\langle\lambda\lambda\rangle \sim \Lambda^3 = \mu^3\,a(\rho) \qquad [23]$$


This provides the relation between the renormalization
group scale $\mu$ and the supergravity spacetime
parameter $\rho$. In the ultraviolet (large $\rho$), $a(\rho)$ is
exponentially suppressed and in eqns [19] and [20]
we can neglect it, obtaining


$$\frac{4\pi^2}{N g_{YM}^2} = \rho\coth 2\rho, \qquad \theta_{YM} = -N\left(\psi + \psi_0\right) \qquad [24]$$


The chiral anomaly can be obtained by performing
the transformation $\psi \to \psi + 2\epsilon$ and getting


$$\theta_{YM} \to \theta_{YM} - 2N\epsilon \qquad [25]$$


This implies that the $\mathbb{Z}_{2N}$ transformations corresponding
to $\epsilon = \pi k/N$ are symmetries because they
shift $\theta_{YM}$ by multiples of $2\pi$.
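
The counting behind the $\mathbb{Z}_{2N}$ statement can be made explicit with a few lines of code. This is only an illustrative sketch: it assumes the ultraviolet form $\theta_{YM} = -N(\psi + \psi_0)$ of eqn [24], so that $\psi \to \psi + 2\epsilon$ shifts $\theta_{YM}$ by $-2N\epsilon$, and the function names and the value of N are hypothetical.

```python
import math

def theta_shift(eps, N):
    """Shift of theta_YM under psi -> psi + 2*eps, using theta_YM = -N (psi + psi_0)."""
    return -2 * N * eps

def is_symmetry(eps, N, tol=1e-9):
    """True when the shift of theta_YM is an integer multiple of 2*pi."""
    ratio = theta_shift(eps, N) / (2 * math.pi)
    return abs(ratio - round(ratio)) < tol

N = 4  # illustrative number of branes
# The 2N values eps = pi*k/N, k = 0, ..., 2N-1, label the elements of Z_2N.
allowed = [k for k in range(2 * N) if is_symmetry(math.pi * k / N, N)]
```

Every `k` in the range passes the check, whereas a generic angle such as `eps = 0.3` does not; the element `k = N` corresponds to $\psi \to \psi + 2\pi$.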

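In general, however, eqns [19] and [20] are only
invariant under the $\mathbb{Z}_2$ subgroup of $\mathbb{Z}_{2N}$ corresponding
to the transformation

$$\psi \to \psi + 2\pi \qquad [26]$$

that changes $\theta_{YM}$ in eqn [20] as follows:

$$\theta_{YM} \to \theta_{YM} - 2N\pi \qquad [27]$$

leaving invariant the gaugino condensate:

$$\langle\lambda^2\rangle = \frac{\mu^3}{N g_{YM}^2}\,e^{-8\pi^2/(N g_{YM}^2)}\,e^{i\theta_{YM}/N} \qquad [28]$$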

Therefore, the chiral anomaly and the breaking of
$\mathbb{Z}_{2N}$ to $\mathbb{Z}_2$ are encoded in eqns [19] and [20].
Finally, if we put 7 — 0 in eqn [19], we get 

4T? 

Ngyy, 


= pcoth2p — $a(p) = ptanhp [29] 
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This equation taken together with eqn [23] allows us to 
determine the running coupling constant as a function of
$\mu$. From it, we get (Di Vecchia et al. 2002, Di Vecchia
and Liccardo 2003) the Novikov-Shifman-Vainshtein-Zakharov
(NSVZ) $\beta$-function plus nonperturbative
corrections due to fractional instantons:


4n? -2 4x? 
1+—*,—sinh™ 
3 3 Ngým Ney Ngyw (30) 


1— Mega : iu 
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where in the ultraviolet we have approximated 


p with 471? /(Ngzà) coth 41? /(Ng* h 


B(gym) = — 


See also: AdS/CFT Correspondence; Anomalies; 

BF Theories; Brane Construction of Gauge Theories; 
Gauge Theory: Mathematical Applications; 
Noncommutative Geometry from Strings; 
Nonperturbative and Topological Aspects of Gauge 
Theory; Perturbation Theory and its Techniques; 
Seiberg—Witten Theory; Superstring Theories. 
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Introduction 


This article surveys some developments in pure mathe- 
matics which have, to varying degrees, grown out of the 
ideas of gauge theory in mathematical physics. The 
realization that the gauge fields of particle physics and 
the connections of differential geometry are one and the 
same has had wide-ranging consequences, at different 
levels. Most directly, it has led mathematicians to work 
on new kinds of questions, often shedding light later on 
well-established problems. Less directly, various funda- 
mental ideas and techniques, notably the need to work 
with the infinite-dimensional gauge symmetry group, 
have found a place in the general world-view of many 
mathematicians, influencing developments in other 
fields. Still less directly, the work in this area — between 
geometry and mathematical physics — has been a prime 
example of the interaction between these fields which 
has been so fruitful since the 1970s. 

The body of this article is divided into three 
sections: roughly corresponding to analysis, geome- 
try, and topology. However, the different topics 
come together in many different ways: indeed the 
existence of these links between the topics is one of 
the most attractive features of the area. 


Gauge Transformations


For a review of the usual foundational material 
on connections, curvature, and related differential 
geometric constructions, the reader is referred to 
standard texts. We will, however, briefly recall 
the notions of gauge transformations and gauge 
fixing. The simplest case is that of abelian gauge 
theory: connections on a U(1)-bundle, say over
$\mathbb{R}^3$. In that case the connection form, representing
the connection in a local trivialization, is a pure 
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imaginary 1-form A, which can also be identified 
with a vector field A. The curvature of the 
connection is the 2-form dA. Changing the local 
trivialization by a U(1)-valued function $g = e^{i\chi}$
changes the connection form to


$$\tilde{A} = A - dg\,g^{-1} = A - i\,d\chi$$


The forms $A$, $\tilde{A}$ are two representations of the same
geometric object, just as the same metric can be
represented by different expressions in different
coordinate systems. One may want to fix this choice
of representation, usually by choosing $A$ to satisfy
the Coulomb gauge condition $d^*A = 0$ (equivalently
$\operatorname{div}\mathbf{A} = 0$), supplemented by appropriate boundary
conditions. Here we are using the standard Euclidean
metric on $\mathbb{R}^3$. (Throughout this article we will
work with positive-definite metrics, regardless of the
fact that, at least at the classical level, the
Lorentzian signature may have more obvious bearing
on physics.) Arranging this choice of gauge
involves solving a linear partial differential equation
(PDE) for $\chi$.
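
For a single plane-wave mode the linear PDE can be solved by hand, which makes the gauge fixing concrete. The sketch below is an illustration under stated assumptions rather than a general solver: it works with a real vector field $\mathbf{A}(x) = \mathbf{a}\cos(k\cdot x)$ on $\mathbb{R}^3$ (dropping the factor $i$ of the connection form), for which $\chi(x) = c\,\sin(k\cdot x)$ with $c = (k\cdot\mathbf{a})/|k|^2$ solves $\Delta\chi = \operatorname{div}\mathbf{A}$. Subtracting $\nabla\chi$ then projects $\mathbf{a}$ onto its part transverse to $k$. The function names are hypothetical.

```python
def dot(u, v):
    """Euclidean inner product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    """Cross product; k x a is the amplitude of curl A for a plane-wave mode."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def chi_amplitude(a, k):
    """Coefficient c such that chi(x) = c sin(k.x) solves Laplacian(chi) = div A."""
    return dot(k, a) / dot(k, k)

def coulomb_gauge_mode(a, k):
    """Amplitude of A - grad(chi): the component of a transverse to k."""
    c = chi_amplitude(a, k)
    return [ai - c * ki for ai, ki in zip(a, k)]

a = [1.0, 2.0, 3.0]   # illustrative amplitude
k = [1.0, 0.0, 2.0]   # illustrative wave vector (nonzero)
a_fixed = coulomb_gauge_mode(a, k)
```

The gauge-fixed amplitude satisfies `dot(k, a_fixed) == 0` up to rounding, which is the Coulomb condition for this mode, while `cross(k, a_fixed)` equals `cross(k, a)`, reflecting the fact that the curvature $dA$ is unchanged by the gauge transformation.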

The case of a general structure group G is not 
much different. The connection form A now takes 
values in the Lie algebra of G and the curvature is 
given by the expression 


$$F = dA + \tfrac{1}{2}[A, A]$$


The change of bundle trivialization is given by a 
G-valued function and the resulting change in the 
connection form is 


$$\tilde{A} = g A g^{-1} - dg\,g^{-1}$$


(Our notation here assumes that G is a matrix
group, but this is not important.) Again, we can seek
to impose the Coulomb gauge condition $d^*A = 0$,
but now we cannot linearize this equation as before.

We can carry the same ideas over to a global
problem, working on a G-bundle P over a general
Riemannian manifold M. The space of connections on
P is an affine space $\mathcal{A}$: any two connections differ by a
bundle-valued 1-form. Now the gauge group $\mathcal{G}$ of
automorphisms of P acts on $\mathcal{A}$ and, again, two
connections in the same orbit of this action represent
essentially the same geometric object. Thus, in a sense
we would really like to work on the quotient space
$\mathcal{A}/\mathcal{G}$. Working locally in the space of connections, near
some $A_0$, this is quite straightforward. We represent
the nearby connections as $A_0 + a$, where $a$ satisfies the
analog of the Coulomb condition


$$d^*_{A_0} a = 0$$


Under suitable hypotheses, this condition picks out a
unique representative of each nearby orbit. However,
this gauge-fixing condition need not single out a
unique representative if we are far away from $A_0$:
indeed, the space $\mathcal{A}/\mathcal{G}$ typically has, unlike $\mathcal{A}$, a
complicated topology, which means that it is impossible
to find any such global gauge-fixing condition.
As noted above, this is one of the distinctive features
of gauge theory. The gauge group $\mathcal{G}$ is an
infinite-dimensional group, but one of a comparatively
straightforward kind, much less complicated than
the diffeomorphism groups relevant in Riemannian
geometry, for example. One could argue that one of
the most important influences of gauge theory has
been to accustom mathematicians to working with
infinite-dimensional symmetry groups in a comparatively
simple setting.


Analysis and Variational Methods 
The Yang-Mills Functional 


A primary object brought to mathematicians' attention
by physics is the Yang-Mills functional


$$YM(A) = \int_M |F_A|^2\, d\mu$$



Clearly, YM(A) is non-negative and vanishes if and
only if the connection is flat: it is broadly analogous
to functionals such as the area functional in minimal
submanifold theory, or the energy functional for
maps. As such, it fits into a general framework
associated with such functionals. The Euler-Lagrange
equations are the Yang-Mills equations


$$d_A^* F_A = 0$$


For any solution (a Yang-Mills connection), there is
a “Jacobi operator” $H_A$ such that the second
variation is given by


$$YM(A + ta) = YM(A) + t^2\langle H_A a, a\rangle + O(t^3)$$
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The omnipresent phenomenon of gauge invar- 
iance means that Yang—Mills connections are never 
isolated, since we can always generate an infinite- 
dimensional family by gauge transformations. Thus, 
as explained in the last section, one imposes the 
gauge-fixing condition $d_A^* a = 0$. Then the operator
$H_A$ can be written as


$$H_A a = \Delta_A a + [F_A, a]$$


where $\Delta_A$ is the bundle-valued “Hodge Laplacian”
$d_A d_A^* + d_A^* d_A$ and the expression $[F_A, a]$ combines
the bracket in the Lie algebra with the action of
$\Lambda^2$ on $\Lambda^1$. This is a self-adjoint elliptic operator
and, if M is compact, the span of the negative
eigenspaces is finite dimensional, the dimension
being defined to be the index of the Yang-Mills
connection A.

In this general setting, a natural aspiration is to 
construct a “Morse theory" for the functional. Such 
a theory should relate the topology of the ambient 
space to the critical points and their indices. In the 
simplest case, one could hope to show that for any 
bundle P there is a Yang-Mills connection with 
index 0, giving a minimum of the functional. More 
generally, the relevant ambient space here is the 
quotient A/G and one might hope that the rich 
topology of this is reflected in the solutions to the 
Yang-Mills equations. 


Uhlenbeck’s Theorem 


The essential foundation needed to underpin such a
“direct method” in the calculus of variations is an
appropriate compactness theorem. Here the dimension
of the base manifold M enters in a crucial way.
Very roughly, when a connection is represented
locally in a Coulomb gauge, the Yang-Mills action
combines the $L^2$-norm of the derivative of the
connection form A with the $L^2$-norm of the
quadratic term [A, A]. The latter can be estimated
by the $L^4$-norm of A. If $\dim M \le 4$, then the
Sobolev inequalities allow the $L^4$-norm of A to be
controlled by the $L^2$-norm of its derivative, but this
is definitely not true in higher dimensions. Thus,
$\dim M = 4$ is the “critical dimension” for this
variational problem. This is related to the fact that
the Yang-Mills equations (and Yang-Mills functional)
are conformally invariant in four dimensions.
For any nontrivial Yang-Mills connection over the
4-sphere, one generates a one-parameter family of
Yang-Mills connections, on which the functional
takes the same value, by applying conformal
transformations corresponding to dilations of $\mathbb{R}^4$.
In such a family of connections the integrand $|F_A|^2$,
the “curvature density,” converges to a $\delta$-function
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at the origin. More generally, one can encounter 
sequences of connections over 4-manifolds for
which YM is bounded but which do not converge,
the Yang-Mills density converging to $\delta$-functions.
There is a detailed analogy with the theory of
the harmonic maps energy functional, where the
relevant critical dimension (for the domain of the
map) is 2.
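
The concentration phenomenon can be illustrated numerically with the standard charge-one instanton on $\mathbb{R}^4$. The sketch below is an illustration only: it assumes the curvature density $|F|^2 = 48\rho^4/(|x|^2+\rho^2)^4$ for an instanton of scale $\rho$, a normalization chosen so that the total energy is $8\pi^2$ (the scale $\rho$ here is unrelated to any other use of the symbol). The function names and the quadrature scheme are hypothetical choices.

```python
import math

def bpst_density(r, rho):
    """Assumed curvature density |F|^2 of a charge-one instanton of scale rho at radius r."""
    return 48.0 * rho ** 4 / (r * r + rho * rho) ** 4

def yang_mills_energy(rho, steps=100000):
    """Integrate the density over R^4 in polar coordinates.

    Uses the substitution r = tan(phi), phi in (0, pi/2), and the midpoint rule;
    2*pi^2 is the volume of the unit 3-sphere.
    """
    dphi = (math.pi / 2) / steps
    total = 0.0
    for i in range(steps):
        phi = (i + 0.5) * dphi
        r = math.tan(phi)
        jacobian = 1.0 + r * r  # dr/dphi = sec^2(phi)
        total += bpst_density(r, rho) * 2 * math.pi ** 2 * r ** 3 * jacobian * dphi
    return total

energies = {rho: yang_mills_energy(rho) for rho in (0.1, 1.0, 10.0)}
peaks = {rho: bpst_density(0.0, rho) for rho in (0.1, 1.0, 10.0)}
```

The computed energies agree with $8\pi^2$ for every scale, while the peak density $48/\rho^4$ grows without bound as $\rho \to 0$: the same total energy is squeezed into an ever smaller ball, which is the $\delta$-function behavior described above.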

The result of Uhlenbeck (1982), which makes
these ideas precise, considers connections over a ball
$B^n \subset \mathbb{R}^n$. If the exponent p satisfies $2p \ge n$, then there are
positive constants $\epsilon(p, n)$, $C(p, n) > 0$ such that any
connection with $\|F\|_{L^p(B^n)} \le \epsilon$ can be represented in
Coulomb gauge over the ball, by a connection form
which satisfies the condition $d^*A = 0$, together with
certain boundary conditions, and


$$\|A\|_{L^p_1} \le C\,\|F\|_{L^p}$$


In this Coulomb gauge, the Yang-Mills equations
are elliptic and it follows readily that, in this setting,
if the connection A is Yang-Mills one can obtain
estimates on all derivatives of A.


Instantons in Four Dimensions 


This result of Uhlenbeck gives the analytical basis
for the direct method of the calculus of variations
for the Yang-Mills functional over base manifolds
M of dimension $\le 3$. For example, any bundle over
such a manifold must admit a Yang-Mills connection,
minimizing the functional. Such a statement
is definitely false in dimensions $\ge 5$. For example,
an early result of Bourguignon and Lawson (1981)
and Simons asserts that there is no minimizing
connection on any bundle over $S^n$ for $n \ge 5$. The
proof exploits the action of the conformal transformations
of the sphere. In the critical dimension
4, the situation is much more complicated. In four
dimensions, there are the renowned “instanton”
solutions of the Yang-Mills equation. Recall that
if M is an oriented 4-manifold the Hodge
$*$-operation is an involution of $\Lambda^2 T^*M$ which
decomposes the 2-forms into self-dual and
anti-self-dual parts, $\Lambda^2 T^*M = \Lambda^+ \oplus \Lambda^-$. The curvature
of a connection can then be written as


$$F_A = F_A^+ + F_A^-$$

and a connection is a self-dual (respectively
anti-self-dual) instanton if $F_A^-$ (respectively $F_A^+$) is 0.
The Yang-Mills functional is


$$YM(A) = \|F_A^+\|^2 + \|F_A^-\|^2$$


while the difference $\|F_A^+\|^2 - \|F_A^-\|^2$ is a topological
invariant $\kappa(P)$ of the bundle P, obtained by
evaluating a four-dimensional characteristic class
on [M]. Depending on the sign of $\kappa(P)$, the self-dual
or anti-self-dual connections (if any exist)
minimize the Yang-Mills functional among all
connections on P. These instanton solutions of
the Yang-Mills equations are analogous to the
holomorphic maps from a Riemann surface to a
Kähler manifold, which minimize the harmonic
maps energy functional in their homotopy class.
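
The decomposition into self-dual and anti-self-dual parts is easy to verify at a single point, where a 2-form on Euclidean $\mathbb{R}^4$ is just an antisymmetric $4\times 4$ array. The following sketch is illustrative only; it assumes the convention $(*F)_{kl} = \tfrac{1}{2}\epsilon_{ijkl}F_{ij}$ and the pointwise norm $|F|^2 = \sum_{i<j} F_{ij}^2$, and all names and sample values are hypothetical.

```python
from itertools import permutations

def perm_sign(p):
    """Sign of a permutation of (0, ..., n-1), computed by sorting with transpositions."""
    p = list(p)
    sign = 1
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]
            sign = -sign
    return sign

def hodge_star(F):
    """(*F)_{kl} = 1/2 * eps_{ijkl} F_{ij} for an antisymmetric 4x4 array F."""
    S = [[0.0] * 4 for _ in range(4)]
    for p in permutations(range(4)):
        i, j, k, l = p
        S[k][l] += 0.5 * perm_sign(p) * F[i][j]
    return S

def combine(F, G, c):
    """Entrywise 0.5 * (F + c * G)."""
    return [[0.5 * (F[r][s] + c * G[r][s]) for s in range(4)] for r in range(4)]

def norm2(F):
    """Pointwise norm |F|^2 = sum over i < j of F_ij^2."""
    return sum(F[i][j] ** 2 for i in range(4) for j in range(i + 1, 4))

def antisymmetric(upper):
    """Build an antisymmetric 4x4 array from a dict {(i, j): value} with i < j."""
    F = [[0.0] * 4 for _ in range(4)]
    for (i, j), v in upper.items():
        F[i][j], F[j][i] = v, -v
    return F

# Illustrative components only.
F = antisymmetric({(0, 1): 1.0, (0, 2): 2.0, (0, 3): -1.0,
                   (1, 2): 0.5, (1, 3): 3.0, (2, 3): -2.0})
star_F = hodge_star(F)
F_plus = combine(F, star_F, +1.0)    # self-dual part
F_minus = combine(F, star_F, -1.0)   # anti-self-dual part
```

Applying `hodge_star` twice returns `F`, the parts satisfy $*F^\pm = \pm F^\pm$, and `norm2(F)` equals `norm2(F_plus) + norm2(F_minus)`, the pointwise version of the formula for YM(A). Since YM(A) is the sum and $\kappa(P)$ the difference of the two squared norms, one has $YM(A) \ge |\kappa(P)|$, with equality exactly for instantons.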


Moduli Spaces 


The instanton solutions typically occur in “moduli
spaces.” To fix ideas, let us consider bundles with
structure group SU(2), in which case $\kappa(P) = -8\pi^2 c_2(P)$.
For each $k > 0$, we have a moduli space
$M_k$ of anti-self-dual instantons on a bundle $P_k \to M^4$,
with $c_2(P_k) = k$. It is a manifold of dimension $8k - 3$.
The general goal of the calculus of variations in this 
setting is to relate three things: 


1. the topology of the space $\mathcal{A}/\mathcal{G}$ of equivalence
classes of connections on $P_k$;

2. the topology of the moduli space $M_k$ of
instantons; and

3. the existence and indices of other, nonminimal,
solutions to the Yang-Mills equations on $P_k$.


In this direction, a very influential conjecture was
made by Atiyah and Jones (1978). They considered
the case when $M = S^4$ and, to avoid certain
technicalities, worked with spaces of “framed” connections,
dividing by the restricted group $\mathcal{G}_0$ of
gauge transformations equal to the identity at
infinity. Then, for any k, the quotient $\mathcal{A}/\mathcal{G}_0$ is
homotopy equivalent to the third loop space $\Omega^3 S^3$
of based maps from the 3-sphere to itself. The
corresponding “framed” moduli space $\tilde{M}_k$ is a
manifold of dimension 8k (a bundle over $M_k$ with
fiber SO(3)). Atiyah and Jones conjectured that
the inclusion $\tilde{M}_k \to \mathcal{A}/\mathcal{G}_0$ induces an isomorphism
of homotopy groups $\pi_l$ in a range of dimensions
$l \le l(k)$, where $l(k)$ increases with k. This would be
consistent with what one might hope to prove by the
calculus of variations if there were no other Yang-Mills
solutions, or if the indices of such solutions
increased with k.

The first result along these lines was due to
Bourguignon and Lawson (1981), who showed that
the instanton solutions are the only local minima of
the Yang-Mills functional over the 4-sphere. Subsequently,
Taubes (1983) showed that the index of a
non-instanton Yang-Mills connection on $P_k$ is at least
$k + 1$. Taubes' proof used ideas related to the action
of the quaternions and the hyper-Kähler structure on
the $\tilde{M}_k$ (see the section on hyper-Kähler quotients).
Contrary to some expectations, it was shown by
Sibner et al. (1989) that nonminimal solutions do
exist; some later constructions were very explicit
(Sadun and Segert 1992). Taubes' index bound gave
ground for hope that an analytical proof of the
Atiyah-Jones conjecture might be possible, but this
is not at all straightforward. The problem is that in
the critical dimension 4 a mini-max sequence for the
Yang-Mills functional in a given homotopy class
may diverge, with curvature densities converging to
sums of $\delta$-functions as outlined above. This is
related to the fact that the $M_k$ are not compact. In
a series of papers culminating in a framework for
Morse theory for the Yang-Mills functional, Taubes
(1998) succeeded in proving a partial version of the
Atiyah-Jones conjecture, together with similar
results for general base manifolds $M^4$. Taubes
showed that, if the homotopy groups of the moduli
spaces stabilize as $k \to \infty$, then the limit must be
that predicted by Atiyah and Jones. Related analytical
techniques were developed for other variational
problems at the critical dimension involving “critical
points at infinity.” The full Atiyah-Jones conjecture
was established by Boyer et al., but using geometrical
techniques: the “explicit” description of the moduli
spaces obtained from the Atiyah-Drinfeld-Hitchin-Manin
(ADHM) construction (see below). A different
geometrical proof was given by Kirwan (1994),
together with generalizations to other gauge groups.

There was a parallel story for the solutions of the
Bogomolny equation over $\mathbb{R}^3$, which we will not
recount in detail. Here the base dimension is below
the critical case but the analytical difficulty arises
from the noncompactness of $\mathbb{R}^3$. Taubes succeeded
in overcoming this difficulty and obtained relations 
between the topology of the moduli space, the 
appropriate configuration space and the higher 
critical points. Again, these higher critical points 
exist but their index grows with the numerical 
parameter corresponding to k. At about the same 
time, Donaldson (1984) showed that the moduli 
spaces could be identified with spaces of rational 
maps (subsequently extended to other gauge 
groups). The analog of the Atiyah—Jones conjecture 
is a result on the topology of spaces of rational maps 
proved earlier by Segal, which had been one of the 
motivations for Atiyah and Jones. 


Higher Dimensions 


While the scope for variational methods in Yang- 
Mills theory in higher dimensions is very limited, 
there are useful analytical results about solutions of 
the Yang-Mills equations. An important monotoni- 
city result was obtained by Price (1983). For 
simplicity, consider a Yang-Mills connection over 
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the unit ball B" c R”. Then Price showed that the 
normalized energy 


$$E(A, B(r)) = \frac{1}{r^{n-4}}\int_{|x|\le r} |F_A|^2\, d\mu$$


decreases as r decreases (that is, it is a nondecreasing
function of r). Nakajima (1988) and Uhlenbeck used
this monotonicity to show that for each n there is an
$\epsilon > 0$ such that if A is a Yang-Mills connection over a ball
with $E(A, B(r)) \le \epsilon$ then all derivatives of A, in a
suitable gauge, can be controlled by $E(A, B(r))$. Tian
(2000) showed that if $A_i$ is a sequence of Yang-Mills
connections over a compact manifold M with bounded
Yang-Mills functional, then there is a subsequence
which converges away from a set Z of Hausdorff
codimension at least 4 (extending the case of points in a
4-manifold). Moreover, the singular set Z is a minimal
subvariety, in a suitably generalized sense.

In higher dimensions, important examples of 
Yang-Mills connections arise within the framework 
of “calibrated geometry." Here, we consider a 
Riemannian n-manifold M with a covariant constant
calibrating form $\Omega \in \Omega^{n-4}_M$. There is then an analog
of the instanton equation


$$F_A = \pm *(\Omega \wedge F_A)$$


whose solutions minimize the Yang-Mills functional. 
This includes the Hermitian Yang-Mills equation 
over a Kähler manifold (see the section on moment 
maps) and also certain equations over manifolds with 
special holonomy groups (Donaldson and Thomas 
1998). For these “higher-dimensional instantons,” 
Tian shows that the singular sets Z that arise are 
calibrated varieties. 


Gluing Techniques 


Another set of ideas from PDEs and analysis which 
has had great impact in gauge theory involves the 
construction of solutions to appropriate equations 
by the following general scheme: 


1. constructing an “approximate solution,” formed 
from some standard models using cutoff 
functions; 

2. showing that the approximate solution can be 
deformed to a true solution by means of an 
implicit function theorem. 


The heart of the second step usually consists of 
estimates for the relevant linear differential opera- 
tor. Of course, the success of this strategy depends 
on the particular features of the problem. This 
approach, due largely to Taubes, has been particu- 
larly effective in finding solutions to the first-order 
instanton equations and their relatives. (The applic- 
ability of the approach is connected to the fact that 
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such solutions typically occur in moduli spaces and 
one can often “see” local coordinates in the moduli 
space by varying the parameters in the approximate 
solution.) Taubes applied this approach to the 
Bogomolny monopole equation over $\mathbb{R}^3$ (Jaffe and
Taubes 1980) and to construct instantons over
general 4-manifolds (Taubes 1982). In the latter
case, the approximate solutions are obtained by
transplanting standard solutions over $\mathbb{R}^4$, with
curvature density concentrated in a small ball, to
small balls on the 4-manifold, glued to the trivial
flat connection over the remainder of the manifold.
These types of techniques have now become a fairly 
standard part of the armory of many differential 
geometers, working both within gauge theory and 
other fields. An example of a problem where similar
ideas have been used is Joyce's construction of
compact manifolds with exceptional holonomy
groups (Joyce 1996). (Of course, it is likely that
similar techniques have been developed over the
years in many other areas, but Taubes' work in
gauge theory has done a great deal to bring them
into prominence.)


Geometry: Integrability and Moduli 
Spaces 


The Ward Correspondence 


Suppose that S is a complex surface and $\omega$ is the
2-form corresponding to a Hermitian metric on S.
Then S is an oriented Riemannian 4-manifold and $\omega$
is a self-dual form. The orthogonal complement of $\omega$
in $\Lambda^+$ can be identified with the real parts of forms
of type (0,2). Hence, if A is an anti-self-dual
instanton connection on a principal U(r)-bundle over
S the (0,2) part of the curvature of A vanishes. This
is the integrability condition for the $\bar\partial$-operator
defined by the connection, acting on sections of the
associated vector bundle $E \to S$. Thus, in the
presence of the connection, the bundle E is naturally
a holomorphic bundle over S.

The Ward correspondence (Ward 1977) builds
on this idea to give a complete translation of the
instanton equations over certain Riemannian
4-manifolds into holomorphic geometry. In the
simplest case, let A be an instanton on a bundle
over $\mathbb{R}^4$. Then, for any choice of a linear complex
structure on $\mathbb{R}^4$, compatible with the metric, A
defines a holomorphic structure. The choices of
such a complex structure are parametrized by a
2-sphere; in fact, the unit sphere in $\Lambda^+(\mathbb{R}^4)$. So, for
any $\lambda \in S^2$ we have a complex surface $S_\lambda$ and a
holomorphic bundle over $S_\lambda$. These data can be
viewed in the following way. We consider the
projection $\pi: \mathbb{R}^4 \times S^2 \to \mathbb{R}^4$ and the pull-back
$\pi^*(E)$ to $\mathbb{R}^4 \times S^2$. This pull-back bundle has a
connection which defines a holomorphic structure
along each fiber $S_\lambda \subset \mathbb{R}^4 \times S^2$ of the other projection.
The product $\mathbb{R}^4 \times S^2$ is the twistor space of $\mathbb{R}^4$
and it is in a natural way a three-dimensional
complex manifold. It can be identified with the
complement of a line $L_\infty$ in $\mathbb{CP}^3$, where the projection
$\mathbb{R}^4 \times S^2 \to S^2$ becomes the fibration of $\mathbb{CP}^3 \setminus L_\infty$
by the complex planes through $L_\infty$. One can see
then that $\pi^*(E)$ is naturally a holomorphic bundle
over $\mathbb{CP}^3 \setminus L_\infty$. The construction extends to the
conformal compactification $S^4$ of $\mathbb{R}^4$. If $S^4$ is viewed
as the quaternionic projective line $\mathbb{HP}^1$ and we
identify $\mathbb{H}^2$ with $\mathbb{C}^4$ in the standard way, we get a
natural map $\pi: \mathbb{CP}^3 \to \mathbb{HP}^1$. Then $\mathbb{CP}^3$ is the
twistor space of $S^4$ and an anti-self-dual instanton
on a bundle E over $S^4$ induces a holomorphic
structure on the bundle $\pi^*(E)$ over $\mathbb{CP}^3$.

In general, the twistor space $Z$ of an oriented 
Riemannian 4-manifold $M$ is defined to be the unit 
sphere bundle in $\Lambda^+_M$. This has a natural 
almost-complex structure which is integrable if and only if 
the self-dual part of the Weyl curvature of M 
vanishes (Atiyah et al. 1978). The antipodal map 
on the 2-sphere induces an antiholomorphic involu- 
tion of Z. In such a case, an anti-self-dual instanton 
over M lifts to a holomorphic bundle over Z. 
Conversely, a holomorphic bundle over Z which is 
holomorphically trivial over the fibers of the fibration 
$Z \to M$ (projective lines in $Z$), and which 
satisfies a certain reality condition with respect to 
the antipodal map, arises from a unitary instanton 
over M. This is the Ward correspondence, part of 
Penrose's twistor theory. 


The ADHM Construction 


The problem of describing all solutions to the 
Yang-Mills instanton equation over $S^4$ is thus 
reduced to a problem in algebraic geometry, of 
classifying certain holomorphic vector bundles. This 
was solved by Atiyah et al. (1978). The resulting 
ADHM construction reduces the problem to certain 
matrix equations. The equations can be reduced to 
the following form. For a bundle with Chern class $k$ and 
rank $r$, we require a pair of $k \times k$ matrices $\alpha_1, \alpha_2$, a 
$k \times r$ matrix $a$, and an $r \times k$ matrix $b$. Then the 
equations are 

\[
[\alpha_1, \alpha_2] = ab, \qquad
[\alpha_1, \alpha_1^*] + [\alpha_2, \alpha_2^*] = aa^* - b^*b
\tag{1}
\]

We also require certain open, nondegeneracy conditions. 
Given such matrix data, a holomorphic 
bundle over $\mathbf{CP}^3$ is constructed via a "monad": a 
pair of bundle maps over $\mathbf{CP}^3$ 

\[
\mathbf{C}^k \otimes \mathcal{O}(-1) \xrightarrow{\;D_1\;} \mathbf{C}^k \oplus \mathbf{C}^k \oplus \mathbf{C}^r \xrightarrow{\;D_2\;} \mathbf{C}^k \otimes \mathcal{O}(1)
\]

with $D_2 D_1 = 0$. That is, the rank-$r$ holomorphic 
bundle we construct is $\mathrm{Ker}\, D_2 / \mathrm{Im}\, D_1$. The bundle 
maps $D_1, D_2$ are obtained from the matrix data in a 
straightforward way, in suitable coordinates. It is 
this matrix description which was used by Boyer 
et al. to prove the Atiyah-Jones conjecture on the 
topology of the moduli spaces of instantons. The 
only other case when the twistor space of a compact 
4-manifold is an algebraic variety is the complex 
projective plane, with the nonstandard orientation. 
An analog of the ADHM description in this case was 
given by Buchdahl (1986). 
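
The matrix equations above can be checked numerically in the smallest case. The following is a minimal sketch, assuming $k = 1$ and $r = 2$; the particular data (a point $(\alpha_1, \alpha_2)$ of $\mathbf{C}^2$ and a scale `rho`) and all helper names are illustrative choices, not taken from the text, and the open nondegeneracy conditions are not tested. 

```python
# Pure-Python check of the two ADHM matrix equations
#   [alpha1, alpha2] = a b
#   [alpha1, alpha1*] + [alpha2, alpha2*] = a a* - b* b
# Matrices are nested lists; all names here are hypothetical helpers.

def matmul(x, y):
    return [[sum(x[i][m] * y[m][j] for m in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def dagger(x):
    """Conjugate transpose of a matrix."""
    return [[complex(x[i][j]).conjugate() for i in range(len(x))]
            for j in range(len(x[0]))]

def add(x, y):
    return [[x[i][j] + y[i][j] for j in range(len(x[0]))] for i in range(len(x))]

def sub(x, y):
    return [[x[i][j] - y[i][j] for j in range(len(x[0]))] for i in range(len(x))]

def commutator(x, y):
    return sub(matmul(x, y), matmul(y, x))

def max_abs(x):
    return max(abs(entry) for row in x for entry in row)

def adhm_residuals(alpha1, alpha2, a, b):
    """Return the sizes of the violations of the two equations."""
    complex_eq = sub(commutator(alpha1, alpha2), matmul(a, b))
    lhs = add(commutator(alpha1, dagger(alpha1)),
              commutator(alpha2, dagger(alpha2)))
    rhs = sub(matmul(a, dagger(a)), matmul(dagger(b), b))
    return max_abs(complex_eq), max_abs(sub(lhs, rhs))

# Illustrative k = 1, r = 2 data: 1 x 1 matrices commute, so the equations
# reduce to a b = 0 and |a|^2 = |b|^2.
rho = 0.5
alpha1 = [[1.0 + 2.0j]]
alpha2 = [[-0.5j]]
a = [[rho, 0.0]]        # k x r
b = [[0.0], [rho]]      # r x k

print(adhm_residuals(alpha1, alpha2, a, b))
```

In this sketch the five real parameters (the two complex numbers $\alpha_1, \alpha_2$ and the scale `rho`) match the dimension of the open 5-ball that appears later as the $c_2 = 1$ moduli space over the 4-sphere. 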


Integrable Systems 


The Ward correspondence can be viewed in the general 
framework of integrable systems. Working with the 
standard complex structure on $\mathbf{R}^4$, the integrability 
condition for the $\bar\partial$-operator takes the shape 

\[
[\nabla_1 + i\nabla_2, \nabla_3 + i\nabla_4] = 0
\]

where $\nabla_i$ are the components of the covariant 
derivative in the coordinate directions. So, the 
instanton equation can be viewed as a family of 
such commutator equations parametrized by $\lambda \in S^2$. 
One obtains many reductions of the instanton 
equation by imposing suitable symmetries. Solutions 
invariant under translation in one variable correspond 
to the Bogomolny "monopole equation" 
(Jaffe and Taubes 1980). Solutions invariant under 
three translations correspond to solutions of Nahm's 
equations, 

\[
\frac{dT_i}{dt} = \epsilon_{ijk} [T_j, T_k]
\]

for matrix-valued functions $T_1, T_2, T_3$ of one 
variable $t$. Nahm (1982) and Hitchin (1983) 
developed an analog of the ADHM construction 
relating these two equations. This is now seen as a 
part of a general "Fourier-Mukai-Nahm transform" 
(Donaldson and Kronheimer 1990). The 
instanton equations for connections invariant 
under two translations, Hitchin's equations 
(Hitchin 1983), are locally equivalent to the 
harmonic map equation for a surface into the 
symmetric space dual to the structure group. 
Changing the signature of the metric on $\mathbf{R}^4$ to (2, 2), 
one gets the harmonic mapping equations into Lie 
groups (Hitchin 1990). More complicated reductions 
yield almost all the known examples of 
integrable PDEs as special forms of the instanton 
equations (Mason and Woodhouse 1996). 


Moment Maps: the Kobayashi-Hitchin Conjecture 


Let $\Sigma$ be a compact Riemann surface. The Jacobian 
of $\Sigma$ is the complex torus $H^1(\Sigma, \mathcal{O})/H^1(\Sigma, \mathbf{Z})$; it 
parametrizes holomorphic line bundles of degree 0 
over $\Sigma$. The Hodge theory (which was, of course, 
developed long before Hodge in this case) shows 
that the Jacobian can also be identified with the 
torus $H^1(\Sigma, \mathbf{R})/H^1(\Sigma, \mathbf{Z})$, which parametrizes flat 
U(1)-connections. That is, any holomorphic line 
bundle of degree 0 admits a unique compatible flat 
unitary connection. 

The generalization of these ideas to bundles of 
higher rank began with Weil. He observed that any 
holomorphic vector bundle of degree 0 admits a flat 
connection, not necessarily unitary. Narasimhan and 
Seshadri (1965) showed that (in the case of degree 0) 
the existence of a flat, irreducible, unitary connec- 
tion was equivalent to an algebro-geometric condi- 
tion of stability which had been introduced shortly 
before by Mumford, for quite different purposes. 
Mumford introduced the stability condition in order 
to construct separated moduli spaces of holomorphic 
bundles, generalizing the Jacobian, as 
part of his general geometric invariant theory. For 
bundles of nonzero degree, the discussion is slightly 
modified by the use of projectively flat unitary 
connections. The result of Narasimhan and Seshadri 
asserts that there are two different descriptions of 
the same moduli space $\mathcal{M}_{r,d}(\Sigma)$: either as 
parametrizing certain irreducible projectively flat 
unitary connections (representations of $\pi_1(\Sigma)$), or 
parametrizing stable holomorphic bundles of degree 
$d$ and rank $r$. While Narasimhan and Seshadri 
probably did not view the ideas in these terms, 
another formulation of their result is that a certain 
nonlinear PDE for a Hermitian metric on a 
holomorphic bundle — analogous to the Laplace 
equation in the abelian case — has a solution when 
the bundle is stable. 

Atiyah and Bott (1982) cast these results in the 
framework of gauge theory. (The Yang-Mills 
equations in two dimensions essentially reduce to 
the condition that the connection be flat, so they are 
rather trivial locally but have interesting global 
structure.) They made the important observation 
that the curvature of a connection furnishes a map 

\[
F : \mathcal{A} \to \mathrm{Lie}(\mathcal{G})^*
\]

which is an equivariant moment map for the action 
of the gauge group $\mathcal{G}$ on $\mathcal{A}$. Here the symplectic form 
on the affine space $\mathcal{A}$ and the map from the adjoint 
bundle-valued 2-forms to the dual of the Lie algebra 
of $\mathcal{G}$ are both given by integration of products of 
forms. From this point of view, the Narasimhan- 
Seshadri result is an infinite-dimensional example of 
a general principle relating symplectic and complex 
quotients. At about the same time, Hitchin and 
Kobayashi independently proposed an extension of 
these ideas to higher dimensions. Let E be a 
holomorphic bundle over a complex manifold V. 
Any compatible unitary connection on E has 
curvature $F$ of type (1,1). Let $\omega$ be the (1,1)-form 
corresponding to a fixed Hermitian metric on $V$. 
The Hermitian Yang-Mills equation is the equation 

\[
F \cdot \omega = \mu 1_E
\]

where $\mu$ is a constant (determined by the topological 
invariant $c_1(E)$). The Kobayashi-Hitchin conjecture 
is that, when $\omega$ is Kähler, this equation has an 
irreducible solution if and only if $E$ is a stable 
bundle in the sense of Mumford. Just as in the 
Riemann surface case, this equation can be viewed 
as a nonlinear second-order PDE of Laplace type for 
a metric on $E$. The moment map picture of Atiyah 
and Bott also extends to this higher-dimensional 
version. In the case when $V$ has complex dimension 
2 (and $\mu$ is zero), the Hermitian Yang-Mills 
connections are exactly the anti-self-dual instantons, 
so the conjecture asserts that the moduli spaces of 
instantons can be identified with certain moduli 
spaces of stable holomorphic bundles. 

The Kobayashi-Hitchin conjecture was proved in 
the most general form by Uhlenbeck and Yau 
(1986), and in the case of algebraic manifolds in 
Donaldson (1987). The proofs in Donaldson (1985, 
1987) developed some extra structure surrounding 
these equations, connected with the moment map 
point of view. The equations can be obtained as the 
Euler-Lagrange equations for a nonlocal functional, 
related to the renormalized determinants of Quillen 
and Bismut. The results have been extended to 
non-Kähler manifolds and certain noncompact manifolds. 
There are also many extensions to equations 
for systems of data comprising a bundle with 
additional structure such as a holomorphic section 
or Higgs field (Bradlow et al. 1995), or a parabolic 
structure along a divisor. Hitchin's equations 
(Hitchin 1987) are a particularly rich example. 


Topology of Moduli Spaces 


The moduli spaces $\mathcal{M}_{r,d}(\Sigma)$ of stable holomorphic 
bundles/projectively flat unitary connections over 
Riemann surfaces $\Sigma$ have been studied intensively 
from many points of view. They have natural Kähler 
structures: the complex structure being visible in the 
holomorphic bundles guise and the symplectic form 
as the "Marsden-Weinstein quotient" in the unitary 
connections guise. In the case when $r$ and $d$ are 
coprime, they are compact manifolds with complicated 
topologies. There is an important basic 
construction for producing cohomology classes 
over these (and other) moduli spaces. One takes a 
universal bundle $U$ over the product $\mathcal{M} \times \Sigma$ with 
Chern classes 

\[
c_i(U) \in H^{2i}(\mathcal{M} \times \Sigma)
\]

Then, for any class $\alpha \in H_j(\Sigma)$, we get a cohomology 
class $c_i(U)/\alpha \in H^{2i-j}(\mathcal{M})$. Thus, if $R_\Sigma$ is the graded 
ring freely generated by such classes, we have a 
homomorphism $\nu : R_\Sigma \to H^*(\mathcal{M})$. The questions 
about the topology of the moduli spaces which 
have been studied include: 


1. finding the Betti numbers of the moduli space $\mathcal{M}$; 

2. identifying the kernel of $\nu$; 

3. giving an explicit system of generators and 
relations for the ring $H^*(\mathcal{M})$; 

4. identifying the Pontryagin and Chern classes of 
$\mathcal{M}$ within $H^*(\mathcal{M})$; and 

5. evaluating the pairings 

\[
\int_{\mathcal{M}} \nu(W)
\]

for elements $W$ of the appropriate degree in $R_\Sigma$. 


All of these questions have now been solved quite 
satisfactorily. In early work, Newstead (1967) found 
the Betti numbers in the rank-2 case. The main aim 
of Atiyah and Bott was to apply the ideas of Morse 
theory to the Yang-Mills functional over a Riemann 
surface and they were able to reproduce Newstead's 
results in this way and extend them to higher rank. 
They also showed that the map $\nu$ is a surjection, so 
the universal bundle construction gives a system of 
generators for the cohomology. Newstead made 
conjectures on the vanishing of the Pontryagin 
and Chern classes above a certain range which were 
established by Kirwan and extended to higher rank 
by Earl and Kirwan (1999). Knowing that $R_\Sigma$ maps 
onto $H^*(\mathcal{M})$, a full set of relations can (by Poincaré 
duality) be deduced in principle from a knowledge 
of the integral pairings in (5) above, but this is not 
very explicit. A solution to (5) in the case of rank 2 
was found by Thaddeus (1992). He used results 
from the Verlinde theory (see section on 3-manifolds 
below) and the Riemann-Roch formula. Another 
point of view was developed by Witten (1991), who 
showed that the volume of the moduli space was 
related to the theory of torsion in algebraic topology 
and satisfied simple gluing axioms. These different 
points of view are compared in Donaldson (1993). 
Using a nonrigorous localization principle in infinite 
dimensions, Witten (1992) wrote down a general 
formula for the pairings (5) in any rank, and this 
was established rigorously by Jeffrey and Kirwan, 
using a finite-dimensional version of the same 
localization method. A very simple and explicit set 
of generators and relations for the cohomology (in 
the rank-2 case) was given by King and Newstead 
(1998). Finally, the quantum cohomology of the 
moduli space, in the rank-2 case, was identified 
explicitly by Muñoz (1999). 


Hyper-Kähler Quotients 


Much of this story about the structure of moduli 
spaces extends to higher dimensions and to the 
moduli spaces of connections and Higgs fields. 
A particularly notable extension of the ideas 
involves hyper-Kähler structures. Let $M$ be a 
hyper-Kähler 4-manifold, so there are three 
covariant-constant self-dual forms $\omega_1, \omega_2, \omega_3$ on $M$. These 
correspond to three complex structures $I_1, I_2, I_3$ 
obeying the algebra of the quaternions. If we single 
out one structure, say $I_1$, the instantons on $M$ can be 
viewed as holomorphic bundles with respect to $I_1$ 
satisfying the moment map condition (Hermitian 
Yang-Mills equation) defined by the form $\omega_1$. 
Taking a different complex structure interchanges 
the role of the moment map and integrability 
conditions. This can be put in a general framework 
of hyper-Kähler quotients due to Hitchin et al. 
(1987). Suppose initially that $M$ is compact 
(so either a K3 surface or a torus). Then the $\omega_i$ 
components of the curvature define three maps 

\[
F_i : \mathcal{A} \to \mathrm{Lie}(\mathcal{G})^*
\]

The structures on $M$ make $\mathcal{A}$ into a flat 
hyper-Kähler manifold and the three maps $F_i$ are the 
moment maps for the gauge group action with 
respect to the three symplectic forms on $\mathcal{A}$. In this 
situation, it is a general fact that the hyper-Kähler 
quotient (the quotient by $\mathcal{G}$ of the common zero set 
of the three moment maps) has a natural 
hyper-Kähler structure. This hyper-Kähler quotient is just 
the moduli space of instantons over $M$. In the case 
when $M$ is the noncompact manifold $\mathbf{R}^4$, the same 
ideas apply except that one has to work with 
the based gauge group $\mathcal{G}_0$. The conclusion is that 
the framed moduli spaces $\mathcal{M}$ of instantons over $\mathbf{R}^4$ 
are naturally hyper-Kähler manifolds. One can also 
see this hyper-Kähler structure through the ADHM 
matrix description. A variant of these matrix 
equations was used by Kronheimer to construct 
"gravitational instantons." The same ideas also 
apply to the moduli spaces of monopoles, where 
the hyper-Kähler metric, in the simplest case, was 
studied by Atiyah and Hitchin (1989). 


Low-Dimensional Topology 


Instantons and 4-Manifolds 


Gauge theory has had unexpected applications in 
low-dimensional topology, particularly the topology 
of smooth 4-manifolds. The first work in this 
direction, in the early 1980s, involved the Yang- 
Mills instantons. The main issue in 4-manifold 
theory at that time was the correspondence between 
the diffeomorphism classification of simply con- 
nected 4-manifolds and the classification up to 
homotopy. The latter is determined by the intersection 
form, a unimodular quadratic form on the 
second integral homology group (i.e., a symmetric 
matrix with integral entries and determinant $\pm 1$, 
determined up to integral change of basis). The only 
known restriction was Rohlin's theorem, which 
asserts that if the form is even the signature must be 
divisible by 16. The achievement of the first phase of 
the theory was to show that 


1. There are unimodular forms which satisfy the 
hypotheses of Rohlin's theorem but which do not 
appear as the intersection forms of smooth 
4-manifolds. In fact, no nonstandard definite 
form, such as a sum of copies of the $E_8$ matrix, 
can arise in this way. 

2. There are simply connected smooth 4-manifolds 
which have isomorphic intersection forms, and 
hence are homotopy equivalent, but which are 
not diffeomorphic. 


These results stand in contrast to the homeomorph- 
ism classification which was obtained by Freedman 
shortly before and which is almost the same as the 
homotopy classification. 
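
The algebraic side of item (1) can be made concrete. The sketch below builds one standard presentation of the $E_8$ matrix (the node numbering of the Dynkin diagram is an assumption of this example, as are all function names) and verifies with exact arithmetic that it is unimodular, even, and definite. It then applies the divisibility test of Rohlin's theorem: a single copy has signature 8 and fails it, while a sum of two copies has signature 16 and passes, so that case is excluded only by the gauge-theoretic result described above. 

```python
from fractions import Fraction

def e8_matrix():
    """Gram matrix of E8: 2 on the diagonal, -1 for each Dynkin edge.

    Assumed numbering: a chain 0-1-2-3-4-5-6 with node 7 attached to node 4.
    """
    edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (4, 7)]
    m = [[2 if i == j else 0 for j in range(8)] for i in range(8)]
    for i, j in edges:
        m[i][j] = m[j][i] = -1
    return m

def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def leading_minors(matrix):
    """Leading principal minors; all positive means positive definite."""
    return [det([row[:k] for row in matrix[:k]])
            for k in range(1, len(matrix) + 1)]

def is_even(matrix):
    """A form is even when every diagonal entry is even."""
    return all(matrix[i][i] % 2 == 0 for i in range(len(matrix)))

def satisfies_rohlin(signature):
    """Rohlin's condition for an even form: 16 divides the signature."""
    return signature % 16 == 0

e8 = e8_matrix()
print(det(e8), is_even(e8), all(m > 0 for m in leading_minors(e8)))
print(satisfies_rohlin(8), satisfies_rohlin(16))
```

The sign convention (positive rather than negative definite) is immaterial for this check; reversing orientation changes the sign of the form. 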

The original proof of item (1) above argued with 
the moduli space $\mathcal{M}$ of anti-self-dual SU(2) 
instantons on a bundle with $c_2 = 1$ over a simply 
connected Riemannian 4-manifold $M$ with a 
negative-definite intersection form (Donaldson 1983). In the 
model case when $M$ is the 4-sphere the moduli space 
$\mathcal{M}$ can be identified explicitly with the open 5-ball. 
Thus the 4-sphere arises as the natural boundary of 
the moduli space. A sequence of points in the moduli 
space converging to a boundary point corresponds to 
a sequence of connections with curvature densities 
converging to a $\delta$-function, as described earlier. One 
shows that in the general case (under our hypotheses 
on the 4-manifold $M$) the moduli space $\mathcal{M}$ has a 
similar behavior: it contains a collar $M \times (0, \delta)$ 
formed by instantons made using Taubes' gluing 
construction, described previously. The complement 
of this collar is compact. In the interior of the moduli 
space, there are a finite number of special points 
corresponding to U(1)-reductions of the bundle $P$. 
This is the way in which the moduli space "sees" the 
integral structure of the intersection form since such 
reductions correspond to integral homology classes 
with self-intersection $-1$. Neighborhoods of these 
special points are modeled on quotients $\mathbf{C}^3/U(1)$; that 
is, cones on copies of $\mathbf{CP}^2$. The upshot is that (for 
generic Riemannian metrics on $M$) the moduli space 
gives a cobordism from the manifold $M$ to a set of 
copies of $\mathbf{CP}^2$ which can be counted in terms of the 
intersection form, and the result follows easily from 
standard topology. More sophisticated versions of the 
argument extended the results to rule out some 
indefinite intersection forms. 

On the other hand, the original proofs of item (2) 
used "invariants" defined by instanton moduli spaces 
(Donaldson 1990). The general scheme exploits the 
same construction outlined in the previous section. 
We suppose that $M$ is a simply connected 4-manifold 
with $b^+(M) = 1 + 2p$, where $p > 0$ is an integer. 
(Here $b^+(M)$ is, as usual, the number of positive 
eigenvalues of the intersection matrix.) Ignoring some 
technical restrictions, there is a map 

\[
\nu : R_M \to H^*(\mathcal{M}_k)
\]

where $R_M$ is a graded ring freely generated by 
the homology (below the top dimension) of the 
4-manifold $M$ and $\mathcal{M}_k$ is the moduli space of 
anti-self-dual SU(2)-instantons on a bundle with 
$c_2 = k > 0$. For an element $W$ in $R_M$ of the 
appropriate degree, one obtains a number by 
evaluating, or integrating, $\nu(W)$ on $\mathcal{M}_k$. The main 
technical difficulty here is that the moduli space $\mathcal{M}_k$ 
is rarely compact, so one needs to make sense of this 
"evaluation." With all the appropriate technicalities 
in place, these invariants could be shown to 
distinguish various homotopy-equivalent, homeomorphic 
4-manifolds. All these early developments 
are described in detail in the book by Donaldson 
and Kronheimer (1990). 


Basic Classes 


Until the early 1990s, these instanton invariants 
could only be calculated in isolated favorable 
cases (although the calculations which were 
made, through the work of many mathematicians, 
led to a large number of further results about 
4-manifold topology). Deeper understanding of 
their structure came with the work of Kronheimer 
and Mrowka. This work was, in large part, 
motivated by a natural question in geometric 
topology. Any homology class $\alpha \in H_2(M; \mathbf{Z})$ can 
be represented by an embedded, connected, 
smooth surface. One can define an integer $g(\alpha)$ 
to be the minimal genus of such a representative. 
The problem is to find $g(\alpha)$, or at least bounds on 
it. A well-known conjecture, ascribed to Thom, 
was that when $M$ is the complex projective plane 
the minimal genus is realized by a complex curve; 
that is, 

\[
g(\pm dH) = \tfrac{1}{2}(d - 1)(d - 2)
\]

where $H$ is the standard generator of $H_2(\mathbf{CP}^2)$ and 
$d \geq 1$. 
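
The right-hand side of this formula is the genus of a smooth complex plane curve of degree $d$, and it is easy to tabulate. The sketch below does so; the function name is a hypothetical choice for illustration. 

```python
def complex_curve_genus(d):
    """Genus (d - 1)(d - 2)/2 of a smooth plane curve of degree |d| >= 1.

    This is the value conjectured by Thom for the minimal genus g(+-dH).
    """
    d = abs(d)
    if d < 1:
        raise ValueError("the formula is stated for |d| >= 1")
    # (d - 1)(d - 2) is a product of consecutive integers, hence even.
    return (d - 1) * (d - 2) // 2

for degree in range(1, 7):
    print(degree, complex_curve_genus(degree))
```

Lines and conics ($d = 1, 2$) are spheres, cubics are tori, and the genus grows quadratically with the degree after that. 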

The new geometrical idea introduced by 
Kronheimer and Mrowka was to study instantons 
over a 4-manifold $M$ with singularities along a 
surface $\Sigma \subset M$. For such connections, there is a real 
parameter: the limit of the trace of the holonomy 
around small circles linking the surface. By 
varying this parameter, they were able to interpolate 
between moduli spaces of nonsingular 
instantons on different bundles over $M$ and obtain 
relations between the different invariants. They 
also found that if the genus of $\Sigma$ is suitably small 
then some of the invariants are forced to vanish, 
thus, conversely, getting information about $g$ for 
4-manifolds with nontrivial invariants. For example, 
they showed that if $M$ is a K3 surface then 
$g(\alpha) = (1/2)(\alpha \cdot \alpha + 2)$. 
The structural results of Kronheimer and Mrowka 
(1995) introduced the notion of a 4-manifold of 
"simple type." Write the invariant defined above by 
the moduli space $\mathcal{M}_k$ as $I_k : R_M \to \mathbf{Q}$. Then $I_k$ 
vanishes except on terms of degree $2d(k)$, where 
$d(k) = 4k - 3(1 + p)$. We can put all these together 
to define $I = \sum_k I_k : R_M \to \mathbf{Q}$. The ring $R_M$ is a 
polynomial ring generated by classes $\alpha \in H_2(M)$, 
which have degree 2 in $R_M$, and a class $X$ of degree 
4 in $R_M$, corresponding to the generator of $H_0(M)$. 
The 4-manifold is of simple type if 

\[
I(X^2 W) = 4 I(W)
\]

for all $W \in R_M$. Under this condition, Kronheimer 
and Mrowka showed that all the invariants are 
determined by a finite set of "basic" classes 
$K_1, \ldots, K_s \in H_2(M)$ and rational numbers $\beta_1, \ldots, \beta_s$. 
To express the relation, they form a generating 
function 

\[
D_M(\alpha) = I(e^{\alpha}) + I\left(\tfrac{X}{2} e^{\alpha}\right)
\]

This is a priori a formal power series in $H_2(M)$ but a 
posteriori the series converges and can be regarded 
as a function on $H_2(M)$. Kronheimer and Mrowka's 
result is that 

\[
D_M(\alpha) = \exp\left(\frac{\alpha \cdot \alpha}{2}\right) \sum_{r=1}^{s} \beta_r e^{K_r \cdot \alpha}
\]

It is not known whether all simply connected 
4-manifolds are of simple type, but Kronheimer and 
Mrowka were able to show that this is the case for a 
multitude of examples. They also introduced a 
weaker notion of "finite type," and this condition 
was shown to hold in general by Muñoz and 
Frøyshov. The overall result of this work of 
Kronheimer and Mrowka was to make the calculation 
of the instanton invariants for many familiar 
4-manifolds a comparatively straightforward matter. 
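
The degree $2d(k)$ quoted above can be cross-checked against the dimension of the moduli space. The sketch below assumes the standard dimension formula $8k - 3(1 + b^+)$ for SU(2) instantons over a simply connected 4-manifold; that formula is not stated in this section, and the function names are hypothetical. 

```python
def instanton_moduli_dimension(k, b_plus):
    """Assumed dimension 8k - 3(1 + b+) of the SU(2) instanton moduli
    space with c2 = k over a simply connected 4-manifold."""
    return 8 * k - 3 * (1 + b_plus)

def d_of_k(k, p):
    """The quantity d(k) = 4k - 3(1 + p) from the text, with b+ = 1 + 2p."""
    return 4 * k - 3 * (1 + p)

# With b+ = 1 + 2p the moduli space has even dimension 2 d(k).
for p in range(1, 4):
    for k in range(1, 6):
        assert instanton_moduli_dimension(k, 1 + 2 * p) == 2 * d_of_k(k, p)

# Model case from the text: the 4-sphere (b+ = 0) with c2 = 1.
print(instanton_moduli_dimension(1, 0))
```

For the 4-sphere with $c_2 = 1$ the assumed formula gives dimension 5, in agreement with the open 5-ball described earlier. 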


3-Manifolds: Casson’s Invariant 


Gauge theory has also entered into 3-manifold 
topology. In 1985, Casson introduced a new 
integer-valued invariant of oriented homology 
3-spheres which "counts" the set $Z$ of equivalence 
classes of irreducible flat SU(2)-connections, or 
equivalently irreducible representations $\pi_1(Y) \to$ 
SU(2). Casson's approach (Akbulut and McCarthy 
1990) was to use a Heegaard splitting of a 
3-manifold $Y$ into two handlebodies $Y^+, Y^-$ with 
a surface $\Sigma$ as common boundary. Then $\pi_1(\Sigma)$ maps 
onto $\pi_1(Y)$ and a flat SU(2) connection on $Y$ is 
determined by its restriction to $\Sigma$. Let $\mathcal{M}_\Sigma$ be the 
moduli space of irreducible flat connections over $\Sigma$ 
(as discussed in the last section) and let $L^\pm \subset \mathcal{M}_\Sigma$ be 
the subsets which extend over $Y^\pm$. Then $L^\pm$ are 
submanifolds of half the dimension of $\mathcal{M}_\Sigma$ and the 
set $Z$ can be identified with the intersection $L^+ \cap L^-$. 
The Casson invariant is one-half the algebraic 
intersection number of $L^+$ and $L^-$. Casson showed 
that this is independent of the Heegaard splitting 
(and is also, in fact, an integer, although this is not 
obvious). He showed that when $Y$ is changed by 
Dehn surgery along a knot, the invariant changes 
by a term computed from the Alexander polynomial 
of the knot. This makes the Casson invariant 
computable in examples. (For a discussion of 
Casson's formula see Donaldson (1999).) Taubes 
showed that the Casson invariant could also be 
obtained in a more differential-geometric fashion, 
analogous to the instanton invariants of 4-manifolds 
(Taubes 1990). 


3-Manifolds: Floer Theory 


Independently, at about the same time, Floer 
(1989) introduced more sophisticated invariants, 
the Floer homology groups, of homology 
3-spheres, using gauge theory. This development 
ran parallel to his introduction of similar ideas in 
symplectic geometry. Suppose, for simplicity, that 
the set $Z$ of equivalence classes of irreducible flat 
connections is finite. For pairs $\rho_-, \rho_+$ in $Z$, Floer 
considered the instantons on the tube $Y \times \mathbf{R}$ 
asymptotic to $\rho_\pm$ at $\pm\infty$. There is an infinite set 
of moduli spaces of such instantons, labeled by a 
relative Chern class, but the dimensions of these 
moduli spaces agree modulo 8. This gives a relative 
index $\delta(\rho_-, \rho_+) \in \mathbf{Z}/8$. If $\delta(\rho_-, \rho_+) = 1$ there is a 
moduli space of dimension 1 (possibly empty), but 
the translations of the tube act on this moduli space 
and, dividing by translations, we get a finite set. 
The number of points in this set, counted with 
suitable signs, gives an integer $n(\rho_-, \rho_+)$. Then, 
Floer considers the free abelian groups 

\[
C_* = \bigoplus_{\rho \in Z} \mathbf{Z}\langle\rho\rangle
\]

generated by the set $Z$ and a map $\partial : C_* \to C_*$ 
defined by 

\[
\partial(\langle\rho_-\rangle) = \sum n(\rho_-, \rho_+) \langle\rho_+\rangle
\]

Here the sum runs over the $\rho_+$ with $\delta(\rho_-, \rho_+) = 1$. 
Floer showed that $\partial^2 = 0$ and the homology 
$HF_*(Y) = \ker \partial / \mathrm{Im}\, \partial$ is independent of the metric 
on $Y$ (and various other choices made in implementing 
the construction in detail). The chain complex 
$C_*$ and hence the Floer homology can be graded by 
$\mathbf{Z}/8$, using the relative index, so the upshot is to 
define 8 abelian groups $HF_i(Y)$: invariants of the 
3-manifold $Y$. The Casson invariant appears now as 
the Euler characteristic of the Floer homology. 
There has been extensive work on extending these 
ideas to other 3-manifolds (not homology spheres) 
and gauge groups, but this line of research does not 
yet seem to have reached a clear-cut conclusion. 

Part of the motivation for Floer's work came from 
Morse theory, and particularly the approach to 
this theory expounded by Witten (1982). The 
Chern-Simons functional is a map 

\[
CS : \mathcal{A}/\mathcal{G} \to \mathbf{R}/\mathbf{Z}
\]

from the space of SU(2)-connections over $Y$. 
Explicitly, in a trivialization of the bundle 

\[
CS(A) = \int_Y \mathrm{Tr}\left( A \wedge dA + \tfrac{2}{3} A \wedge A \wedge A \right)
\]

It appears as a boundary term in the Chern-Weil 
theory for the second Chern class, in a similar way 
as holonomy appears as a boundary term in the 
Gauss-Bonnet theorem. The set $Z$ can be identified 
with the critical points of $CS$ and the instantons on 
the tube as integral curves of the gradient vector 
field of $CS$. Floer's definition mimics the definition 
of homology in ordinary Morse theory, taking 
Witten's point of view. It can be regarded formally 
as the "middle-dimensional" homology of the 
infinite-dimensional space $\mathcal{A}/\mathcal{G}$. See Atiyah (1988) 
and Cohen et al. (1995) for discussions of these 
ideas. 

The Floer theory interacts with 4-manifold invariants, 
making up a structure approximating to 
a (3+1)-dimensional topological field theory 
(Atiyah 1988). Roughly, the numerical invariants 
of closed 4-manifolds generalize to invariants for a 
4-manifold M with boundary Y taking values in the 
Floer homology of Y. If two such manifolds are 
glued along a common boundary, the invariants of 
the result are obtained by a pairing in the Floer 
groups. There are, however, at the moment, some 
substantial technical restrictions on this picture. This 
theory, as well as Floer's original construction, is 
developed in detail by Donaldson (2002). At the time 
of writing, the Floer homology groups are still difficult 
to compute in examples. One important tool is a 
surgery-exact sequence found by Floer (Braam and 
Donaldson 1995), related to Casson's surgery 
formula. 


3-Manifolds: Jones-Witten Theory 


There is another, quite different, way in which ideas 
from gauge theory have entered 3-manifold topology. 
This is the Jones-Witten theory of knot and 
3-manifold invariants. This theory falls outside the 
main line of this article, but we will say a little about 
it since it draws on many of the ideas we have 
discussed. The goal of the theory is to construct a 
family of (2 + 1)-dimensional topological field theories 
indexed by an integer $k$, assigning a complex 
vector space $H_k(\Sigma)$ to a surface $\Sigma$ and an invariant 
in $H_k(\partial Y)$ to a 3-manifold-with-boundary $Y$. If $\partial Y$ is 
empty, the vector space $H_k(\partial Y)$ is taken to be $\mathbf{C}$, so 
one seeks numerical invariants of closed 3-manifolds. 
Witten's (1989) idea is that these invariants of closed 
3-manifolds are Feynman integrals 

\[
\int_{\mathcal{A}/\mathcal{G}} e^{i 2\pi k\, CS(A)} \, \mathcal{D}A
\]

This functional integral is probably a schematic 
rather than a rigorous notion. The data associated 
with surfaces can, however, be defined rigorously. If 
we fix a complex structure $I$ on $\Sigma$, we can define a 
vector space $H_k(\Sigma, I)$ to be 

\[
H_k(\Sigma, I) = H^0(\mathcal{M}(\Sigma); L^k)
\]

where $\mathcal{M}(\Sigma)$ is the moduli space of stable holomorphic 
bundles/flat unitary connections over $\Sigma$ 
and $L$ is a certain holomorphic line bundle over 
$\mathcal{M}(\Sigma)$. These are the spaces of "conformal blocks" 
whose dimension is given by the Verlinde formulas. 
Recall that $\mathcal{M}(\Sigma)$, as a symplectic manifold, is 
canonically associated with the surface $\Sigma$, without 
any choice of complex structure. The Hilbert 
spaces $H_k(\Sigma, I)$ can be regarded as the quantization 
of this symplectic manifold, in the general framework 
of geometric quantization: the inverse of $k$ 
plays the role of Planck's constant. What is not 
obvious is that this quantization is independent of 
the complex structure chosen on the Riemann 
surface: that is, that there is a natural identification 
of the vector spaces (or at least the associated 
projective spaces) formed by using different complex 
structures. This was established rigorously by 
Hitchin (1990) and Axelrod et al. (1991), who 
constructed a projectively flat connection on the 
bundle of spaces $H_k(\Sigma, I)$ over the space of complex 
structures $I$ on $\Sigma$. At a formal level, these 
constructions are derived from the construction of 
the metaplectic representation of a linear symplectic 
group, since the $\mathcal{M}_\Sigma$ are symplectic quotients of 
an affine symplectic space. 

The Jones-Witten invariants have been rigorously 
established by indirect means, but it seems that there 
is still work to be done in developing Witten's point 
of view. If $Y^+$ is a 3-manifold with boundary, one 
would like to have a geometric definition of a vector 
in $H_k(\partial Y^+)$. This should be the quantized version of 
the submanifold $L^+$ (which is Lagrangian in $\mathcal{M}_\Sigma$) 
entering into the Casson theory. 


Seiberg-Witten Invariants 


The instanton invariants of a 4-manifold can be 
regarded as the integrals of certain natural differ- 
ential forms over the moduli spaces of instantons. 
Witten (1988) showed that these invariants could be 
obtained as functional integrals, involving a variant 
of the Feynman integral, over the space of connec- 
tions and certain auxiliary fields (insofar as this 
latter integral is defined at all). A geometric 
explanation of Witten's construction was given by 
Atiyah and Jeffrey (1990). Developing this point of 
view, Witten made a series of predictions about the 
instanton invariants, many of which were subse- 
quently verified by other means. This line of work 
culminated in 1994 when, applying developments 
in supersymmetric Yang-Mills QFT, Seiberg and 
Witten introduced a new system of invariants and a 
precise prediction as to how these should be related 
to the earlier ones. 

The Seiberg-Witten invariants (Witten 1994) are
associated with a $\mathrm{Spin}^c$ structure on a 4-manifold M.
If M is simply connected this is specified by a class
$K \in H^2(M; \mathbb{Z})$ lifting $w_2(M)$. One has spin bundles
$S^+, S^- \to M$ with $c_1(S^\pm) = K$. The Seiberg-Witten
equation is for a spinor field $\phi$, a section of $S^+$,
and a connection A on the complex line bundle
$\Lambda^2 S^+$. This gives a connection on $S^+$ and hence a
Dirac operator

$$D_A : \Gamma(S^+) \to \Gamma(S^-)$$

The Seiberg-Witten equations are

$$D_A \phi = 0, \qquad F_A^+ = \sigma(\phi)$$


where $\sigma : S^+ \to \Lambda^+$ is a certain natural quadratic
map. The crucial differential-geometric feature of
these equations arises from the Weitzenböck
formula

$$D_A^* D_A \phi = \nabla_A^* \nabla_A \phi + \frac{R}{4}\phi + \rho(F^+)\phi$$

where R is the scalar curvature and $\rho$ is a natural
map from $\Lambda^+$ to the endomorphisms of $S^+$. Then $\rho$ is
adjoint to $\sigma$ and

$$\langle \rho(\sigma(\phi))\phi, \phi \rangle = |\phi|^4$$


It follows easily from this that the moduli space of 
solutions to the Seiberg- Witten equation is compact. 
The most important invariants arise when K is 
chosen so that 


$$K \cdot K = 2\chi(M) + 3\,\mathrm{sign}(M)$$


where $\chi(M)$ is the Euler characteristic and sign(M) is
the signature. (This is just the condition for K to 
correspond to an almost-complex structure on M.) In 
this case, the moduli space of solutions is zero 
dimensional (after generic perturbation) and the 
Seiberg- Witten invariant SW(K) is the number of 
points in the moduli space, counted with suitable signs. 

Witten's conjecture relating the invariants, in its 
simplest form, is that when M has simple type the 
classes K for which SW(K) is nonzero are exactly the 
basic classes $K_s$ of Kronheimer and Mrowka and that


$$\beta_s = 2^{C(M)}\, SW(K_s)$$


where $C(M) = 2 + (1/4)(7\chi(M) + 11\,\mathrm{sign}(M))$. This
asserts that the two sets of invariants contain exactly 
the same information about the 4-manifold. 
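
As a small arithmetic illustration of the two characteristic-number expressions above, the following sketch evaluates them for two familiar examples. The topological data used (Euler characteristic 24 and signature -16 for the K3 surface with K = 0; Euler characteristic 3, signature 1 and K·K = 9 for the complex projective plane) are standard values that do not appear in this article, and the dimension formula (K·K - 2χ - 3 sign)/4 is the usual index-theory expression, assumed here rather than quoted from the text. The function names are illustrative only.

```python
from fractions import Fraction


def sw_dimension(k_squared: int, euler: int, signature: int) -> Fraction:
    """Expected dimension (K.K - 2*chi - 3*sign)/4 of the moduli space.

    This standard formula is an assumption of the sketch; it vanishes
    exactly when K.K = 2*chi + 3*sign, the condition quoted in the text.
    """
    return Fraction(k_squared - 2 * euler - 3 * signature, 4)


def witten_exponent(euler: int, signature: int) -> Fraction:
    """C(M) = 2 + (7*chi + 11*sign)/4, the exponent in Witten's conjecture."""
    return 2 + Fraction(7 * euler + 11 * signature, 4)


# Assumed textbook data: K3 surface with K = 0 (chi = 24, sign = -16).
k3_dimension = sw_dimension(0, 24, -16)
k3_exponent = witten_exponent(24, -16)

# Assumed textbook data: complex projective plane with K.K = 9.
cp2_dimension = sw_dimension(9, 3, 1)

print(k3_dimension, k3_exponent, cp2_dimension)
```

Both examples satisfy the almost-complex condition, so the expected dimension is zero in each case. For the K3 data the exponent C(M) also evaluates to zero.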

The evidence for this conjecture, via calculations of 
examples, is very strong. A somewhat weaker 
statement has been proved rigorously by Feehan and 
Leness (2003). They use an approach suggested by 
Pidstrigatch and Tyurin, studying moduli spaces of
solutions to a nonabelian version of the Seiberg- 
Witten equations. These contain both the instanton 
and abelian Seiberg- Witten moduli spaces, and the 
strategy is to relate the topology of these two sets by 
standard localization arguments. (This approach is 
related to ideas introduced by Thaddeus (1994) in the 


case of bundles over Riemann surfaces.) The serious 
technical difficulty in this approach stems from the 
lack of compactness of the nonabelian moduli spaces. 
The more general versions of Witten's conjecture 
(Moore and Witten 1997) (e.g., when $b^+(M) = 1$)
contain very complicated formulas, involving mod- 
ular forms, which presumably arise as contributions 
from the compactification of the moduli spaces. 


Applications 


Regardless of the connection with the instanton 
theory, one can go ahead directly to apply the 
Seiberg-Witten invariants to 4-manifold topology, 
and this has been the main direction of research 
since the 1990s. The features of the Seiberg-Witten 
theory which have led to the most prominent 
developments are the following. 


1. The reduction of the equations to two dimensions 
is very easy to understand. This has led to proofs 
of the Thom conjecture and wide-ranging
generalizations (Ozsváth and Szabó 2000).

2. The Weitzenböck formula implies that, if M has
positive scalar curvature, then solutions to the
Seiberg-Witten equations must have $\phi = 0$. This
has led to important interactions with
four-dimensional Riemannian geometry (LeBrun 1996).

3. In the case when M is a symplectic manifold, 
there is a natural deformation of the Seiberg- 
Witten equations, discovered by Taubes (1996), 
who used it to show that the Seiberg-Witten 
invariants of M are nontrivial. More generally, 
Taubes showed that for large values of the 
deformation parameter the solutions of the 
deformed equation localize around surfaces in 
the 4-manifold and used this to relate the 
Seiberg-Witten invariants to the Gromov theory 
of pseudoholomorphic curves. These results of 
Taubes have completely transformed the subject 
of four-dimensional symplectic geometry. 


Bauer and Furuta (2004) have combined the 
Seiberg-Witten theory with more sophisticated 
algebraic topology to obtain further results about 
4-manifolds. They consider the map from the space of 
connections and spinor fields defined by the formulas 
on the left-hand side of the equations. The general 
idea is to obtain invariants from the homotopy class 
of this map, under a suitable notion of homotopy. 
A technical complication arises from the gauge group 
action, but this can be reduced to the action of a single 
U(1). Ignoring this issue, Bauer and Furuta have 
obtained invariants in the stable homotopy groups 
$\lim_{N \to \infty} \pi_{N+r}(S^N)$, which reduce to the ordinary
numerical invariants when r = 1. Using these invariants,
they obtain results about connected sums of
4-manifolds, for which the ordinary invariants are
trivial. Using ideas from refined cobordism invariants,
Furuta made great progress towards resolving the 
question of which intersection forms arise from 
smooth, simply connected 4-manifolds. A well- 
known conjecture is that, if such a manifold is spin, 
then the second Betti number satisfies 


$$b_2(M) \geq \frac{11}{8}\,|\mathrm{sign}(M)|$$


Furuta (2001) proved that $b_2(M) \geq (10/8)|\mathrm{sign}(M)| + 2$.
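
To see how close Furuta's bound comes to the conjectured one, the sketch below compares the two inequalities, read as non-strict (≥), on connected sums of copies of the K3 surface. The input data (second Betti number 22 and signature -16 for K3, both additive under connected sum) are standard values assumed for illustration and are not taken from this article. The helper names are hypothetical.

```python
from fractions import Fraction


def furuta_bound(signature: int) -> Fraction:
    """Right-hand side of Furuta's theorem: (10/8)|sign| + 2."""
    return Fraction(10, 8) * abs(signature) + 2


def eleven_eighths_bound(signature: int) -> Fraction:
    """Right-hand side of the conjectured inequality: (11/8)|sign|."""
    return Fraction(11, 8) * abs(signature)


def k3_connected_sum(copies: int) -> tuple[int, int]:
    """(b2, signature) for a connected sum of K3 surfaces (assumed data)."""
    return 22 * copies, -16 * copies


for copies in (1, 2, 3):
    b2, sign = k3_connected_sum(copies)
    print(copies, b2, furuta_bound(sign), eleven_eighths_bound(sign))
```

With these inputs a single K3 surface attains equality in both inequalities, while for two or more copies the conjectured bound is still attained and Furuta's bound falls short of it.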

An important and very recent achievement, bringing 
together many different lines of work, is the proof of 
“Property P” in 3-manifold topology by Kronheimer 
and Mrowka (2004). This asserts that one cannot 
obtain a homotopy sphere (counter-example to the 
Poincaré conjecture) by +1-surgery along a nontrivial
knot in $S^3$. The proof uses work of Gabai and
Eliashberg to show that the manifold obtained by
0-framed surgery is embedded in a symplectic
4-manifold; Taubes' results to show that the Seiberg- 
Witten invariants of this 4-manifold are nontrivial; 
Feehan and Leness’ partial proof of Witten's con- 
jecture to show that the same is true for the instanton 
invariants; and the gluing rule and Floer's exact 
sequence to show that the Floer homology of the 
+1-surgered manifold is nontrivial. It follows then 
from the definition of Floer homology that the funda- 
mental group of this manifold is not trivial; in fact, 
it must have an irreducible representation in SU(2). 


See also: Cotangent Bundle Reduction; Floer Homology; 
Gauge Theories from Strings; Gauge Theoretic Invariants 
of 4-Manifolds; Instantons: Topological Aspects; 

Knot Homologies; Moduli Spaces: An Introduction; 
Nonperturbative and Topological Aspects of Gauge 
Theory; Seiberg—Witten Theory; Topological Quantum 
Field Theory: Overview; Variational Techniques for 
Ginzburg-Landau Energies.


Introduction


Einstein's general theory of relativity has become the 
foundation for our understanding of the gravita- 
tional interaction. Four decades of high-precision 


experiments have verified the theory with ever- 
increasing precision, with no confirmed evidence of 
a deviation from its predictions. The theory is now 
the standard framework for much of astronomy, 
with its searches for black holes, neutron stars, 
gravitational waves, and the origin and fate of the 
universe. 

Yet modern developments in particle theory 
suggest that it may not be the entire story, and that 


482 General Relativity: Experimental Tests 


modification of the basic theory may be required at 
some level. String theory generally predicts a 
proliferation of gravity-like fields that could result 
in alterations of general relativity (GR) reminiscent 
of the Brans—Dicke theory of the 1960s. In the 
presence of extra dimensions, the gravity of the four- 
dimensional “brane” of a higher-dimensional world 
could be somewhat different from a pure four- 
dimensional GR. However, any theoretical specula- 
tion along these lines must still abide by the best 
current empirical bounds. This article will review 
experimental tests of GR and the theoretical 
implications of the results. 


The Einstein Equivalence Principle 


The Einstein equivalence principle is a modern 
generalization of Einstein’s 1907 idea of an equiva- 
lence between gravity and acceleration, or between 
free fall and an absence of gravity. It states that: 
(1) test bodies fall with the same acceleration 
independently of their internal structure or composi- 
tion (weak equivalence principle, or WEP); (2) the 
outcome of any local nongravitational experiment is 
independent of the velocity of the freely falling 
reference frame in which it is performed (local 
Lorentz invariance, or LLI); and (3) the outcome of 
any local nongravitational experiment is indepen- 
dent of where and when in the universe it is 
performed (local position invariance, or LPI). 

This principle is fundamental to gravitational 
theory, for it is possible to argue that, if EEP is 
valid, then gravitation and geometry are synon- 
ymous. In other words, gravity must be described by 
a “metric theory of gravity,” in which (1) spacetime
is endowed with a symmetric metric, (2) the 
trajectories of freely falling bodies are geodesics of 
that metric, and (3) in local freely falling reference 
frames, the nongravitational laws of physics are 
those written in the language of special relativity 
(see Will (1993) for further details). 

GR is a metric theory of gravity, but so are many 
others, including the scalar-tensor theory of Brans 
and Dicke and many of its modern descendents, 
some of which are inspired by string theory. 


Tests of the Weak Equivalence Principle 


To test the WEP, one compares the acceleration of 
two laboratory-sized bodies of different composition 
in an external gravitational field. Although legend 
suggests that Galileo may have demonstrated this 
principle to his students at the Leaning Tower of Pisa, 
and Newton tested it by means of pendulum 
experiments, the first true high-precision experiments 


were done at the end of the nineteenth century by the
Hungarian physicist Baron Roland von Eötvös and
colleagues.

Eötvös employed a torsion balance, in which 
(schematically) two bodies of different composition 
are suspended at the ends of a rod that is supported 
horizontally by a fine wire or fiber. One then looks 
for a difference in the horizontal accelerations of the 
two bodies as revealed by a slight rotation of the 
rod. The source of the horizontal gravitational force 
could be the Sun, a large mass in or near the 
laboratory, or, as Eötvös recognized, the Earth itself.
A measurement or limit on the fractional difference
in acceleration between two bodies yields a quantity
$\eta = 2|a_1 - a_2|/|a_1 + a_2|$, called the “Eötvös ratio.”
Eötvös' experiments showed that $\eta$ was smaller than
a few parts in $10^9$, and later classic experiments in
the 1960s and 1970s by Dicke and Braginsky
improved the bounds by several orders of magnitude.
Additional experiments were carried out
during the 1980s as part of a search for a putative
“fifth force” that was motivated in part by a
reanalysis of Eötvös' original data.
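
The definition of the Eötvös ratio can be made concrete with a short numerical sketch. The accelerations below are hypothetical numbers chosen only to give a difference of one part in $10^9$; they are not measurements from any of the experiments described.

```python
def eotvos_ratio(a1: float, a2: float) -> float:
    """Return eta = 2|a1 - a2| / |a1 + a2| for two measured accelerations."""
    return 2.0 * abs(a1 - a2) / abs(a1 + a2)


# Hypothetical accelerations (m/s^2) differing by one part in 10^9.
g = 9.8
a_first = g
a_second = g * (1.0 + 1.0e-9)

eta = eotvos_ratio(a_first, a_second)
print(f"eta = {eta:.3e}")
```

Because the ratio is normalized by the mean acceleration, it is dimensionless and symmetric in the two bodies, which is why bounds from different sources of attraction (Earth, Sun, galaxy) can be compared directly.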

The best limit on $\eta$ currently comes from experiments
carried out during the 1985-2000 period at
the University of Washington (called the “Eöt-Wash”
experiments), which used a sophisticated
torsion balance tray to compare the accelerations of
bodies of different composition toward the Earth, 
the Sun, and the galaxy. Another strong bound 
comes from ongoing laser ranging to reflectors 
deposited on the Moon during the Apollo program 
in the 1970s (lunar laser ranging, LLR), which 
routinely determines the Earth-Moon distance to 
millimeter accuracies. The data may be used to 
check the equality of acceleration of the Earth and 
Moon toward the Sun. The results from laboratory 
and LLR experiments are (Will 2001): 

$$\eta_{\text{Eöt-Wash}} < 4 \times 10^{-13}, \qquad \eta_{\text{LLR}} < 5 \times 10^{-13} \qquad [1]$$

LLR also shows that gravitational binding energy
falls with the same acceleration as ordinary matter to
$1.3 \times 10^{-3}$ (test of the Nordtvedt effect; see the section
“Bounds on the PPN parameters” and Table 1).

Many of the high-precision, low-noise methods that 
were developed for tests of WEP have been adapted 
to laboratory tests of the inverse-square law of 
Newtonian gravitation at millimeter scales and 
below. The goal of these experiments is to search for 
additional gravitational interactions involving massive 
particles or for the presence of large extra dimensions. 
The challenge of these experiments is to distinguish 
gravitation-like interactions from electromagnetic and 
quantum-mechanical effects. No deviations from
Newton's inverse-square law have been found to date
at distances between 10 μm and 10 mm.


Table 1 Current limits on the PPN parameters

| Parameter | Effect | Limit | Remarks |
|---|---|---|---|
| $\gamma - 1$ | (i) Shapiro delay | $2.3 \times 10^{-5}$ | Cassini tracking |
| | (ii) Light deflection | $4 \times 10^{-4}$ | VLBI |
| $\beta - 1$ | (i) Perihelion shift | $3 \times 10^{-3}$ | $J_2 = 10^{-7}$ from helioseismology |
| | (ii) Nordtvedt effect | $2.3 \times 10^{-4}$ | LLR plus bounds on other parameters |
| $\xi$ | Anisotropy in Newton's G | $10^{-3}$ | Gravimeter bounds on anomalous Earth tides |
| $\alpha_1$ | Orbit polarization for moving systems | $10^{-4}$ | Lunar laser ranging |
| $\alpha_2$ | Anomalous spin precession for moving bodies | $4 \times 10^{-7}$ | Alignment of solar axis relative to ecliptic |
| $\alpha_3$ | Anomalous self-acceleration for spinning moving bodies | $2 \times 10^{-20}$ | Pulsar spindown timing data |
| $\eta^a$ | Nordtvedt effect | $9 \times 10^{-4}$ | Lunar laser ranging |
| $\zeta_1$ | | $2 \times 10^{-2}$ | Combined PPN bounds |
| $\zeta_2$ | Anomalous self-acceleration for binary systems | $4 \times 10^{-5}$ | Timing data for PSR 1913+16 |
| $\zeta_3$ | Violation of Newton's third law | $10^{-8}$ | Lunar laser ranging |
| $\zeta_4$ | | | Not independent |

$^a$ Here $\eta = 4\beta - \gamma - 3 - 10\xi/3 - \alpha_1 + 2\alpha_2/3 - 2\zeta_1/3 - \zeta_2/3$.


Tests of Local Lorentz Invariance 


Although special relativity itself never benefited 
from the kind of “crucial” experiments, such as the 
perihelion advance of Mercury and the deflection of 
light, that contributed so much to the initial 
acceptance of GR and to the fame of Einstein, the 
steady accumulation of experimental support, 
together with the successful integration of special 
relativity into quantum mechanics, led to its being 
accepted by mainstream physicists by the late 1920s, 
ultimately to become part of the standard toolkit of 
every working physicist. 

But in recent years new experiments have placed 
very tight bounds on any violations of the Lorentz 
invariance, which underlies special relativity. A 
simple way of interpreting this new class of 
experiments is to suppose that a coupling of some 
external gravitation-like field (not the metric) to the 
electromagnetic interactions results in an effective
change in the speed of electromagnetic radiation, c,
relative to the limiting speed of material test
particles, $c_0$; in other words, $c \neq c_0$. It can be
shown that such a Lorentz-noninvariant electromag- 
netic interaction would cause shifts in the energy 
levels of atoms and nuclei that depend on the 
orientation of the quantization axis of the state 
relative to our velocity relative to the rest of the 
universe, and on the quantum numbers of the state, 
resulting in orientation dependences of the funda- 
mental frequencies of such atomic clocks. The 
magnitude of these “clock anisotropies” would be
proportional to $\delta = |(c_0/c)^2 - 1|$, which vanishes if
Lorentz invariance holds (see Will (1993) and 
Haugan and Will (1987) for details). 


The earliest clock anisotropy experiments were 
carried out around 1960 independently by Hughes 
and Drever, although their original motivation was 
somewhat different. Dramatic improvements were 
made in the 1980s using laser-cooled trapped atoms 
and ions. This technique made it possible to reduce 
the broadening of resonance lines caused by collisions,
leading to the impressive bound $|\delta| < 10^{-21}$
(Will 2001). 

Other recent tests of Lorentz invariance violation 
include comparisons of resonant cavities with 
atomic clocks, tests of dispersion and birefringence 
in the propagation of high-energy photons from 
astrophysical sources, threshold effects in elemen- 
tary particle collisions, and anomalies in neutrino 
oscillations. Mattingly (2005) gives a thorough and 
up-to-date review of both the theoretical frame- 
works for studying these effects and the experimen- 
tal results. 


Tests of Local Position Invariance 


LPI requires, among other things, that the internal 
binding energies of atoms and nuclei be indepen- 
dent of location in space and time, when measured 
against some standard atom. This means that a 
comparison of the rates of two different kinds of 
atomic clocks should be independent of location or 
epoch, and that the frequency shift between two 
identical clocks at different locations is simply a 
consequence of the apparent Doppler shift 
between a pair of inertial frames momentarily 
comoving with the clocks at the moments of 
emission and reception, respectively. The relevant 
parameter $\alpha$ appears in the formula for the
frequency shift,


$$\frac{\Delta f}{f} = (1 + \alpha)\frac{\Delta\Phi}{c^2} \qquad [2]$$

where $\Phi$ is the Newtonian gravitational potential. If
LPI holds, $\alpha = 0$. An early test of this was the
Pound-Rebka experiment of 1960, which measured 
the frequency shift of gamma rays from radioactive 
iron nuclei in a tower at Harvard University. The 
best bounds come from a 1976 experiment in which 
a hydrogen maser atomic clock was launched to 
10 000 km altitude on a Scout rocket and its
frequency compared via telemetry with an identical 
clock on the ground, and a 1993 experiment in 
which two different kinds of atomic clocks were 
intercompared as a function of the varying solar 
gravitational field as seen on Earth (a “null” redshift 
experiment). The results are (Will 2001): 


$$\alpha_{\rm Maser} < 2 \times 10^{-4}, \qquad \alpha_{\rm Null} < 10^{-3} \qquad [3]$$
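
To give a feeling for the size of the effect being tested, the sketch below evaluates the static gravitational part of equation [2] for a clock raised to 10 000 km altitude, assuming the sign convention $\Delta f/f = (1 + \alpha)\Delta\Phi/c^2$. The Earth's mean radius and gravitational parameter are standard values supplied here as assumptions, and the calculation ignores the velocity-dependent Doppler terms that a real rocket experiment must also model.

```python
GM_EARTH = 3.986004418e14   # m^3 s^-2, standard value (assumed, not from the text)
R_EARTH = 6.371e6           # m, mean Earth radius (assumed)
C = 299_792_458.0           # m/s, speed of light


def fractional_shift(altitude_m: float, alpha: float = 0.0) -> float:
    """Static part of Delta f / f = (1 + alpha) * Delta Phi / c^2.

    Delta Phi is the Newtonian potential difference between the ground
    and a clock at the given altitude above a spherical Earth.
    """
    delta_phi = GM_EARTH * (1.0 / R_EARTH - 1.0 / (R_EARTH + altitude_m))
    return (1.0 + alpha) * delta_phi / C**2


shift = fractional_shift(1.0e7)  # clock at 10 000 km altitude
print(f"fractional frequency shift = {shift:.3e}")
```

Under these assumptions the shift is a few parts in $10^{10}$, so a bound on $\alpha$ at the $10^{-4}$ level requires clock comparisons stable to roughly a part in $10^{14}$.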


Recent “clock comparison” tests of LPI include 
experiments done at the National Institute of 
Standards and Technology (NIST) in Boulder and 
at the Observatory of Paris, to look for cosmological 
variations in clock rates. The NIST experiment 
compared laser-cooled mercury ions with neutral 
cesium atoms over a two-year period, while the 
Paris experiment compared laser-cooled cesium and 
rubidium atomic fountains over five years; the 
results showed that the fine-structure constant is
constant in time to a part in $10^{15}$ per year. A better
bound of $6 \times 10^{-17}\ \mathrm{yr}^{-1}$ comes from analysis of
fission yields of the Oklo natural reactor, which 
occurred in Africa two billion years ago. 


Solar-System Tests 
The Parametrized Post-Newtonian Framework 


It was once customary to discuss experimental tests of 
GR in terms of the “three classical tests," the 
gravitational redshift (which is really a test of the 
EEP, not of GR itself; see the section on tests of LPI), 
the perihelion advance of Mercury (the first success of 
the theory), and the deflection of light (whose 
measurement in 1919 made Einstein a celebrity). 
However, the proliferation of additional experimental 
tests and of well-motivated alternative metric theories 
of gravity made it desirable to develop a more general 
theoretical framework for analyzing both experiments 
and theories. This “parametrized post-Newtonian 
(PPN) framework” dates back to Eddington in 1922, 
but was fully developed by Nordtvedt and Will in the 
period 1968-72 (see Will (1993) for details). 

When attention is confined to metric theories of 
gravity and, further, the focus is on the slow-motion, 
weak-field limit appropriate to the solar system and 
similar systems, it turns out that, in a broad class of 
metric theories, only the numerical values of a set of 


coefficients in the spacetime metric vary from theory to
theory. The resulting PPN framework contains ten
parameters: $\gamma$, related to the amount of spatial curvature
generated by mass; $\beta$, related to the degree of
nonlinearity in the gravitational field; $\xi$, $\alpha_1$, $\alpha_2$, and
$\alpha_3$, which determine whether the theory violates LPI
or LLI in gravitational experiments; and $\zeta_1$, $\zeta_2$, $\zeta_3$, and
$\zeta_4$, which describe whether the theory has appropriate
momentum conservation laws. In GR, $\gamma = 1$, $\beta = 1$,
and the remaining parameters all vanish. In the scalar-tensor
theory of Brans-Dicke, $\gamma = (1 + \omega_{\rm BD})/(2 + \omega_{\rm BD})$,
where $\omega_{\rm BD}$ is an adjustable parameter.

A number of well-known relativistic effects can be 
expressed in terms of these PPN parameters: 


Deflection of light 


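$$\begin{aligned}
\delta\theta &= \left(\frac{1+\gamma}{2}\right)\frac{4GM}{dc^2} \\
&= \left(\frac{1+\gamma}{2}\right) \times 1.7505\,\frac{R_\odot}{d}\ \text{arcsec} \qquad [4]
\end{aligned}$$


where d is the distance of closest approach of a ray
of light to a body of mass M, and where the second
line is the deflection by the Sun, with radius $R_\odot$.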
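
The numerical coefficient for the Sun can be recovered directly from the first expression, $\left(\frac{1+\gamma}{2}\right)4GM/(dc^2)$, evaluated for a ray grazing the solar limb. The sketch below does this; the solar gravitational parameter and the nominal solar radius are standard values assumed for illustration, and the function name is hypothetical.

```python
import math

GM_SUN = 1.32712440018e20   # m^3 s^-2, standard value (assumed)
R_SUN = 6.957e8             # m, nominal solar radius (assumed)
C = 299_792_458.0           # m/s, speed of light
RAD_TO_ARCSEC = 180.0 * 3600.0 / math.pi


def deflection_arcsec(d_m: float, gamma: float = 1.0) -> float:
    """Deflection ((1 + gamma)/2) * 4GM/(d c^2) by the Sun, in arcseconds."""
    angle_rad = 0.5 * (1.0 + gamma) * 4.0 * GM_SUN / (d_m * C**2)
    return angle_rad * RAD_TO_ARCSEC


limb = deflection_arcsec(R_SUN)              # GR value, gamma = 1
half = deflection_arcsec(R_SUN, gamma=0.0)   # what gamma = 0 would give
print(f"GR: {limb:.4f} arcsec, gamma = 0: {half:.4f} arcsec")
```

The result is close to 1.75 arcsec, with the last digits depending on the adopted solar radius. Setting $\gamma = 0$ halves the deflection, which is why this measurement constrains $\gamma$ so directly.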


Shapiro time delay 


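$$\Delta t = \left(\frac{1+\gamma}{2}\right)\frac{4GM}{c^3}\,\ln\left[\frac{(r_1 + \mathbf{x}_1 \cdot \mathbf{n})(r_2 - \mathbf{x}_2 \cdot \mathbf{n})}{d^2}\right] \qquad [5]$$

where $\Delta t$ is the excess travel time of a round-trip
electromagnetic tracking signal, $\mathbf{x}_1$ and $\mathbf{x}_2$ are the
locations relative to the body of mass M of the
emitter and receiver of the round-trip signal ($r_1$ and
$r_2$ are the respective distances), and $\mathbf{n}$ is the direction
of the outgoing tracking signal.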


Perihelion advance 


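$$\begin{aligned}
\frac{d\omega}{dt} &= \left(\frac{2+2\gamma-\beta}{3}\right)\frac{6\pi GM}{Pa(1-e^2)c^2} \\
&= \left(\frac{2+2\gamma-\beta}{3}\right) \times 42.98\ \text{arcsec}/100\,\text{yr} \qquad [6]
\end{aligned}$$


where P, a, and e are the period, semimajor axis,
and eccentricity of the planet’s orbit, respectively;
the second line is the value for Mercury.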
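
The value quoted for Mercury can be checked with a few lines of arithmetic, assuming the general-relativistic form of the rate, $\left(\frac{2+2\gamma-\beta}{3}\right)6\pi GM/[Pa(1-e^2)c^2]$. The orbital elements used below (semimajor axis, eccentricity, and period of Mercury) are rounded textbook values supplied as assumptions; they are not given in this article.

```python
import math

GM_SUN = 1.32712440018e20   # m^3 s^-2, standard value (assumed)
C = 299_792_458.0           # m/s, speed of light
RAD_TO_ARCSEC = 180.0 * 3600.0 / math.pi

# Rounded orbital elements of Mercury (assumed textbook values).
A_MERCURY = 5.7909e10       # semimajor axis, m
E_MERCURY = 0.2056          # eccentricity
P_MERCURY_DAYS = 87.969     # orbital period, days


def perihelion_advance(gamma: float = 1.0, beta: float = 1.0) -> float:
    """Perihelion advance of Mercury in arcseconds per Julian century."""
    ppn_factor = (2.0 + 2.0 * gamma - beta) / 3.0
    per_orbit_rad = 6.0 * math.pi * GM_SUN / (
        A_MERCURY * (1.0 - E_MERCURY**2) * C**2
    )
    orbits_per_century = 36525.0 / P_MERCURY_DAYS
    return ppn_factor * per_orbit_rad * orbits_per_century * RAD_TO_ARCSEC


advance = perihelion_advance()
print(f"{advance:.2f} arcsec per century")
```

With $\gamma = \beta = 1$ the prefactor is unity and the result lands close to 43 arcsec per century. Any other combination rescales it linearly, which is how the observed advance bounds $2\gamma - \beta$.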


Nordtvedt effect 


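$$\begin{aligned}
\frac{m_G - m_I}{m_I} &= \left(4\beta - \gamma - 3 - \frac{10}{3}\xi - \alpha_1 + \frac{2}{3}\alpha_2 - \frac{2}{3}\zeta_1 - \frac{1}{3}\zeta_2\right)\frac{E_g}{m_I c^2} \\
&= \eta\,\frac{E_g}{m_I c^2} \qquad [7]
\end{aligned}$$


where $m_G$ and $m_I$ are, respectively, the gravitational
and inertial masses of a body such as the Earth or
Moon, and $E_g$ is its gravitational binding energy. A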
nonzero Nordtvedt effect would cause the Earth and 
Moon to fall with a different acceleration toward 
the Sun. In GR, this effect vanishes. 


Precession of a gyroscope 


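$$\begin{aligned}
\frac{d\mathbf{S}}{dt} &= (\boldsymbol{\Omega}_{\rm FD} + \boldsymbol{\Omega}_{\rm Geo}) \times \mathbf{S} \\
\boldsymbol{\Omega}_{\rm FD} &= -\frac{1}{2}\left(1 + \gamma + \frac{\alpha_1}{4}\right)\frac{G}{r^3c^2}\,(\mathbf{J} - 3\mathbf{n}\,\mathbf{n}\cdot\mathbf{J}) \\
&= \frac{1}{2}\left(1 + \gamma + \frac{\alpha_1}{4}\right) \times 0.041\ \text{arcsec yr}^{-1} \\
\boldsymbol{\Omega}_{\rm Geo} &= -\frac{1}{2}(1 + 2\gamma)\,\mathbf{v} \times \frac{GM\mathbf{n}}{r^2c^2} \\
&= \frac{1}{3}(1 + 2\gamma) \times 6.6\ \text{arcsec yr}^{-1} \qquad [8]
\end{aligned}$$


where S is the spin of the gyroscope, and $\boldsymbol{\Omega}_{\rm FD}$ and
$\boldsymbol{\Omega}_{\rm Geo}$ are, respectively, the precession angular velocities
caused by the dragging of inertial frames
(Lense-Thirring effect) and by the geodetic effect, a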
combination of Thomas precession and precession 
induced by spatial curvature; J is the angular 
momentum of the Earth, and v, n, and r are, 
respectively, the velocity, direction, and distance of 
the gyroscope. The second line in each case is the 
corresponding value for a gyroscope in polar Earth 
orbit at about 650 km altitude (Gravity Probe B). 
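
The geodetic figure can be reproduced approximately from the magnitude of the geodetic term, taken here to be $\frac{1}{2}(1 + 2\gamma)\,GMv/(r^2c^2)$ with v perpendicular to n. The sketch assumes a circular orbit at 650 km altitude, so that $v = \sqrt{GM/r}$, and uses a mean Earth radius and gravitational parameter that are standard values rather than numbers from this article. The frame-dragging term is not evaluated, since it requires an average over the orbit.

```python
import math

GM_EARTH = 3.986004418e14   # m^3 s^-2, standard value (assumed)
R_EARTH = 6.371e6           # m, mean Earth radius (assumed)
C = 299_792_458.0           # m/s, speed of light
SECONDS_PER_YEAR = 365.25 * 86400.0
RAD_TO_ARCSEC = 180.0 * 3600.0 / math.pi


def geodetic_rate(altitude_m: float, gamma: float = 1.0) -> float:
    """Geodetic precession rate in arcsec/yr for a circular orbit.

    Uses |Omega_Geo| = (1/2)(1 + 2*gamma) * G M v / (r^2 c^2) with the
    circular-orbit speed v = sqrt(GM / r) (an assumption of this sketch).
    """
    r = R_EARTH + altitude_m
    v = math.sqrt(GM_EARTH / r)
    omega = 0.5 * (1.0 + 2.0 * gamma) * GM_EARTH * v / (r**2 * C**2)
    return omega * SECONDS_PER_YEAR * RAD_TO_ARCSEC


rate = geodetic_rate(650.0e3)
print(f"geodetic precession = {rate:.2f} arcsec/yr")
```

The output is close to 6.6 arcsec per year for $\gamma = 1$, and it scales as $(1 + 2\gamma)/3$ for other values of $\gamma$.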


Bounds on the PPN Parameters 


Four decades of high-precision experiments, ranging 
from the standard light-deflection and perihelion- 
shift tests, to LLR, planetary and satellite tracking 
tests of the Shapiro time delay, and geophysical and 
astronomical observations, have placed bounds on 
the PPN parameters that are consistent with GR. 
The current bounds are summarized in Table 1 (Will 
2001). 

To illustrate the dramatic progress of experimen- 
tal gravity since the dawn of Einstein's theory, 
Figure 1 shows a history of results for $(1 + \gamma)/2$,
from the 1919 solar eclipse measurements of 
Eddington and his colleagues (which made Einstein 
a celebrity), to modern-day measurements using very 
long baseline radio interferometry (VLBI), advanced 
radar tracking of spacecraft, and the astrometry 
satellite Hipparcos. The most recent results include a 
2003 measurement of the Shapiro delay, performed 
by tracking the “Cassini” spacecraft on its way 
to Saturn, and a 2004 measurement of the bending 
of light via analysis of VLBI data on 541 quasars 
and compact radio galaxies distributed over the 
entire sky.

Figure 1 Measurements of the coefficient $(1 + \gamma)/2$ from
observations of the deflection of light and of the Shapiro delay
in propagation of radio signals near the Sun. The GR prediction
is unity. “Optical” denotes measurements of stellar deflection
made during solar eclipse, and “Radio” denotes interferometric
measurements of radio-wave deflection. “Hipparcos” denotes the
European optical astrometry satellite. Arrows denote values well
off the chart from one of the 1919 eclipse expeditions and from
others through 1947. Shapiro delay measurements using the
Cassini spacecraft on its way to Saturn yielded tests at the
0.001% level, and light deflection measurements using VLBI
have reached 0.02%.


The perihelion advance of Mercury, the first of 
Einstein’s successes, is now known to agree with 
observation to a few parts in $10^3$. During the 1960s
there was controversy about this test when reports of an 
excess solar oblateness implied an unacceptably large 
Newtonian contribution to the perihelion advance. 
However, it is now known from helioseismology, the 
study of short-period vibrations of the Sun, that the 
oblateness is of the order of a part in $10^7$, as expected
from standard solar models, much too small to affect 
Mercury's orbit, within the observational errors. 


Gravity Probe B 


The NASA Relativity Mission called Gravity Probe B 
(GPB) recently completed its mission to measure the 
Lense-Thirring and geodetic precessions of gyroscopes 
in Earth's orbit. Launched on 20 April 2004 for a 
16-month mission, it consisted of four spherical rotors 
coated with a thin layer of superconducting niobium, 
spinning at 70-100 Hz, in a spacecraft filled with liquid
helium, containing a telescope continuously pointed
toward a distant guide star (IM Pegasi). Superconducting
current loops encircling each rotor were designed to
measure the change in direction of the rotors by
detecting the change in magnetic flux through the loop
generated by the London magnetic moment of the
spinning superconducting film. The spacecraft was in a
polar orbit at 650 km altitude. The primary science goal
of GPB was a 1% measurement of the 41 marcsec yr$^{-1}$
frame dragging or Lense-Thirring effect caused by the
rotation of the Earth; its secondary goal was to measure
to six parts in $10^5$ the larger 6.6 arcsec yr$^{-1}$ geodetic
precession caused by space curvature.


The Binary Pulsar 


The binary pulsar PSR 1913 + 16, discovered in 1974, 
provided important new tests of GR. The pulsar, with 
a pulse period of 59 ms, was observed to be in orbit 
about an unseen companion (now generally thought to 
be a dead pulsar), with a period of ~8h. Through 
precise timing of apparent variations in the pulsar 
“clock” caused by the Doppler effect, the important 
orbital parameters of the system could be measured 
with exquisite precision. These included nonrelativistic 
“Keplerian” parameters, such as the eccentricity e and
the orbital period (at a chosen epoch) $P_b$, as well as a
set of relativistic “post-Keplerian” (PK) parameters.
The first PK parameter, $\langle\dot\omega\rangle$, is the mean rate of
advance of periastron, the analog of Mercury's
perihelion shift. The second, denoted $\gamma'$, is the effect
of special relativistic time dilation and the gravitational
redshift on the observed phase or arrival time of
pulses, resulting from the pulsar's orbital motion and
the gravitational potential of its companion. The third,
$\dot P_b$, is the rate of decrease of the orbital period; this is
taken to be the result of gravitational radiation
damping (apart from a small correction due to the
acceleration of the system in our rotating galaxy). Two
other parameters, s and r, are related to the Shapiro
time delay of the pulsar signal if the orbital inclination
is such that the signal passes in the vicinity of the
companion; s is a direct measure of the orbital
inclination sin i. According to GR, the first three PK
effects depend only on e and $P_b$, which are known, and
on the two stellar masses, which are unknown. By
combining the observations of PSR 1913+16 (see
Table 2) with the GR predictions, one obtains both a
measurement of the two masses and a test of GR, since
the system is overdetermined. The results are


$$m_1 = 1.4414 \pm 0.0002\,M_\odot, \qquad m_2 = 1.3867 \pm 0.0002\,M_\odot$$

$$\dot P_b^{\rm GR}/\dot P_b^{\rm OBS} = 1.0013 \pm 0.0021 \qquad [9]$$
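
As an illustration of how the masses and the post-Keplerian parameters are tied together, the sketch below evaluates the standard GR expression for the periastron advance, $\langle\dot\omega\rangle = 3\,(2\pi/P_b)^{5/3}\,(GM/c^3)^{2/3}/(1 - e^2)$, where M is the total mass. This formula is not written out in the article and is supplied here as an assumption, as is the value of $GM_\odot/c^3$. The inputs are the masses of equation [9] and the Keplerian parameters of PSR 1913+16 from Table 2.

```python
import math

T_SUN = 4.925490947e-6       # G * M_sun / c^3 in seconds (standard value, assumed)
SECONDS_PER_DAY = 86400.0
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY


def periastron_advance_deg_per_yr(
    m1_msun: float, m2_msun: float, pb_days: float, ecc: float
) -> float:
    """GR periastron advance in degrees per year for a binary system.

    Implements 3 * (2*pi/P_b)^(5/3) * (G M / c^3)^(2/3) / (1 - e^2),
    with M the total mass; the formula itself is an assumption here.
    """
    n = 2.0 * math.pi / (pb_days * SECONDS_PER_DAY)   # orbital angular frequency
    total_mass_s = (m1_msun + m2_msun) * T_SUN        # G M / c^3 in seconds
    rate_rad_s = 3.0 * n ** (5.0 / 3.0) * total_mass_s ** (2.0 / 3.0) / (1.0 - ecc**2)
    return math.degrees(rate_rad_s * SECONDS_PER_YEAR)


# Masses from equation [9]; P_b and e for PSR 1913+16 from Table 2.
omega_dot = periastron_advance_deg_per_yr(1.4414, 1.3867, 0.322997448930, 0.6171338)
print(f"periastron advance = {omega_dot:.4f} deg/yr")
```

The computed rate agrees with the tabulated periastron advance of about 4.2266° per year to within the rounding of the input masses. In practice the logic runs the other way: two measured PK parameters fix the two masses, and the third then provides the test.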


Other relativistic binary pulsars may provide even 
more stringent tests. These include the relativistic 
neutron star/white dwarf binary pulsar J1141-6545, 
with a 0.19 day orbital period, which may ultimately 
lead to a very strong bound on the phenomenon of 
dipole gravitational radiation, predicted by many 
alternative theories of gravity, but not by GR; and 
the remarkable “double pulsar” J0737-3039, a
binary system with two detected pulsars, in a
0.10 day orbit seen almost edge on and a periastron
advance of 17° per year. For further discussion of
binary pulsar tests, see Stairs (2003). 


Gravitational-Wave Tests 


The detection of gravitational radiation by either 
laser interferometers or resonant cryogenic bars will 
usher in a new era of gravitational-wave astronomy 
(Barish and Weiss 1999). Furthermore, it will yield 
new and interesting tests of GR in its radiative 
regime (Will 1999). 

GR predicts that gravitational waves possess only two 
polarization modes independently of the source; they are 
transverse to the direction of propagation and quad- 
rupolar in their effect on a detector. Other theories of 
gravity may predict up to four additional modes of 
polarization. A suitable array of gravitational antennas 
could delineate or limit the number of modes present in a 
given wave. If distinct evidence were found of any mode 
other than the two transverse quadrupolar modes of 
GR, the result would be disastrous for the theory. 


Table 2 Parameters of the binary pulsars PSR 1913+16 and J0737-3039

| Parameter | Symbol | Value$^a$ in PSR 1913+16 | Value$^a$ in J0737-3039 |
|---|---|---|---|
| Keplerian parameters | | | |
| Eccentricity | e | 0.6171338(4) | 0.087779(5) |
| Orbital period | $P_b$ (day) | 0.322997448930(4) | 0.102251563(1) |
| Post-Keplerian parameters | | | |
| Periastron advance | $\langle\dot\omega\rangle$ (° yr$^{-1}$) | 4.226595(5) | 16.90(1) |
| Redshift/time dilation | $\gamma'$ (ms) | 4.2919(8) | 0.382(5) |
| Orbital period derivative | $\dot P_b$ ($10^{-12}$) | −2.4184(9) | |
| Shapiro delay (sin i) | s | | 0.9995(4) |

$^a$ Numbers in parentheses denote errors in last digit.


According to GR, gravitational waves propagate with 
the same speed, c, as light. In other theories, the speed 
could differ from c because of coupling of gravitation to 
“background” gravitational fields, or propagation of the 
waves into additional spatial dimensions. Another way 
in which the speed of gravitational waves could differ 
from c is if gravitation were propagated by a massive 
field (a massive graviton), in which case $v_g$ would be
given by, in a local inertial frame,


$$\frac{v_g^2}{c^2} = 1 - \frac{m_g^2 c^4}{E^2} = 1 - \frac{c^2}{f^2 \lambda_g^2} \qquad [10]$$


where $m_g$, E, and f are the graviton rest mass,
energy, and frequency, respectively, and $\lambda_g = h/m_g c$
is the graviton Compton wavelength (it is assumed
that $\lambda_g \gg c/f$).

The most obvious way to measure the speed of 
gravitational waves is to compare the arrival times of a 
gravitational wave and an electromagnetic wave from 
the same event (e.g., a supernova). For a source at a 
distance of 600 million light years (a typical distance 
for the currently operational detectors), and a differ- 
ence in times on the order of seconds, the bound on the 
difference $|1 - v_g/c|$ could be as small as a part in
$10^{17}$. It is worth noting that a 2002 report that the
speed of gravity had been measured by studying light 
from a quasar as it propagated past Jupiter was 
fundamentally flawed. That particular measurement 
was not sensitive to the speed of gravity. 
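
The order of magnitude of the arrival-time bound follows from a one-line estimate: if two signals travel for a time D/c and arrive within Δt of each other, their speeds can differ fractionally by at most about Δt divided by the travel time. The sketch below evaluates this for the distance quoted in the text, taking "on the order of seconds" to mean one second and defining the light year with the Julian year. Both choices are assumptions made for illustration, and the function name is hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 86400.0   # Julian year, which defines the light year


def speed_difference_bound(distance_ly: float, arrival_gap_s: float) -> float:
    """Approximate bound on |1 - v_g/c| from a gap in arrival times.

    Light needs distance_ly years to cover the distance, so the fractional
    speed difference is roughly arrival_gap / travel_time.
    """
    travel_time_s = distance_ly * SECONDS_PER_YEAR
    return arrival_gap_s / travel_time_s


bound = speed_difference_bound(6.0e8, 1.0)   # 600 million light years, 1 s gap
print(f"|1 - v_g/c| < {bound:.1e}")
```

This gives a few parts in $10^{17}$, consistent with the figure quoted above. The bound weakens in direct proportion to any uncertainty in the emission-time difference between the two signals.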


Conclusions 


The past four decades have witnessed a systematic, 
high-precision experimental verification of Einstein’s 
theories. Relativity has passed every test with flying 
colors. A central theme of future work will be to test 
strong-field gravity in the vicinity of black holes and 
neutron stars, and to see how well GR works on 
cosmological scales. Gamma-ray, X-ray, microwave, 
infrared, neutrino, and gravitational-wave astronomy 
will all play a critical role in probing these largely 
unexplored aspects of GR. 

GR is now the “standard model” of gravity. But, 
as in particle physics, there may be a world beyond 
the standard model. Quantum gravity, strings, and 
branes may lead to testable effects beyond Einstein's 
GR. Searches for such effects using laboratory 
experiments, particle accelerators, space instrumentation, 
and cosmological observations are likely to 
continue for some time to come. 


See also: Cosmology: Mathematical Aspects; Einstein 
Equations: Exact Solutions; General Relativity: Overview; 
Geometric Flows and the Penrose Inequality; 
Gravitational Lensing; Gravitational Waves; Standard 
Model of Particle Physics. 


The Principle of Equivalence 


The special theory of relativity is founded on two 
basic principles: that the laws of physics should be 
independent of the uniform motion of an inertial 
frame of reference, and that the speed of light 
should have the same constant value in any such 
frame. In the years between 1905 and 1915, 
Einstein pondered deeply on what was, to him, a 
profound enigma, which was the issue of why these 
laws retain their proper form only in the case of an 
inertial frame. In special relativity, as had been the 
case in the earlier dynamics of Galilei-Newton, the 
laws indeed retain their basic form only when the 
reference frame is unaccelerated (which includes it 
being nonrotating). It demonstrated a particular 
prescience on the part of Einstein that he should 
have demanded the seemingly impossible requirement 
that the very same dynamical laws should 
hold also in an accelerating (or even rotating) 
reference frame. The key realization came to him 
late in 1907, when sitting in his chair in the Bern 
patent office he had the “happiest thought” in his life, 
namely that if a person were to fall freely in a 
gravitational field, then he would not notice that field 
at all while falling. The physical point at issue is 
Galileo’s early insight (itself having roots even earlier 
from Simon Stevin in 1586 or Ioannes Philiponos in 
the fifth or sixth century) that the acceleration 
induced by gravity is independent of the body upon 
which it acts. Accordingly, if two neighboring bodies 
are accelerated together in the same gravitational 
field, then the motion of one body, in the (nonrotat- 
ing) reference frame of the other, will be as though 
there were no gravitational field at all. To put this 
another way, the effect of a gravitational force is just 
like that of an accelerating reference system, and can 
be eliminated by free fall. This is now known as the 
“principle of equivalence.” 

It should be made clear that this is a particular 
feature of only the gravitational field. From the 
perspective of Newtonian dynamics, it is a conse- 
quence of the seemingly accidental fact that the 
concept of (passive) “mass” m that features in 
Newton’s law of gravitational attraction, where the 
attractive force due to the gravitational field of 
another body, of mass M, has the form 


$$\frac{GmM}{r^2}$$


is the same as — or, at least, proportional to — the 
inertial mass m of the body which is being acted 
upon. Thus, the impedance to acceleration of a body 
and the strength of the attractive force on that body 
are, in the case of gravity (and only in the case of 
gravity), in proportion to one another, so that the 
acceleration of a body in a gravitational field is 
independent of its mass (or, indeed, of any other 
localized magnitude) possessed by it. (The fact that 
the active gravitational mass, here given by the 
quantity M, is also in proportion to its own passive 
gravitational mass — from Newton's third law — may 
be regarded as a feature of the general Lagrangian/ 
Hamiltonian framework of physics. But see Bondi 
(1957).) Other forces of nature do not have this 
property. For example, the electrostatic force on a 
charged body, by an electric field, acts in proportion to 
the electric charge on that body, whereas, the 
impedance to acceleration is still the inertial mass of 
that body, so the acceleration induced depends on the 
charge-to-mass ratio. Accordingly, it is the gravita- 
tional field alone which is equivalent to an acceleration. 

Einstein's fundamental idea, therefore, was to 
take the view that the “relativity principle" could 
as well be applied to accelerating reference frames as 
to inertial ones, where the same physical laws would 
apply in each, but where now the perceived 


gravitational field would be different in the two 
frames. In accordance with this perspective, Einstein 
found it necessary to adopt a different viewpoint 
from the Newtonian one, both with regard to the 
notion of "gravitational force" and to the very 
notion of an “inertial frame." According to the 
Newtonian perspective, it would be appropriate to 
describe the action of the Earth's gravitational field, 
near some specific place on the Earth's surface, in 
terms of a “Newtonian inertial frame” in which the 
Earth is “fixed” (here we ignore the Earth's rotation 
and the Earth's motion about the Sun), and we 
consider that there is a constant gravitational field 
of force (directed towards the Earth's center). But 
the Einsteinian perspective is to regard that frame as 
noninertial where, instead, it would be a frame 
which falls freely in the Earth's (Newtonian) 
gravitational field that would be regarded as a 
suitable “Einsteinian inertial frame.” Generally, to 
be inertial in Einstein's sense, the frame would refer 
to free fall under gravity, so that the Newtonian 
field of gravitational force would appear to have 
disappeared — in accordance with his “happiest 
thought" that Einstein had had in the Bern patent 
office. We see that the concept of a gravitational 
field must also be changed in the passage from 
Newton's to Einstein's viewpoint. For in Newton's 
picture we indeed have a "gravitational force" 
directed towards the ground with a magnitude of 
gm, where m is the mass of the body being acted 
upon and g is the “acceleration due to gravity” at 
the Earth's surface, whereas in Einstein's picture we 
have specifically eliminated this "gravitational 
force" by the choice of “Einsteinian inertial frame." 

It might at first seem puzzling that the gravitational 
field has appeared to have been removed altogether by 
this device, and it is natural to wonder how gravita- 
tional effects can have any physical role to play at all 
from this point of view! However, this would be to go 
too far, as the Newtonian gravitational field may vary 
from place to place - as it does, indeed, in the case of 
the Earth's field, since it is directed towards the Earth's 
center, which is a different spatial direction at different 
places on the Earth's surface. Our considerations up to 
this point really refer only to a small neighborhood of a 
point. One might well take the view that a “frame” 
ought really to describe things also at widely separated 
places at once, and the considerations of the para- 
graphs above do not really take this into consideration. 


The Tidal Effect 


To proceed further, it will be helpful to consider an 
astronaut A in free fall, high above the Earth's 
surface. Let us first adopt a Newtonian perspective. 


We shall be concerned only with the instantaneous 
accelerations due to gravity in the neighborhood of 
A, so it will be immaterial whether we regard the 
astronaut as falling to the ground or - more 
comfortably! — in orbit about the Earth. 

Let us imagine that the astronaut is initially 
surrounded, nearby, by a sphere of particles, with A 
at the centre, which are taken to be initially at rest 
with respect to A (see Figure 1). To a first 
approximation, all the particles will share the same 
acceleration as the astronaut, so they will seem to the 
astronaut to hover motionless all around. But now let 
us be a little more precise about the accelerations. 
Those particles which are initially located in a vertical 
line from A, that is, either directly below A, at B, or 
directly above A, at T, will have, like A, an 
acceleration which is in the direction AO, where O 
is the Earth's center. But for the bottom point B, the 
acceleration will be slightly greater than that at A, and 
for the top point T, the acceleration will be slightly 
less than the acceleration at A, because of the slightly 
differing distances from O. Thus, relative to A, both 
will initially accelerate away from A. With regard to 
particles in the sphere which are initially in a circle in 
the horizontal plane through A, the direction to O 
will now be somewhat inwards, so that the particles 
at these points $H_i$ will accelerate, relative to A, 
slightly inwards. Accordingly, the entire sphere of 
particles will begin to get distorted into a prolate 
spheroid (elongated ellipsoid of revolution). This is 
referred to as the tidal distortion, for the good reason 
that it is precisely the same physical effect which is 
responsible for the tides in the Earth’s oceans, where 
for this illustration we are to think of the Earth’s 
center as being at A, the Moon (or Sun) to be situated 
at O, and the sphere of particles to represent the 
surface of the water of the Earth’s oceans. 

Figure 1 (a) Tidal effect. The astronaut A surrounded by a 
sphere of nearby particles initially at rest with respect to A. In 
Newtonian terms, they have an acceleration towards the Earth's 
center E, varying slightly in direction and magnitude (single- 
shafted arrows). By subtracting A's acceleration from each, we 
obtain the accelerations relative to A (double-shafted arrows); 
this relative acceleration is slightly inward for those particles 
displaced horizontally from A, but slightly outward for those 
displaced vertically from A. Accordingly, the sphere becomes 
distorted into a (prolate) ellipsoid of revolution, with symmetry 
axis in the direction AE. The initial distortion preserves volume. 
(b) Now move A to the Earth's center E and the sphere of 
particles to surround E just above the atmosphere. The 
acceleration (relative to A = E) is inward all around the sphere, 
with an initial volume reduction acceleration $4\pi GM$, where M is 
the total mass surrounded. Reproduced with permission from 
Penrose R (2004) The Road to Reality: A Complete Guide to the 
Laws of the Universe. London: Jonathan Cape. 

It is not hard to calculate (reverting, now, to our 
original picture) that, as a reflection of Newton’s 
inverse-square law of gravitational attraction, the 
amount of (small) outward vertical displacement 
from A (at B and T) will be twice the inward 
horizontal displacement (over the circle of points 
$H_i$); accordingly, the sphere will initially be distorted 
into an ellipsoid of the same volume. This depends 
upon there being no gravitating matter inside the 
sphere. The presence of such matter would con- 
tribute a volume-reducing effect in proportion to the 
total mass surrounded. (An extreme case illustrating 
this would occur if we take our sphere of particles to 
surround the entire Earth, where the volume- 
reducing effect would be manifest in the accelera- 
tions towards the ground at all points of the 
surrounding sphere.) 
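
The two-to-one ratio between the vertical stretching and the horizontal squeezing can be checked numerically. The following is a minimal sketch, assuming a point mass at O, units in which GM = 1, and a sphere of particles much smaller than the distance AO; the function names are illustrative only.

```python
import math


def gravity(position, gm=1.0):
    """Newtonian acceleration at `position` due to a point mass at the origin O."""
    r = math.sqrt(sum(c * c for c in position))
    return [-gm * c / r**3 for c in position]


def relative_acceleration(center, offset, gm=1.0):
    """Acceleration of a particle at center + offset, relative to the one at center."""
    a_center = gravity(center, gm)
    a_particle = gravity([c + d for c, d in zip(center, offset)], gm)
    return [p - q for p, q in zip(a_particle, a_center)]


R = 10.0     # distance of the astronaut A from O (arbitrary units)
eps = 1e-4   # radius of the small sphere of particles around A
A = [0.0, 0.0, R]  # the "vertical" direction is the z-axis

top = relative_acceleration(A, [0.0, 0.0, eps])   # particle T, directly above A
side = relative_acceleration(A, [eps, 0.0, 0.0])  # a particle displaced horizontally

outward_vertical = top[2] / eps      # close to +2 GM / R^3
inward_horizontal = -side[0] / eps   # close to +GM / R^3

ratio = outward_vertical / inward_horizontal
# One stretched direction and two squeezed ones: the sum vanishes to first
# order, which is the statement that the initial distortion preserves volume.
volume_rate = outward_vertical - 2.0 * inward_horizontal
print(ratio, volume_rate)
```

The printed ratio is very close to 2 and the volume term is very close to zero, as expected when no gravitating matter lies inside the sphere.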


Gravity as Curved Spacetime 


Figure 2 Spacetime versions of Figure 1 in terms of the 
relative distortion of neighboring geodesics. (a) Geodesic 
deviation in empty space (basically Weyl curvature) as seen in 
the world lines of A and surrounding particles (one spatial 
dimension suppressed), as might be induced from the gravitational 
field of a nearby body E. (b) The corresponding inward 
acceleration (basically Ricci curvature) due to the mass density 
within the bundle of geodesics. Reproduced with permission 
from Penrose R (2004) The Road to Reality: A Complete Guide 
to the Laws of the Universe. London: Jonathan Cape. 

It is appropriate to take a spacetime view of these 
phenomena (Figure 2). The distortions that we have 
been considering are, in fact, direct manifestations 
of spacetime curvature, according to Einstein’s 
viewpoint. We are to think of the world line of a 
particle, falling freely under gravity (Einsteinian 
inertial motion), as described as some kind of 
geodesic in spacetime. We shall be coming to this 
more completely shortly, but for the moment it will 
be helpful to picture the behavior of geodesics 
within an ordinary curved 2-surface S (Figure 3). 
If S has positive (Gaussian) curvature, then there 
will be a tendency for geodesics on S to bend 
towards each other, so that a pair of infinitesimally 
separated geodesics which are initially parallel will 
begin to get closer together as we move along them; 
if S has negative (Gaussian) curvature, then there 
will be a corresponding tendency for geodesics on S 
to bend away from each other. This is what happens 
in two dimensions, where the intrinsic curvature at a 
point is given by a single number. However, we are 
now concerned with a four-dimensional space, 
where the notion of curvature requires many more 
components. We see in Figure 2 that we are indeed 
to expect mixtures of convergence and divergence of 
geodesics, which suggests that there are both 
positive and negative curvature components 
involved, the positive curvature being in the hor- 
izontally displaced directions from A and the 
negative curvature in the vertically displaced direc- 
tions. In a curved space of dimension 4, as is the 
case for a curved spacetime, we can expect 20 
independent components of curvature at each point 
altogether. In the present situation, the others would 
be called into play when differing velocities of A are 
considered. 

Figure 3 Geodesic deviation when M is a 2-surface (a) of 
positive (Gaussian) curvature, when the geodesics $\gamma$, $\gamma'$ bend 
towards each other, and (b) of negative curvature, when they 
bend apart. Reproduced with permission from Penrose R (2004) 
The Road to Reality: A Complete Guide to the Laws of the 
Universe. London: Jonathan Cape. 

Let us see how we are to accommodate the above 
considerations within the standard framework of 
differential geometry. So far, we have not really 
deviated from Newtonian theory, even though we 
have been considering “geodesics” in a four-dimensional 
spacetime. In fact, it is perfectly legitimate to 
view Newtonian theory in this way (see Newtonian 
Limit of General Relativity), although the 4-geometry 
description is somewhat more complicated than one 
might wish. This is due to the fact that the infinite 
speed at which gravitation is taken to act in 
Newtonian theory demands that the “metric” of 
Newtonian spacetime is degenerate. (In effect, one 
would have a degenerate “dual metric” $G^{ab}$ of 
matrix rank 3, which plays a role in defining spatial 
displacements and a very degenerate “metric” $G_{ab}$, of 
matrix rank 1, which defines temporal differences, 
where $G^{ab} G_{bc} = 0$; see Newtonian Limit of General 
Relativity.) Accordingly, there is no unique notion of 
“geodesic” defined by the metric in Newtonian 
theory. 

It is striking that although the insights provided 
by the principle of equivalence are to some 
considerable extent independent of special relativity 
(since we see from the paragraphs prior to the 
preceding one that a curved-spacetime-geometry 
view of gravity is natural in the light of the 
equivalence principle alone), it is the nondegenerate 
metric $g_{ab}$ (and its inverse $g^{ab}$) that special relativity 
gives us locally, which leads to an elegant spacetime 
theory of gravity. Although the metric $g_{ab}$ 
is Lorentzian (with preferred choice of signature 
$+ - - -$ here) rather than positive definite, so that 
the spacetime is not strictly a Riemannian one, the 
change of signature makes little difference to 
the local formalism. In particular, the fact that the 
metric defines a unique (torsion-free) connection 
preserving it is unaffected by the signature. This 
connection is the one defined by Christoffel's symbols 


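$$\Gamma^a_{bc} = \tfrac{1}{2} g^{ad} (\partial_b g_{cd} + \partial_c g_{bd} - \partial_d g_{bc})$$


where $\partial_a$ stands for coordinate derivative $\partial/\partial x^a$, 
so that the covariant derivative of a vector $V^a$ is 
given by 


$$\nabla_a V^b = \partial_a V^b + V^c \Gamma^b_{ca}$$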


(Here the standard “physicist’s conventions” are 
being used, whereby notation such as “$g_{ab}$” and 
“$V^a$” can be used interchangeably either for the sets 
of components of the metric tensor g and the vector 
V, respectively, or alternatively for the entire 
geometrical metric tensor g or vector V, in each 
case; moreover, the summation convention is being 
assumed, or this can alternatively be understood in 
terms of abstract indices. (For the abstract-index 
notation for tensors, see Penrose and Rindler (1984), 
especially Chapters 2 and 4. Sign and index-ordering 
conventions used here follow those given in that 
book. Many other authors use conventions which 
differ from these in various, usually minor 
respects.)) 
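
As an illustration of how the connection follows from the metric, the Christoffel symbols can be estimated numerically from the standard formula $\Gamma^a_{bc} = \frac{1}{2} g^{ad} (\partial_b g_{cd} + \partial_c g_{bd} - \partial_d g_{bc})$. The sketch below is not taken from the article: it assumes, purely for simplicity, the positive-definite metric of the unit 2-sphere in coordinates $(\theta, \phi)$ (the local formalism being unaffected by signature), uses central finite differences for the coordinate derivatives, and all function names are hypothetical.

```python
import math


def metric(x):
    """Unit 2-sphere metric g_ab in coordinates x = (theta, phi) (an assumed example)."""
    theta, _phi = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]


def inverse_2x2(g):
    """Inverse of a 2x2 matrix, giving g^ab."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]


def d_metric(x, k, h=1e-5):
    """Central-difference estimate of the coordinate derivative d_k g_ij at x."""
    x_plus, x_minus = list(x), list(x)
    x_plus[k] += h
    x_minus[k] -= h
    g_plus, g_minus = metric(x_plus), metric(x_minus)
    n = len(x)
    return [[(g_plus[i][j] - g_minus[i][j]) / (2 * h) for j in range(n)]
            for i in range(n)]


def christoffel(x):
    """Gamma[a][b][c] = 1/2 g^{ad} (d_b g_{cd} + d_c g_{bd} - d_d g_{bc})."""
    n = len(x)
    g_inv = inverse_2x2(metric(x))
    dg = [d_metric(x, k) for k in range(n)]  # dg[k][i][j] = d_k g_ij
    return [[[0.5 * sum(g_inv[a][d] * (dg[b][c][d] + dg[c][b][d] - dg[d][b][c])
                        for d in range(n))
              for c in range(n)]
             for b in range(n)]
            for a in range(n)]


theta = 1.0
gamma = christoffel((theta, 0.3))
# Known closed forms on the unit sphere:
#   Gamma^theta_{phi phi} = -sin(theta) cos(theta)
#   Gamma^phi_{theta phi} = cos(theta) / sin(theta)
print(gamma[0][1][1], -math.sin(theta) * math.cos(theta))
print(gamma[1][0][1], math.cos(theta) / math.sin(theta))
```

Each printed pair should agree to many decimal places, the small residual being finite-difference error.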


Physical Interpretation of the Metric 


Some words of clarification are needed, as to the 
meaning of the metric tensor g, in relativity theory. 
In the early discussions by Einstein and others, the 
spacetime metric tended to be interpreted in terms of 
little “rulers” placed on a curved manifold. 
Although this is natural in the Riemannian 
(positive-definite) case, it is not quite so appropriate 
for the Lorentzian geometry of spacetime manifolds. 
An ordinary physical ruler has a spacetime descrip- 
tion as a timelike strip, and it does not naturally 
express the spatial separation between two 
spacelike-separated events. In order for a ruler to 
measure such a spacelike separation, it would be 
necessary for the two events to be simultaneous in 
the ruler's rest frame, and for this to be assured, 
some further mechanism would be needed, such as 
Einstein's procedure for ensuring simultaneity by the 
use of light signals from the two events to be 
received simultaneously at their midpoint on the 
ruler. Clearly this complicates the issue, and it turns 
out to be much preferable to concentrate on 
temporal displacements rather than spatial ones. 

The idea that spacetime geometry should really be 
regarded as “chronometry,” in this way, has been 
stressed by a number of distinguished expositors of 
relativity theory, most notably John L Synge (1956, 
1960) and Hermann Bondi (1961, 1964, 1967). 
Where needed, spatial displacements can then be 
defined by the use of temporal ones together with 
light signals. This has the additional advantage that 
in modern technology, the measurement of (proper) 
time far surpasses that of distance in accuracy, to the 
extent that the meter is now defined simply by the 
requirement that there are exactly 299792458 of 
them in a light-second! The proper time interval 
between two nearby events is, indeed, measured by a 
clock which encounters both events, moving iner- 
tially between the two, and very precise atomic and 
nuclear clocks are now a common feature of current 
technology. The physical role of the metric $g_{ab}$ is 
most clearly seen in the formula 


$$\tau = \int_p^q (g_{ab}\, dx^a\, dx^b)^{1/2}$$


which measures the (proper) time interval $\tau$ between 
an event $p$ and a later event $q$ on its world line, the 
integral being taken along this curve, and where now 
that curve need not be a geodesic, so that accelerating 
(noninertial) motion of the clock is allowed. The 
metric (with choice of signature $+ - - -$ so that it is 
the timelike displacements that are directly provided 
as real numbers) is very precisely specified by this 
physical requirement, and this tells us that the 
pseudo-Riemannian (Lorentzian) structure of spacetime 
is far from being an arbitrary construction, but 
is given to us by Nature with enormous precision. 
(Some theorists prefer to use the alternative spacetime 
signature $- + + +$, because this more directly relates 
to familiar Newtonian concepts, these being normally 
described in spatial terms. The difference is essentially 
just a notational one, however. It may be remarked 
that the 2-spinor formalism (see Spinors and 
Spin Coefficients) fits in much more readily with the 
$+ - - -$ signature being used here.) It may be noted, 
also, that this time measure is ultimately fixed by 
quantum principles and the masses of the elementary 
ingredients involved (e.g., particle masses) via the 
Einstein and Planck relations $E = mc^2$ and $E = h\nu$, so 
that there is a natural frequency associated with a 
given mass, via $\nu = mc^2/h$ ($c$ being the speed of light 
and $h$ being Planck’s constant). 
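
The proper-time integral given earlier in this section can be illustrated numerically. The following is a minimal sketch under stated assumptions: flat (Minkowski) spacetime with signature $+ - - -$, units with $c = 1$, and a world line supplied as a function of coordinate time; the function names and the two sample world lines are hypothetical.

```python
import math


def proper_time(worldline, t_start, t_end, steps=10000):
    """Approximate tau = integral of (g_ab dx^a dx^b)^(1/2) along a world line.

    Assumes flat spacetime with signature + - - - and units with c = 1, so
    each small step contributes sqrt(dt^2 - dx^2 - dy^2 - dz^2).
    worldline(t) must return the spatial position (x, y, z) at coordinate
    time t and must describe slower-than-light (timelike) motion.
    """
    dt = (t_end - t_start) / steps
    tau = 0.0
    previous = worldline(t_start)
    for i in range(1, steps + 1):
        current = worldline(t_start + i * dt)
        dx_squared = sum((b - a) ** 2 for a, b in zip(previous, current))
        tau += math.sqrt(dt * dt - dx_squared)
        previous = current
    return tau


def stay_at_home(t):
    """Inertial clock at rest at the origin."""
    return (0.0, 0.0, 0.0)


def round_trip(t, speed=0.6, turnaround=5.0):
    """Accelerated clock: out along x at constant speed, then back at the same speed."""
    x = speed * t if t <= turnaround else speed * (2 * turnaround - t)
    return (x, 0.0, 0.0)


tau_home = proper_time(stay_at_home, 0.0, 10.0)
tau_trip = proper_time(round_trip, 0.0, 10.0)
print(tau_home, tau_trip)
```

Both clocks pass through the same pair of events, yet the inertial clock records about 10 units of proper time and the clock on the round trip about 8, which shows that the integral depends on the curve along which it is taken.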


Riemann Curvature and Geodesic 
Deviation 


The unique torsion-free (Christoffel-Levi-Civita) 
connection $\nabla_a$ is, via this physically determined 
metric, also fixed accordingly by these physical 
considerations, as is the notion of a geodesic, and 
therefore so also is the curvature. The 20-independent- 
component Riemann curvature tensor $R_{abcd}$ may be 
defined by 


$$(\nabla_a \nabla_b - \nabla_b \nabla_a) V^d = R_{abc}{}^d V^c$$


with normal index-raising/lowering conventions, so 
that $R_{abcd} = R_{abc}{}^e g_{ed}$, etc., and we have the standard 
classical formula 


$$R_{abc}{}^d = \partial_a \Gamma^d_{bc} - \partial_b \Gamma^d_{ac} + \Gamma^e_{bc} \Gamma^d_{ae} - \Gamma^e_{ac} \Gamma^d_{be}$$


The symmetries $R_{abcd} = R_{cdab} = -R_{bacd}$, $R_{abcd} + R_{bcad} + R_{cabd} = 0$ 
reduce the number of independent components 
of $R_{abcd}$ to 20 (from a potential $4^4 = 256$). 
Of these, 10 are locally fixed by the kind of 
physical requirement indicated above, that in order 
to express something that agrees closely with New- 
ton’s inverse-square law we require that there should 
be a net inward curving of free world lines (the 
timelike geodesics that represent local inertial 
motions, or “free fall” under gravity). Let us see 
how this requirement is satisfied in Einstein’s general 
relativity. 
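
The count of 20 can be cross-checked against the standard closed-form expression $n^2(n^2 - 1)/12$ for the number of independent components of a Riemann tensor with these symmetries in $n$ dimensions. The short sketch below simply evaluates that expression; the function name is illustrative.

```python
def riemann_independent_components(n):
    """Independent components of the Riemann tensor in n dimensions.

    Uses the standard count n^2 (n^2 - 1) / 12, which follows from the pair
    symmetry, the antisymmetries, and the cyclic identity.
    """
    return n * n * (n * n - 1) // 12


for n in (2, 3, 4):
    print(n, riemann_independent_components(n))
```

For $n = 2$ this gives a single component, matching the remark that the intrinsic curvature of a 2-surface is given by one number, and for $n = 4$ it gives the 20 components of spacetime curvature.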

What we find, from Newton’s theory, is that a 
system of test particles which, at some initial time 
constitutes a closed 2-surface at rest surrounding 
some gravitating matter, will begin to accelerate in 
such a way that the volume surrounded is initially 
reduced in proportion to the total mass surrounded. 
This volume reduction is a direct consequence of 
Poisson’s equation $\nabla^2 \Phi = -4\pi G \rho$ ($\Phi$ being the gravitational 
potential and $\rho$ the mass density) and of 
Newton’s second law, which tell us that the second 
time derivative of the free-fall volume of our initially 
stationary closed surface of test particles is indeed 
$-4\pi GM$, where $M$ is the total gravitating mass 
surrounded (and $G$ is Newton’s constant, as above). 
In Einstein’s theory, we can basically carry this over 
to our four-dimensional Lorentzian spacetime. 
We do, however, find that such a general statement 
as this does not exactly hold. Instead of referring to 
3-volumes of any size, we must restrict attention to 
infinitesimal volumes. 

The basic mathematical tool is the equation of 
“geodesic deviation,” namely the “Jacobi equation”: 


$$D^2 u^d = R_{abc}{}^d t^a u^b t^c$$


where $D$ describes “propagation derivative” 


$$D = t^a \nabla_a$$


along a timelike geodesic $\gamma$, where $t^a$ is a unit 
timelike tangent vector to $\gamma$ (so $t_a t^a = g_{ab} t^a t^b = 1$) 
which is (consequently) parallel-propagated along $\gamma$, 


$$D t^a = 0$$


(When acting on a scalar quantity defined along $\gamma$, 
we can read “$D$” as “$d/d\tau$,” where $\tau$ measures 
proper time along $\gamma$.) The vector $u^a$ is what is called 
a connecting vector between the geodesic $\gamma$ and 
some “neighboring geodesic” $\gamma'$. We think of the 
vector $u^a$ as “connecting” a point $p$ on $\gamma$ to some 
neighboring point $p'$ on $\gamma'$, where it is usual to take 
$u^a$ to be orthogonal to $t^a$ (i.e., $u_a t^a = 0$). The 
derivative $D u^a$ measures the rate of change of $u^a$, 
as $p$ and $p'$ move together into the future along $\gamma$. 
Mathematically, we express this as the vanishing of 
the Lie derivative of $u^a$ with respect to $t^a$ (with $t^a$ 
extended to a unit vector field which is tangent both 
to $\gamma$ and to $\gamma'$). By taking three independent vectors $u^a$ 
at $p$, we can form a spatial 3-volume element $W$ and 
investigate how this propagates along $\gamma$. We find 


$$D^2 W = W R_{ab} t^a t^b$$

where the Ricci tensor $R_{ab}$ ($= R_{ba}$) is here defined by 

$$R_{ab} = R_{acb}{}^c$$


The Einstein Field Equations 


In view of what has been said above, with regard to 
the way that the acceleration of volume behaves in 
Newtonian theory, it would be natural to “identify” 
$R_{ab} t^a t^b$ with ($-4\pi G \times$) the (active gravitational) 
mass density, with respect to the time 
direction $t^a$. In (special) relativity theory we expect 
to identify mass density with $c^{-2} \times$ energy density 
(by $E = mc^2$) and to take energy density as just one 
component (the time-time component) of a symmetric 
tensor $T_{ab}$ called the “energy tensor,” and 
for simplicity we now take $c = 1$. The tensor 
quantity $T_{ab}$ is to incorporate the contributions to 
the local mass/energy density of all particles and 
fields other than gravity itself. Since we would 
require this to work for all choices of time- 
direction $t^a$, it would be natural, accordingly, to 
make the identification 


$$R_{ab} = -4\pi G T_{ab}$$


Indeed, this was Einstein's initial choice for a 
gravitational field equation. However, this will 
actually not do, as Einstein later realized. The 
trouble comes from the Bianchi identity 


$$\nabla_a R_{bcde} + \nabla_b R_{cade} + \nabla_c R_{abde} = 0$$

from which we deduce 

$$\nabla^a (R_{ab} - \tfrac{1}{2} R g_{ab}) = 0$$

where 

$$R = R_a{}^a$$


This causes trouble in connection with the standard 
requirement on the energy tensor, that it satisfy the 
local “conservation law” 


$$\nabla^a T_{ab} = 0$$


The latter equation is an essential requirement in special 
relativity, since it expresses the conservation of energy 
and momentum for fields in flat spacetime. In standard 
Minkowski coordinates, each of $T_{a0}$, $T_{a1}$, $T_{a2}$, $T_{a3}$ 
satisfies an equation just like the $\nabla^a J_a = 0$ of the 
charge-current vector $J_a$ of Maxwell’s theory of 
electromagnetism, with now $\nabla_a = \partial_a = \partial/\partial x^a$, which 
expresses global conservation of charge. Similarly to the 
way that $J_a$ encapsulates density and flux of electric 
charge, $T_{a0}$ encapsulates density and flux of energy, and 
$T_{a1}$, $T_{a2}$, $T_{a3}$ encapsulate the same for the three 
components of momentum. So the equation 
$\nabla^a T_{ab} = 0$ is essential in special relativity, for similarly 
expressing global conservation of energy and momentum. 
We find (referring to a local inertial frame) that, 
when we pass to general relativity, this equation should 
still hold, with $\nabla_a$ now standing for covariant 
derivative. But the initially proposed field equation 
$R_{ab} = -4\pi G T_{ab}$ would now give us $\nabla^a R_{ab} = 0$, which 
combined with the geometrically necessary 
$\nabla^a (R_{ab} - \tfrac{1}{2} R g_{ab}) = 0$, tells us that $R$ is constant. In turn this 
implies the physically unacceptable requirement that 
$T = T_a{}^a$ is constant (since we have $R = -4\pi G T$). 
Einstein eventually became convinced (by 1915) 
of the modified field equations 


$$R_{ab} - \tfrac{1}{2} R g_{ab} = -8\pi G T_{ab}$$


(the “8” rather than “4” being now needed to fit in 
with the Newtonian limit) and it is these that are 
now commonly referred to as “Einstein’s field 
equations.” (Some authors prefer to use the singular 
form “field equation,” especially if the formula is to 
be read as an abstract-index expression rather than a 
family of component equations, since the tensors 
involved are really single entities.) It may be noted 
that the formula can be rewritten as 


$$R_{ab} = -8\pi G (T_{ab} - \tfrac{1}{2} T g_{ab})$$


from which we deduce that in Einstein’s theory the 
source of gravity is not simply the mass (or equivalently 
energy) density, but there is an additional contribution 
from the pressure (momentum flux, i.e., space-space 
components of $T_{ab}$). This can have significant implications 
for the instability of very large and massive stars in 
highly relativistic regimes, where increases in pressure 
can, paradoxically, actually increase the tendency for a 
star to collapse, owing to its contribution to the 
attractive effect of its gravity. 

In 1917, Einstein put forward a slight modification 
of his field equations (basically the only 
modification that can be made without fundamentally 
changing the foundations of his theory) by 
introducing the very tiny cosmological constant $\Lambda$. 
The modified equations are 


$$R_{ab} - \tfrac{1}{2} R g_{ab} + \Lambda g_{ab} = -8\pi G T_{ab}$$


and the source of gravity, or active gravitational 
mass, is now 


$$\rho + P_1 + P_2 + P_3 - \frac{\Lambda}{4\pi G}$$


where (with respect to a local Lorentzian-orthonormal 
frame, units being chosen so that $c = 1$) $\rho = T_{00}$ is the 
mass/energy density and $P_1 = T_{11}$, $P_2 = T_{22}$, $P_3 = T_{33}$ 
are the principal pressures. The $\Lambda$-term, for positive $\Lambda$, 
provides a repulsive contribution to the gravitational 
effect, but it is extremely tiny (and totally ignorable) on 
all ordinary scales, beginning to show itself only at the 
most vast of observed cosmological distances (since the 
effect of $\Lambda$ adds up relentlessly at larger and larger 
distances). Einstein originally introduced the term in 
order to have the possibility of a static universe, where 
the attractive gravitational effect of the totality of 
ordinary matter would be balanced, overall, by $\Lambda$. But 
the discovery of the expansion of the universe (by 
Hubble and others) led Einstein to abandon the 
cosmological term. However, since 1998 (initially 
from the supernova observations of Brian Schmidt and 
Robert Kirshner, and Saul Perlmutter, see Perlmutter 
et al. (1998)), cosmological evidence has mounted in 
favor of the presence of a very small positive $\Lambda$-term, 
which has resulted in the expansion of the universe 
beginning to accelerate. While the presence of Einstein's 
constant $\Lambda$-term is consistent with observations, and 
remains the simplest explanation of this observed 
acceleration, many cosmologists prefer to allow for 
what would amount to a “varying $\Lambda$,” and refer to it as 
“dark energy.” 
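
The way pressure and the cosmological term enter the source of gravity can be made concrete with a small calculation. The sketch below evaluates the active gravitational mass density, mass/energy density plus the three principal pressures minus $\Lambda/4\pi G$, in a local orthonormal frame with $c = 1$; the function name, the default $G = 1$, and the sample inputs are assumptions made only for illustration.

```python
import math


def active_mass_density(rho, pressures=(0.0, 0.0, 0.0), cosmological_constant=0.0, G=1.0):
    """Active gravitational mass density: rho + P1 + P2 + P3 - Lambda / (4 pi G).

    Units with c = 1; rho and the principal pressures refer to a local
    Lorentzian-orthonormal frame.
    """
    return rho + sum(pressures) - cosmological_constant / (4.0 * math.pi * G)


# Pressure-free matter ("dust"): only the mass/energy density contributes.
dust = active_mass_density(rho=1.0)

# Isotropic radiation, with each principal pressure equal to rho / 3:
# the pressures double the effective source.
radiation = active_mass_density(rho=3.0, pressures=(1.0, 1.0, 1.0))

# Empty space with a positive cosmological constant: the source is negative,
# corresponding to a repulsive contribution.
vacuum = active_mass_density(rho=0.0, cosmological_constant=1.0)

print(dust, radiation, vacuum)
```

The radiation case illustrates how increased pressure strengthens the attractive effect, while the last case shows the sign of the contribution from a positive $\Lambda$.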


Energy Conservation and Related Matters 


One of the features of Einstein’s general relativity 
theory that had been deeply puzzling to a good many of 
Einstein's contemporaries, and which may be said to be 
still not fully resolved, even today, is *energy conserva- 
tion,” in the presence of a dynamical gravitational field. 
We have noted that the energy tensor T, is to 
incorporate the contributions of all particles and fields 
other than gravity. But what about gravity itself? There 
are many physical situations in which energy can be 
transferred back and forth from gravitational systems 
to nongravitational ones (most strikingly in the example 
ofthe emission of gravitational waves; see Gravitational 
Waves). The conservation of energy would make no 
sense without an understanding of how energy can be 
stored in a gravitational field. At first sight we seem to 
see no role for a gravitational contribution to energy in 
Einstein's theory, since the conservation law V“T,, — 0 
seems to be a self-contained expression of energy 
conservation with no direct contribution from the 
gravitational field in the tensor T,,,. However, this is 
illusory, since the formulation of a global conservation 
law from the local covariant expression VT, = 0 does 
not work in curved spacetime (basically because, unlike 
the charge-current quantity /; of Maxwell's electro- 
dynamical theory, the extra index on T,, prevents it 
from being regarded as a 1-form). We may take the view 
that the energy of gravitation enters nonlocally into the 
equation, so that the failure of To to provide a global 
conservation law on its own is an expression of the 
gravitational contributions of energy not being taken 
into account. This is no doubt a correct attitude to take, 
but it is a difficult one to express comprehensively in a 
mathematical form. Einstein himself provided a partial 
understanding, but at the expense of introducing 
concepts known as “pseudotensors” whose meaning 
was too tied up with arbitrary choices of coordinate 
systems to provide an overall picture. In modern 
approaches, the most clear-cut results come from the 
study of asymptotically flat or asymptotically de Sitter 
spacetimes (de Sitter space being the empty universe 
which takes over the role of Minkowski space when
there is a positive cosmological constant $\Lambda$; see
Cosmology: Mathematical Aspects).
The important role of the “Weyl conformal tensor”


$$C_{abcd} = R_{abcd} - \tfrac{1}{2}\,(R_{ac}g_{bd} - R_{bc}g_{ad} + R_{bd}g_{ac} - R_{ad}g_{bc}) + \tfrac{1}{6}\,R\,(g_{ac}g_{bd} - g_{bc}g_{ad})$$


should also be pointed out. This tensor retains all
the symmetries of the full Riemann tensor, but has
the Ricci tensor contribution removed, so that all its
contractions vanish, as is exemplified by


$$C^{a}{}_{bac} = 0$$


It describes the conformal part of the curvature, that 
is, that part that survives under conformal rescalings 
of the metric; 


$$g_{ab} \mapsto \Omega^2 g_{ab}$$


where $\Omega$ is a smooth (positive) function of position. The
tensor $C^{a}{}_{bcd}$ is itself invariant under these conformal
rescalings. This has importance in the asymptotic
analysis of gravitational fields (see Asymptotic Structure
and Conformal Infinity). We may take the view
that $C_{abcd}$ describes the degrees of freedom in the free
gravitational field, whereas $R_{ab}$ contains the information
of the sources of gravity. This is analogous to the
Maxwell tensor $F_{ab}$ describing the degrees of freedom
in the free electromagnetic field, whereas $J_a$ contains
the information of the sources of electromagnetism.
From the observational point of view, general 
relativity stands in excellent shape, with full agreement 
with all known relevant data, starting with the 
anomalous perihelion advance of the planet Mercury 
observed by LeVerrier in the mid-nineteenth century, 
through clock-slowing, light-bending (lensing) and 
time-delay effects, and the necessary corrections to 
GPS positioning systems, to the precise orbiting of 
double neutron-star systems, with energy loss due to the 
emission of gravitational waves. The effects of gravitational
lensing now play vital roles in modern cosmology.
To get some idea of the precision in Einstein’s theory, we
may take note of the fact that the double neutron-star
system PSR 1913+16 has been observed for some
30 years, and the agreement between observation and
theory overall is to about one part in $10^{14}$.


See also: Asymptotic Structure and Conformal Infinity;
Canonical General Relativity; Computational Methods in
General Relativity: the Theory; Cosmology: Mathematical
Aspects; Einstein Equations: Exact Solutions; Einstein
Equations: Initial Value Formulation; Einstein-Cartan
Theory; Einstein's Equations with Matter; General
Relativity: Experimental Tests; Geometric Flows and the
Penrose Inequality; Gravitational Lensing; Gravitational
Waves; Hamiltonian Reduction of Einstein's Equations;
Lorentzian Geometry; Newtonian Limit of General
Relativity; Noncommutative geometry and the Standard
Model; Spacetime Topology, Causal Structure and
Singularities; Spinors and Spin Coefficients; Symmetries
and Conservation Laws; Twistor Theory: Some
Applications [in Integrable Systems, Complex Geometry
and String Theory].


Generic Properties of Dynamical Systems


Introduction


The state of a concrete system (from physics,
chemistry, ecology, or other sciences) is described
using (finitely many, say $n$) observable quantities
(e.g., positions and velocities for mechanical
systems, population densities for ecological
systems, etc.). Hence, the state of a system may be
represented as a point $x$ in a geometrical space $\mathbb{R}^n$.
In many cases, the quantities describing the state are
related, so that the phase space (space of all possible
states) is a submanifold $M \subset \mathbb{R}^n$. The time evolution
of the system is represented by a curve $x_t$, $t \in \mathbb{R}$,
drawn on the phase space $M$, or by a sequence $x_n \in M$,
$n \in \mathbb{Z}$, if we consider discrete time (e.g., every
day at the same time, or every January 1st).
Believing in determinism, and if the system is
isolated from external influences, the state $x_0$ of the
system at the present time determines its evolution.
For continuous-time systems, the infinitesimal
evolution is given by a differential equation or vector
field $dx/dt = X(x)$; the vector $X(x)$ represents the velocity
and direction of the evolution. For a discrete-time
system, the evolution rule is a function $F: M \to M$; if
$x$ is the state at time $t$, then $F(x)$ is the state at
time $t+1$. The evolution of the system, starting at
the initial data $x_0$, is described by the orbit of $x_0$, that
is, the sequence $\{(x_n)_{n \in \mathbb{Z}} \mid x_{n+1} = F(x_n)\}$ (discrete
time) or the maximal solution $x_t$ of the differential
equation $dx/dt = X(x)$ (continuous time).
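
As a minimal sketch of the discrete-time case, the forward part of an orbit can be computed by iterating the evolution rule. The function names and the rule `halve` below are assumptions chosen only for illustration; they do not come from the text.

```python
def orbit_segment(F, x0, n_steps):
    """Return [x0, F(x0), ..., F^n_steps(x0)] for a discrete-time rule F."""
    points = [x0]
    for _ in range(n_steps):
        points.append(F(points[-1]))
    return points


def halve(x):
    # Hypothetical evolution rule, chosen only because its orbit is easy to read.
    return x / 2.0


print(orbit_segment(halve, 1.0, 4))  # [1.0, 0.5, 0.25, 0.125, 0.0625]
```

Only a finite segment of the orbit is ever computed this way; statements about the long-time behavior still require an argument.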


General problem Knowing the initial data and the
infinitesimal evolution rule, what can we tell about
the long-time evolution of the system?


The dynamics of a dynamical system (differential
equation or function) is the behavior of the orbits
when the time tends to infinity. The aim of
“dynamical systems” is to produce a general
procedure for describing the dynamics of any
system. For example, Conley's theory presented in
the next section organizes the global dynamics of a
general system using regions concentrating the orbit
accumulation and recurrence and splits these regions
into elementary pieces: the chain recurrence classes.

We focus our study on $C^r$-diffeomorphisms $F$ (i.e., $F$
and $F^{-1}$ are $r$ times continuously differentiable) on a
compact smooth manifold $M$ (most of the notions and
results presented here also hold for vector fields). Even
for very regular systems ($F$ algebraic) of a low-dimensional
space ($\dim(M) = 2$), the dynamics may
be chaotic and very unstable: one cannot hope for a
precise description of all systems. Furthermore, neither
the initial data of a concrete system nor the
infinitesimal-evolution rule are known exactly: fragile properties
describe the evolution of the theoretical model, and
not of the real system. For these reasons, we are mostly
interested in properties that are persistent, in some
sense, under small perturbations of the dynamical system.
The notion of small perturbations of the system
requires a topology on the space $\mathrm{Diff}^r(M)$ of
$C^r$-diffeomorphisms: two diffeomorphisms are close for
the $C^r$-topology if all their partial derivatives of order
$\le r$ are close at each point of $M$. Endowed with this
topology, $\mathrm{Diff}^r(M)$ is a complete metric space.

The open and dense subsets of $\mathrm{Diff}^r(M)$ provide the
natural topological notion of “almost all” $F$. Genericity
is a weaker notion: by Baire's theorem, if $\mathcal{O}_i$, $i \in \mathbb{N}$, are
dense and open subsets, the intersection $\bigcap_{i \in \mathbb{N}} \mathcal{O}_i$ is a
dense subset. A subset is called residual if it contains
such a countable intersection of dense open subsets. A
property $\mathcal{P}$ is generic if it is verified on a residual
subset. By a practical abuse of language, one says:


“$C^r$-generic diffeomorphisms verify $\mathcal{P}$”


A countable intersection of residual sets is a residual
set. Hence, if $\{\mathcal{P}_i\}$, $i \in \mathbb{N}$, is a countable family of
generic properties, generic diffeomorphisms verify
simultaneously all the properties $\mathcal{P}_i$.

A property $\mathcal{P}$ is $C^r$-robust if the set of diffeomorphisms
verifying $\mathcal{P}$ is open in $\mathrm{Diff}^r(M)$. A
property $\mathcal{P}$ is locally generic if there is a (nonempty)
open set $\mathcal{O}$ on which it is generic, that is, there is a
residual set $\mathcal{R}$ such that $\mathcal{P}$ is verified on $\mathcal{R} \cap \mathcal{O}$.

The properties of generic dynamical systems
depend mostly on the dimension of the manifold $M$
and on the $C^r$-topology considered, $r \in \mathbb{N} \cup \{+\infty\}$
(an important problem is that $C^r$-generic diffeomorphisms
are not $C^{r+1}$):


• On very low dimensional spaces (diffeomorphisms of
the circle and vector fields on compact surfaces) the
dynamics of generic systems (indeed in an open and
dense subset of systems) is very simple (called
Morse-Smale) and well understood; see the subsection
“Generic properties of the low-dimensional
systems.”

• In higher dimensions, for the $C^r$-topology, $r > 1$, one
has generic and locally generic properties related
to the periodic orbits, like the Kupka-Smale
property (see the subsection “Kupka-Smale theorem”)
and the Newhouse phenomenon (see the
subsection “Local $C^2$-genericity of wild behavior
for surface diffeomorphisms”). However, we still
do not know if the dynamics of $C^r$-generic
diffeomorphisms is well approached by their
periodic orbits, so that one is still far from a
global understanding of $C^r$-generic dynamics.

• For the $C^1$-topology, perturbation lemmas show that
the global dynamics is very well approximated by
periodic orbits (see the section “$C^1$-generic systems:
global dynamics and periodic orbits”). One then
divides generic systems into “tame” systems, with a
global dynamics analogous to hyperbolic dynamics,
and “wild” systems, which present infinitely many
dynamically independent regions. The notion of
dominated splitting (see the section “Hyperbolic
properties of $C^1$-generic diffeomorphisms”) seems
to play an important role in this division.


Results on General Systems 
Notions of Recurrence 


Some regions of M are considered as the heart of the 
dynamics: 


• $\mathrm{Per}(F)$ denotes the set of periodic points $x \in M$ of
$F$, that is, $F^n(x) = x$ for some $n > 0$.

• A point $x$ is recurrent if its orbit comes back
arbitrarily close to $x$, infinitely many times.
$\mathrm{Rec}(F)$ denotes the set of recurrent points.

• The limit set $\mathrm{Lim}(F)$ is the union of all the
accumulation points of all the orbits of $F$.

• A point $x$ is “wandering” if it admits a neighborhood
$U_x \subset M$ disjoint from all its iterates
$F^n(U_x)$, $n > 0$. The nonwandering set $\Omega(F)$ is the
set of the nonwandering points.

• $\mathcal{R}(F)$ is the set of chain recurrent points, that is,
points $x \in M$ which look like periodic points if we
allow small mistakes at each iteration: for any
$\varepsilon > 0$, there is a sequence $x = x_0, x_1, \ldots, x_k = x$
where $d(F(x_i), x_{i+1}) < \varepsilon$ (such a sequence is an
$\varepsilon$-pseudo-orbit).


A periodic point is recurrent, a recurrent point is a 
limit point, a limit point is nonwandering, and a 
nonwandering point is chain recurrent: 


$$\mathrm{Per}(F) \subset \mathrm{Rec}(F) \subset \mathrm{Lim}(F) \subset \Omega(F) \subset \mathcal{R}(F)$$


All these sets are invariant under $F$, and $\Omega(F)$ and
$\mathcal{R}(F)$ are compact subsets of $M$. There are diffeomorphisms
$F$ for which the closures of these sets are
distinct:


• A rotation $x \mapsto x + \alpha$ with irrational angle
$\alpha \in \mathbb{R} \setminus \mathbb{Q}$ on the circle $S^1 = \mathbb{R}/\mathbb{Z}$ has no periodic
points but every point is recurrent.

• The map $x \mapsto x + (1/4\pi)(1 + \cos(2\pi x))$ induces
on the circle $S^1$ a diffeomorphism $F$ having a
unique fixed point at $x = 1/2$; one verifies that
$\Omega(F) = \{1/2\}$ and $\mathcal{R}(F)$ is the whole circle $S^1$.
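
The two circle examples can be explored numerically. The following is a minimal sketch, assuming the golden-mean angle for the rotation and a tolerance of 0.05; both choices, and all function names, are illustrative assumptions rather than part of the text. It builds a closed ε-pseudo-orbit for the rotation from a close return of a true orbit, and checks that the orbit of 1/4 under the second map does not come back near its starting point within the iterations tried.

```python
import math

ALPHA = (math.sqrt(5) - 1) / 2  # an irrational rotation angle (assumed for illustration)


def circle_dist(a, b):
    """Distance between two points of the circle R/Z."""
    d = abs(a - b) % 1.0
    return min(d, 1.0 - d)


def rotation(x):
    return (x + ALPHA) % 1.0


def one_fixed_point(x):
    # The map x -> x + (1/4pi)(1 + cos(2 pi x)) from the text, taken mod 1.
    return (x + (1 + math.cos(2 * math.pi * x)) / (4 * math.pi)) % 1.0


def first_close_return(F, x0, eps, max_iter):
    """Smallest n >= 1 with d(F^n(x0), x0) < eps, or None if none is found."""
    x = x0
    for n in range(1, max_iter + 1):
        x = F(x)
        if circle_dist(x, x0) < eps:
            return n
    return None


def is_pseudo_orbit(F, points, eps):
    """Check d(F(x_i), x_{i+1}) < eps along the whole sequence."""
    return all(circle_dist(F(p), q) < eps for p, q in zip(points, points[1:]))


eps = 0.05
n = first_close_return(rotation, 0.0, eps, 1000)
# Follow the true orbit for n - 1 steps, then jump back to the starting point.
closed = [0.0]
for _ in range(n - 1):
    closed.append(rotation(closed[-1]))
closed.append(0.0)

print(n, is_pseudo_orbit(rotation, closed, eps))
print(first_close_return(one_fixed_point, 0.25, eps, 1000))
```

A finite search of this kind can only suggest recurrence or wandering; it does not prove either.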


An invariant compact set $K \subset M$ is transitive if there
is $x \in K$ whose forward orbit is dense in $K$. Generic
points $x \in K$ have their forward and backward
orbits dense in $K$: in this sense, transitive sets are
dynamically indecomposable.


Conley's Theory: Pairs Attractor/Repeller and 
Chain Recurrence Classes 


A trapping region $U \subset M$ is a compact set whose
image $F(U)$ is contained in the interior of $U$. By
definition, the intersection $A = \bigcap_{n \ge 0} F^n(U)$ is an
attractor of $F$: any orbit in $U$ “goes to $A$.” Denote by
$V$ the complement of the interior of $U$: it is a trapping
region for $F^{-1}$ and the intersection $R = \bigcap_{n \ge 0} F^{-n}(V)$ is
a repeller. Each orbit either is contained in $A \cup R$, or
“goes from the repeller to the attractor.” More
precisely, there is a smooth function $\psi: M \to [0,1]$
(called a Lyapunov function) equal to 1 on $R$ and 0 on $A$,
and strictly decreasing on the other orbits:


$$\psi(F(x)) < \psi(x) \quad \text{for } x \notin A \cup R$$


So, the chain recurrent set is contained in $A \cup R$.
Any compact set contained in $U$ and containing the
interior of $F(U)$ is a trapping region inducing the
same attractor and repeller pair $(A, R)$; hence, the set
of attractor/repeller pairs is countable. We denote by
$(A_i, R_i, \psi_i)$, $i \in \mathbb{N}$, the family of these pairs endowed
with an associated Lyapunov function. Conley
(1978) proved that


$$\mathcal{R}(F) = \bigcap_{i \in \mathbb{N}} (A_i \cup R_i)$$

This induces a natural partition of $\mathcal{R}(F)$ into equivalence
classes: $x \sim y$ if $x \in A_i \Leftrightarrow y \in A_i$ for every $i$. Conley proved
that $x \sim y$ iff, for any $\varepsilon > 0$, there are $\varepsilon$-pseudo-orbits
from $x$ to $y$ and vice versa. The equivalence classes
for $\sim$ are called chain recurrence classes.

Now, considering an average of the Lyapunov
functions $\psi_i$ one gets the following result: there is a
continuous function $\varphi: M \to \mathbb{R}$ with the following
properties:


• $\varphi(F(x)) \le \varphi(x)$ for every $x \in M$ (i.e., $\varphi$ is a
Lyapunov function);

• $\varphi(F(x)) = \varphi(x) \Leftrightarrow x \in \mathcal{R}(F)$;

• for $x, y \in \mathcal{R}(F)$, $\varphi(x) = \varphi(y) \Leftrightarrow x \sim y$; and

• the image $\varphi(\mathcal{R}(F))$ is a compact subset of $\mathbb{R}$ with
empty interior.


This result is called the “fundamental theorem of
dynamical systems” by several authors (see
Robinson (1999)).

Any orbit is $\varphi$-decreasing from a chain recurrence
class to another chain recurrence class (the global
dynamics of $F$ looks like the dynamics of the
gradient flow of a function $\varphi$, the chain recurrence
classes supplying the singularities of $\varphi$). However,
this description of the dynamics may be very rough:
if $F$ preserves the volume, Poincaré’s recurrence
theorem implies that $\Omega(F) = \mathcal{R}(F) = M$; the whole $M$
is the unique chain recurrence class and the function
$\varphi$ of Conley’s theorem is constant.

Conley’s theory provides a general procedure for 
describing the global topological dynamics of a 
system: one has to characterize the chain recurrence 
classes, the dynamics in restriction to each class, 
the stable set of each class (i.e., the set of points
whose positive orbits go to the class), and the
relative positions of these stable sets. 


Hyperbolicity 


Smale’s hyperbolic theory is the first attempt to give 
a global vision of almost all dynamical systems. In 
this section we give a very quick overview of this 
theory. For further details, see Hyperbolic Dynami- 
cal Systems. 


Hyperbolic Periodic Orbits 


A fixed point $x$ of $F$ is hyperbolic if the derivative
$DF(x)$ has no (neither real nor complex) eigenvalue
with modulus equal to 1. The tangent space at $x$
splits as $T_xM = E^s \oplus E^u$, where $E^s$ and $E^u$ are the
$DF(x)$-invariant spaces corresponding to the eigenvalues
of moduli $< 1$ and $> 1$, respectively. There are
$C^r$-injectively immersed $F$-invariant submanifolds
$W^s(x)$ and $W^u(x)$ tangent at $x$ to $E^s$ and $E^u$; the
stable manifold $W^s(x)$ is the set of points $y$ whose
forward orbit goes to $x$. The implicit-function
theorem implies that a hyperbolic fixed point $x$
varies (locally) continuously with $F$; (compact parts
of) the stable and unstable manifolds vary continuously
for the $C^r$-topology when $F$ varies with the
$C^r$-topology.
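
The hyperbolicity condition on a fixed point is easy to test in dimension 2, where the eigenvalues of the derivative follow from its trace and determinant. The sketch below assumes the derivative is given as a real 2×2 matrix; the function names and the numerical tolerance are assumptions made for illustration.

```python
import cmath


def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] from the characteristic polynomial."""
    tr = a + d
    det = a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return ((tr + disc) / 2, (tr - disc) / 2)


def classify_fixed_point(jacobian, tol=1e-9):
    """Classify a fixed point from the moduli of the eigenvalues of DF(x)."""
    (a, b), (c, d) = jacobian
    moduli = [abs(lam) for lam in eigenvalues_2x2(a, b, c, d)]
    # An eigenvalue of modulus 1 (up to tol, a numerical assumption) breaks hyperbolicity.
    if any(abs(m - 1.0) <= tol for m in moduli):
        return "nonhyperbolic"
    n_stable = sum(m < 1 for m in moduli)  # dimension of E^s
    return {2: "sink", 1: "saddle", 0: "source"}[n_stable]


print(classify_fixed_point(((2, 1), (1, 1))))   # saddle
print(classify_fixed_point(((0, -1), (1, 0))))  # nonhyperbolic
```

The number of eigenvalues of modulus less than 1 is the dimension of $E^s$, so a saddle in this sketch has one-dimensional stable and unstable directions.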

A periodic point $x$ of period $n$ is hyperbolic if it is
a hyperbolic fixed point of $F^n$ and its invariant
manifolds are the corresponding invariant manifolds
for $F^n$. The stable and unstable manifolds of the orbit
of $x$, $W^s_{\mathrm{orb}}(x)$ and $W^u_{\mathrm{orb}}(x)$, are the unions of the
invariant manifolds of the points in the orbit.


Homoclinic Classes 


Distinct stable manifolds are always disjoint; however,
stable and unstable manifolds may intersect. At
the end of the nineteenth century, Poincaré noted
that the existence of transverse homoclinic orbits,
that is, transverse intersection of $W^s_{\mathrm{orb}}(x)$ with
$W^u_{\mathrm{orb}}(x)$ (other than the orbit of $x$), implies a very
rich dynamical behavior: indeed, Birkhoff proved
that any transverse homoclinic point is accumulated
by a sequence of periodic orbits (see Figure 1). The
homoclinic class $H(x)$ of a periodic orbit is the
closure of the transverse homoclinic points associated
to $x$:


$$H(x) = \overline{W^s_{\mathrm{orb}}(x) \pitchfork W^u_{\mathrm{orb}}(x)}$$


There is an equivalent definition of the homoclinic
class of $x$: we say that two hyperbolic periodic
points $x$ and $y$ are homoclinically related if $W^s_{\mathrm{orb}}(x)$
and $W^u_{\mathrm{orb}}(x)$ intersect transversally $W^u_{\mathrm{orb}}(y)$ and
$W^s_{\mathrm{orb}}(y)$, respectively; this defines an equivalence
relation in $\mathrm{Per}_{\mathrm{hyp}}(F)$ and the homoclinic classes are
the closure of the equivalence classes.

The homoclinic classes are transitive invariant
compact sets canonically associated to the periodic
orbits. However, for general systems, homoclinic
classes are not necessarily disjoint.
For more details, see Homoclinic Phenomena.


Figure 1 A transverse homoclinic orbit.


Smale’s Hyperbolic Theory 


A diffeomorphism $F$ is Morse-Smale if $\Omega(F) = \mathrm{Per}(F)$
is finite and hyperbolic, and if $W^s(x)$ is transverse to
$W^u(y)$ for any $x, y \in \mathrm{Per}(F)$. Morse-Smale diffeomorphisms
have a very simple dynamics, similar to
that of the gradient flow of a Morse function; apart
from periodic points and invariant manifolds of
periodic saddles, each orbit goes from a source to a
sink (hyperbolic periodic repellers and attractors).
Furthermore, Morse-Smale diffeomorphisms are
$C^1$-structurally stable, that is, any diffeomorphism
$C^1$-close to $F$ is conjugate to $F$ by a homeomorphism:
the topological dynamics of $F$ remains unchanged by
small $C^1$-perturbations. Morse-Smale vector fields
were known (Andronov and Pontryagin, 1937) to
characterize the structural stability of vector fields on
the sphere $S^2$. However, a diffeomorphism having
transverse homoclinic intersections is robustly not
Morse-Smale, so that Morse-Smale diffeomorphisms
are not $C^1$-dense on any compact manifold of
dimension $\ge 2$. In the early 1960s, Smale generalized
the notion of hyperbolicity for nonperiodic sets in
order to get a model for homoclinic orbits. The goal of
the theory was to cover a whole dense open set of all
dynamical systems.

An invariant compact set $K$ is hyperbolic if the
tangent space $TM|_K$ of $M$ over $K$ splits as the direct
sum $TM|_K = E^s \oplus E^u$ of two $DF$-invariant vector
bundles, where the vectors in $E^s$ and $E^u$ are
uniformly contracted and expanded, respectively,
by $F^n$, for some $n > 0$. Hyperbolic sets persist
under small $C^1$-perturbations of the dynamics: any
diffeomorphism $G$ which is $C^1$-close enough to $F$
admits a hyperbolic compact set $K_G$ close to $K$ and
the restrictions of $F$ and $G$ to $K$ and $K_G$ are
conjugate by a homeomorphism close to the
identity. Hyperbolic compact sets have well-defined
invariant (stable and unstable) manifolds,
tangent (at the points of $K$) to $E^s$ and $E^u$ and the
(local) invariant manifolds of $K_G$ vary locally
continuously with $G$.

The existence of hyperbolic sets is very common: 
if y is a transverse homoclinic point associated to a 
hyperbolic periodic point x, then there is a transitive 
hyperbolic set containing x and y. 

Diffeomorphisms for which $\mathcal{R}(F)$ is hyperbolic
are now well understood: the chain recurrence
classes are homoclinic classes, finitely many, and
transitive, and admit a combinatorial model
(subshift of finite type). Some of them are
attractors or repellers, and the basins of the
attractors cover a dense open subset of $M$. If,
furthermore, all the stable and unstable manifolds
of points in $\mathcal{R}(F)$ are transverse, the diffeomorphism
is $C^1$-structurally stable (Robbin 1971,
Robinson 1976); indeed, this condition, called
“axiom A + strong transversality,” is equivalent
to the $C^1$-structural stability (Mané 1988).

In 1970, Abraham and Smale built examples of 
robustly non-axiom A diffeomorphisms, when 
dim M > 3: the dream of a global understanding 
of dynamical systems was postponed. However, 
hyperbolicity remains a key tool in the study of
dynamical systems, even for nonhyperbolic
systems.


$C^r$-Generic Systems
Kupka-Smale Theorem 


Thom’s transversality theorem asserts that two
submanifolds can always be put in transverse position
by $C^r$-small perturbations. Hence, for $F$ in an
open and dense subset of $\mathrm{Diff}^r(M)$, $r \ge 1$, the graph
of $F$ in $M \times M$ is transverse to the diagonal
$\Delta = \{(x,x), x \in M\}$: $F$ has finitely many fixed points
$x_i$, depending locally continuously on $F$, and 1 is not
an eigenvalue of the differential $DF(x_i)$. Small local
perturbations in the neighborhood of the $x_i$ avoid
eigenvalues of modulus equal to 1: one gets a dense
and open subset $\mathcal{O}^r_1$ of $\mathrm{Diff}^r(M)$ such that every fixed
point is hyperbolic. This argument, adapted for
periodic points, provides a dense and open set
$\mathcal{O}^r_n \subset \mathrm{Diff}^r(M)$, such that every periodic point of period $n$
is hyperbolic. Now $\bigcap_{n \in \mathbb{N}} \mathcal{O}^r_n$ is a residual subset of
$\mathrm{Diff}^r(M)$, for which every periodic point is
hyperbolic.

Similarly, the set of diffeomorphisms
$F \in \bigcap_{i \le n} \mathcal{O}^r_i$ such that all the disks of size $n$ of
invariant manifolds of periodic points of period less
than $n$ are pairwise transverse is open and dense.
One gets the Kupka-Smale theorem (see Palis and de
Melo (1982) for a detailed exposition): for $C^r$-generic
diffeomorphisms $F \in \mathrm{Diff}^r(M)$, every periodic orbit is
hyperbolic and $W^s(x)$ is transverse to $W^u(y)$ for
$x, y \in \mathrm{Per}(F)$.


Generic Properties of Low-Dimensional Systems 


Poincaré-Denjoy theory describes the topological
dynamics of all diffeomorphisms of the circle $S^1$ (see
Homeomorphisms and Diffeomorphisms of the
Circle). Diffeomorphisms in an open and dense
subset of $\mathrm{Diff}^r_+(S^1)$ have a nonempty finite set of
periodic orbits, all hyperbolic, and alternately
attracting (sink) or repelling (source). The orbit of
a nonperiodic point comes from a source and goes
to a sink. Two $C^r$-generic diffeomorphisms of $S^1$ are
conjugate iff they have the same rotation number and
the same number of periodic points.
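
The source-to-sink picture can be seen on a concrete circle map. The following is a minimal sketch using the hypothetical family $x \mapsto x + a \sin(2\pi x)$ on $\mathbb{R}/\mathbb{Z}$ with $a = 0.1$ (an assumption made for illustration, not an example from the text); it has a repelling fixed point at 0 and an attracting one at 1/2.

```python
import math

A = 0.1  # hypothetical amplitude; 2*pi*A < 1 keeps the derivative positive


def F(x):
    return (x + A * math.sin(2 * math.pi * x)) % 1.0


def dF(x):
    """Derivative of the lift x -> x + A sin(2 pi x)."""
    return 1 + 2 * math.pi * A * math.cos(2 * math.pi * x)


def iterate(x, n):
    for _ in range(n):
        x = F(x)
    return x


# Derivative > 1 at the source 0, and between 0 and 1 at the sink 1/2.
print(dF(0.0), dF(0.5))
# Orbits starting on either side of the sink approach it.
print(iterate(0.1, 100), iterate(0.9, 100))
```

Both fixed points are hyperbolic because the derivative there differs from 1 in modulus, which is the situation described above for an open and dense set of circle diffeomorphisms with rotation number zero.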

This simple behavior was generalized in 1962
by Peixoto for vector fields on compact orientable
surfaces $S$. Vector fields $X$ in a $C^r$-dense and open
subset are Morse-Smale, hence structurally stable
(see Palis and de Melo (1982) for a detailed proof).
Peixoto gives a complete classification of these
vector fields, up to topological equivalence.

Peixoto's argument uses the fact that the return
maps of the vector field on transverse sections are
increasing functions: this helped control the effect
on the dynamics of small “monotonous” perturbations,
and allowed him to destroy any nontrivial
recurrences. Peixoto's result remains true on
nonorientable surfaces for the $C^1$-topology, but it remains
an open question for $r > 1$: is the set of Morse-Smale
vector fields $C^2$-dense, for $S$ a nonorientable
closed surface?


Local $C^2$-Genericity of Wild Behavior for Surface
Diffeomorphisms


The generic systems we have seen above have a very
simple dynamics, simpler than the general systems.
This is not always the case. In the 1970s, Newhouse
exhibited a $C^2$-open set $\mathcal{O} \subset \mathrm{Diff}^2(S^2)$ (where $S^2$
denotes the two-dimensional sphere), such that
$C^2$-generic diffeomorphisms $F \in \mathcal{O}$ have infinitely
many hyperbolic periodic sinks. In fact, $C^2$-generic
diffeomorphisms in $\mathcal{O}$ present many other pathological
properties: for instance, it has been recently
noted that they have uncountably many chain
recurrence classes without periodic orbits. Densely
(but not generically) in $\mathcal{O}$, they present many other
phenomena, such as strange (Hénon-like) attractors
(see Lyapunov Exponents and Strange Attractors).
This phenomenon appears each time that a
diffeomorphism $F_0$ admits a hyperbolic periodic
point $x$ whose invariant manifolds $W^s(x)$ and
$W^u(x)$ are tangent at some point $p \in W^s(x) \cap W^u(x)$
($p$ is a homoclinic tangency associated to $x$).
Homoclinic tangencies appear locally as a codimension-1
submanifold of $\mathrm{Diff}^2(S^2)$; they are such a
simple phenomenon that they appear in very natural
contexts. When a small perturbation transforms the
tangency into transverse intersections, a new hyperbolic
set $K$ with very large fractal dimensions is
created. The local stable and unstable manifolds of
$K$, each homeomorphic to the product of a Cantor
set by a segment, present tangencies in a $C^2$-robust way,
that is, for $F$ in some $C^2$-open set $\mathcal{O}$ (see Figure 2).
As a consequence, for a $C^2$-dense subset of $\mathcal{O}$, the
invariant manifolds of the point $x$ present some
tangency (this is not generic, by the Kupka-Smale
theorem). If the Jacobian of $F$ at $x$ is $< 1$, each
tangency allows one to create one more sink, by an
arbitrarily small perturbation. Hence, the sets of
diffeomorphisms having more than $n$ hyperbolic
sinks are dense open subsets of $\mathcal{O}$, and the
intersection of all these dense open subsets is the
announced residual set. See Palis and Takens
(1993) for details on this deep argument.


Figure 2 Robust tangencies.


$C^1$-Generic Systems: Global Dynamics
and Periodic Orbits 


See Bonatti et al. (2004), Chapter 10 and Appendix A, 
for a more detailed exposition and precise 
references. 


Perturbations of Orbits: Closing and Connecting 
Lemmas 


In 1968, Pugh proved the following lemma.


Closing lemma If $x$ is a nonwandering point of a
diffeomorphism $F$, then there are diffeomorphisms
$G$ arbitrarily $C^1$-close to $F$, such that $x$ is periodic
for $G$.


Consider a segment $x_0, \ldots, x_n = F^n(x_0)$ of orbit
such that $x_n$ is very close to $x_0 = x$; one would like
to take $G$ close to $F$ such that $G(x_{n-1}) = x_0$, and
$G(x_i) = F(x_i) = x_{i+1}$ for $i \ne n-1$. This idea works for
the $C^0$-topology (so that the $C^0$-closing lemma is
easy). However, if one wants $G$ $\varepsilon$-$C^1$-close to $F$, one
needs that the points $x_i$, $i \in \{1, \ldots, n-1\}$, remain at
distance $d(x_i, x_0)$ greater than $C\,d(x_n, x_0)/\varepsilon$, where
$C$ bounds $\|DF\|$ on $M$. If $C/\varepsilon$ is very large, such a
segment of orbit does not exist. Pugh solved this
difficulty in two steps: the perturbation is first
spread along a segment of orbit of $x$ in order to
decrease this constant; then a subsegment $y_0, \ldots, y_k$
of $x_0, \ldots, x_n$ is selected, verifying the geometrical
condition.
For the $C^2$-topology, the distances $d(x_i, x_0)$ need
to remain greater than $\sqrt{d(x_n, x_0)/\varepsilon} \gg d(x_n, x_0)$.
This new difficulty is why the $C^2$-closing lemma
remains an open question.
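
The geometric condition behind the $C^1$ difficulty can be checked on a concrete orbit segment. The sketch below assumes the condition reads "every intermediate point stays farther than $C\,d(x_n, x_0)/\varepsilon$ from $x_0$" and tests it on an irrational rotation of the circle, for which $C = 1$; the rotation angle, the segment length, and the function names are illustrative assumptions. It only checks the inequality and is not Pugh's argument.

```python
import math

ALPHA = (math.sqrt(5) - 1) / 2  # irrational angle, assumed for illustration


def circle_dist(a, b):
    """Distance between two points of the circle R/Z."""
    d = abs(a - b) % 1.0
    return min(d, 1.0 - d)


def rotation(x):
    return (x + ALPHA) % 1.0


def spacing_ok(points, C, eps):
    """True if x_1, ..., x_{n-1} all stay farther than C * d(x_n, x_0) / eps from x_0."""
    gap = circle_dist(points[-1], points[0])
    threshold = C * gap / eps
    return all(circle_dist(p, points[0]) > threshold for p in points[1:-1])


# Orbit segment x_0, ..., x_13; x_13 is a close return to x_0 for this angle.
segment = [0.0]
for _ in range(13):
    segment.append(rotation(segment[-1]))

print(spacing_ok(segment, C=1.0, eps=0.7))
print(spacing_ok(segment, C=1.0, eps=0.5))
```

For this rotation, earlier close returns are only moderately farther from $x_0$ than the final one, so the condition fails once $C/\varepsilon$ is even modestly large, in line with the remark that such a segment need not exist.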

Pugh's argument does not suffice to create a
homoclinic point for a periodic orbit whose unstable
manifold accumulates on the stable one. In 1998,
Hayashi solved this problem proving the


Connecting lemma (Hayashi 1997) Let $y$ and $z$ be
two points such that the forward orbit of $y$ and the
backward orbit of $z$ accumulate on the same
nonperiodic point $x$. Fix some $\varepsilon > 0$. There is $N > 0$
and an $\varepsilon$-$C^1$-perturbation $G$ of $F$ such that $G^n(y) = z$
for some $n > 0$, and $G = F$ out of an arbitrarily small
neighborhood of $\{x, F(x), \ldots, F^N(x)\}$.


Using Hayashi's arguments, we (with Crovisier)
proved the following lemma:


Connecting lemma for pseudo-orbits (Bonatti and
Crovisier 2004) Assume that all periodic orbits of $F$
are hyperbolic; consider $x, y \in M$ such that, for any
$\varepsilon > 0$, there are $\varepsilon$-pseudo-orbits joining $x$ to $y$; then
there are arbitrarily small $C^1$-perturbations of $F$ for
which the positive orbit of $x$ passes through $y$.


Densities of Periodic Orbits 


As a consequence of the perturbation lemmas above,
we (Bonatti and Crovisier 2004) proved that for
$F$ $C^1$-generic,


$$\mathcal{R}(F) = \Omega(F) = \overline{\mathrm{Per}_{\mathrm{hyp}}(F)}$$


where $\overline{\mathrm{Per}_{\mathrm{hyp}}(F)}$ denotes the closure of the set of
hyperbolic periodic points.

For this, consider the map $\Psi: F \mapsto \Psi(F) = \overline{\mathrm{Per}_{\mathrm{hyp}}(F)}$
defined on $\mathrm{Diff}^1(M)$ and with values in $\mathcal{K}(M)$, the space
of all compact subsets of $M$, endowed with the
Hausdorff topology. $\overline{\mathrm{Per}_{\mathrm{hyp}}(F)}$ may be approximated
by a finite set of hyperbolic periodic points, and this
set varies continuously with $F$; so $\overline{\mathrm{Per}_{\mathrm{hyp}}(F)}$ varies
lower-semicontinuously with $F$: for $G$ very close to
$F$, $\overline{\mathrm{Per}_{\mathrm{hyp}}(G)}$ cannot be very much smaller than
$\overline{\mathrm{Per}_{\mathrm{hyp}}(F)}$. As a consequence, a result from general
topology asserts that, for $C^1$-generic $F$, the map $\Psi$ is
continuous at $F$. On the other hand, $C^1$-generic
diffeomorphisms are Kupka-Smale, so that the
connecting lemma for pseudo-orbits may apply:
if $x \in \mathcal{R}(F)$, $x$ can be turned into a hyperbolic
periodic point by a $C^1$-small perturbation of $F$. So,
if $x \notin \overline{\mathrm{Per}_{\mathrm{hyp}}(F)}$, $F$ is not a continuity point of $\Psi$,
leading to a contradiction.

Furthermore, Crovisier proved the following
result: “for $C^1$-generic diffeomorphisms, each chain
recurrence class is the limit, for the Hausdorff
distance, of a sequence of periodic orbits.”
This good approximation of the global dynamics
by the periodic orbits will now allow us to
better understand the chain recurrence classes of
$C^1$-generic diffeomorphisms.
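
The Hausdorff distance used above is straightforward to compute for finite sets, such as finite pieces of periodic orbits. The following is a minimal sketch; the two point sets on the real line are made-up data chosen only for illustration, and the function names are assumptions.

```python
def directed_distance(X, Y, dist):
    """Largest distance from a point of X to the set Y."""
    return max(min(dist(x, y) for y in Y) for x in X)


def hausdorff_distance(A, B, dist):
    """Hausdorff distance between two finite nonempty sets."""
    return max(directed_distance(A, B, dist), directed_distance(B, A, dist))


def line_dist(a, b):
    return abs(a - b)


small_set = [0.0, 0.5]                 # hypothetical finite set
large_set = [0.01, 0.26, 0.51, 0.76]   # contains points near small_set, plus extra points

print(directed_distance(small_set, large_set, line_dist))
print(directed_distance(large_set, small_set, line_dist))
print(hausdorff_distance(small_set, large_set, line_dist))
```

The first directed distance is small while the second is not: the larger set nearly contains the smaller one without being close to it. Lower semicontinuity controls only the first kind of quantity, which is why a separate genericity argument is needed to obtain continuity.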


Chain Recurrence Classes/Homoclinic Classes
of $C^1$-Generic Systems


Transverse intersections of invariant manifolds of
hyperbolic orbits are robust and vary locally
continuously with the diffeomorphism $F$. So, the
homoclinic class $H(x)$ of a periodic point $x$ varies
lower-semicontinuously with $F$ (on the open set
where the continuation of $x$ is defined). As a
consequence, for $C^r$-generic diffeomorphisms ($r \ge 1$),
each homoclinic class varies continuously with $F$.
Using the connecting lemma, Arnaud (2001) proved
the following result: “for Kupka-Smale diffeomorphisms,
if the closures $\overline{W^u_{\mathrm{orb}}(x)}$ and $\overline{W^s_{\mathrm{orb}}(x)}$
have some intersection point $z$, then a $C^1$-perturbation
of $F$ creates a transverse intersection of $W^u_{\mathrm{orb}}(x)$
and $W^s_{\mathrm{orb}}(x)$ at $z$.” So, if $z \notin H(x)$, then $F$ is not a
continuity point of the function $F \mapsto H(x, F)$. Hence,
for $C^1$-generic diffeomorphisms $F$ and for every
periodic point $x$,


$$H(x) = \overline{W^s_{\mathrm{orb}}(x)} \cap \overline{W^u_{\mathrm{orb}}(x)}$$


In the same way, $\overline{W^u_{\mathrm{orb}}(x)}$ and $\overline{W^s_{\mathrm{orb}}(x)}$ vary locally
lower-semicontinuously with $F$ so that, for $F$
$C^1$-generic, the closures of the invariant manifolds
of each periodic point vary locally continuously. For
Kupka-Smale diffeomorphisms, the connecting
lemma for pseudo-orbits implies: “if $z$ is a point in
the chain recurrence class of a periodic point $x$, then
a $C^1$-small perturbation of $F$ puts $z$ on the unstable
manifold of $x$”; so, if $z \notin \overline{W^u_{\mathrm{orb}}(x)}$, then $F$ is not a
continuity point of the function $F \mapsto \overline{W^u_{\mathrm{orb}}(x, F)}$.
Hence, for $C^1$-generic diffeomorphisms $F$ and for
every periodic point $x$, the chain recurrence class of
$x$ is contained in $\overline{W^u_{\mathrm{orb}}(x)} \cap \overline{W^s_{\mathrm{orb}}(x)}$, and, therefore,
coincides with the homoclinic class of $x$. This
argument proves:


For a $C^1$-generic diffeomorphism $F$, each homoclinic class
$H(x)$ is a chain recurrence class of $F$ (of Conley’s theory):
a chain recurrence class containing a periodic point $x$
coincides with the homoclinic class $H(x)$. In particular,
two homoclinic classes are either disjoint or equal.


Tame and Wild Systems 


For generic diffeomorphisms, the number
$N(F) \in \mathbb{N} \cup \{\infty\}$ of homoclinic classes varies
lower-semicontinuously with $F$. One deduces that $N(F)$ is locally
constant on a residual subset of $\mathrm{Diff}^1(M)$ (Abdenur
2003).


A local version (in the neighborhood of a chain
recurrence class) of this argument shows that, for
$C^1$-generic diffeomorphisms, any isolated chain
recurrence class $C$ is robustly isolated: for any
diffeomorphism $G$, $C^1$-close enough to $F$, the
intersection of $\mathcal{R}(G)$ with a small neighborhood of
$C$ is a unique chain recurrence class $C_G$ close to $C$.

One says that a diffeomorphism is “tame” if each
chain recurrence class is robustly isolated. We
denote by $\mathcal{T}(M) \subset \mathrm{Diff}^1(M)$ the ($C^1$-open) set of
tame diffeomorphisms and by $\mathcal{W}(M)$ the complement
of the closure of $\mathcal{T}(M)$. $C^1$-generic diffeomorphisms
in $\mathcal{W}(M)$ have infinitely many disjoint
homoclinic classes, and are called “wild”
diffeomorphisms.

Generic tame diffeomorphisms have a global
dynamics analogous to hyperbolic systems: the
chain recurrence set admits a partition into finitely
many homoclinic classes, varying continuously with
the dynamics. Every point belongs to the stable set
of one of these classes. Some of the homoclinic
classes are (transitive) topological attractors, and the
union of the basins covers a dense open subset of $M$,
and the basins vary continuously with $F$ (Carballo
and Morales 2003). It remains to get a good description
of the dynamics in the homoclinic classes, and
particularly in the attractors. As we shall see in the
next section, tame behavior requires some kind of
weak hyperbolicity. Indeed, in dimension 2, tame
diffeomorphisms satisfy axiom A and the noncycle
condition.

As of now, very little is known about wild
systems. One knows some semilocal mechanisms
generating locally $C^1$-generic wild dynamics, therefore
proving their existence on any manifold with
dimension $\dim(M) \ge 3$ (the existence of wild diffeomorphisms
in dimension 2, for the $C^1$-topology,
remains an open problem). Some of the known
examples exhibit a universal dynamics: they admit
infinitely many disjoint periodic disks such that, up
to renormalization, the return maps on these disks
induce a dense subset of diffeomorphisms of the
disk. Hence, these locally generic diffeomorphisms
present infinitely many times any robust property of
diffeomorphisms of the disk.


Ergodic Properties 


A point $x$ is well closable if, for any $\varepsilon > 0$, there is
$G$ $\varepsilon$-$C^1$-close to $F$ such that $x$ is periodic for $G$ and
$d(F^i(x), G^i(x)) < \varepsilon$ for $i \in \{0,\dots,p\}$, $p$ being the
period of $x$. As an important refinement of Pugh's
closing lemma, Mañé proved the following lemma:


Ergodic closing lemma For any $F$-invariant probability,
almost every point is well closable.

As a consequence, “for $C^1$-generic diffeomorphisms,
any ergodic measure $\mu$ is the weak limit of a
sequence of Dirac measures on periodic orbits,
which converges also in the Hausdorff distance to
the support of $\mu$.”

It remains an open problem to know if, for
$C^1$-generic diffeomorphisms, the ergodic measures
supported in a homoclinic class are approached by
periodic orbits in this homoclinic class.


Conservative Systems 


The connecting lemma for pseudo-orbits has been
adapted for volume-preserving and symplectic
diffeomorphisms, replacing the condition on the
periodic orbits by another generic condition on the
eigenvalues. As a consequence, one gets: “$C^1$-generic
volume-preserving or symplectic diffeomorphisms
are transitive, and $M$ is a unique homoclinic class.”

Notice that the KAM theory implies that this
result is wrong for $C^4$-generic diffeomorphisms, since
the persistence of invariant tori allows transitivity
to be broken robustly.

The Oxtoby-Ulam (1941) theorem asserts that
$C^0$-generic volume-preserving homeomorphisms are
ergodic. The ergodicity of $C^1$-generic volume-preserving
diffeomorphisms remains an open question.


Hyperbolic Properties of $C^1$-Generic
Diffeomorphisms


For a more detailed exposition of hyperbolic properties
of $C^1$-generic diffeomorphisms, the reader is
referred to Bonatti et al. (2004, chapter 7 and
appendix B).


Perturbations of Products of Matrices 


The $C^1$-topology enables us to do small perturbations of
the differential $DF$ at a point $x$ without perturbing either
$F(x)$ or $F$ out of an arbitrarily small neighborhood of $x$.
Hence, one can perturb the differential of $F$ along a
periodic orbit, without changing this periodic orbit
(Franks' lemma). When $x$ is a periodic point of period $n$,
the differential of $F^n$ at $x$ is fundamental for knowing the
local behavior of the dynamics. This differential is (up to
a choice of local coordinates) a product of the matrices
$DF(x_i)$, where $x_i = F^i(x)$. So, the control of the
dynamical effect of local perturbations along a periodic
orbit comes from a problem of linear algebra: “consider
a product $A = A_n \circ A_{n-1} \circ \cdots \circ A_1$ of $n > 0$ bounded
linear isomorphisms of $\mathbb{R}^d$; how do the eigenvalues and
the eigenspaces of $A$ vary under small perturbations of
the $A_i$?”

A partial answer to this general problem uses
the notion of dominated splitting. Let $X \subset M$ be an
$F$-invariant set such that the tangent space of $M$ at
the points $x \in X$ admits a $DF$-invariant splitting
$T_x M = E_1(x) \oplus \cdots \oplus E_k(x)$, the dimensions $\dim(E_i(x))$
being independent of $x$. This splitting is dominated if
the vectors in $E_{i+1}$ are uniformly more expanded than
the vectors in $E_i$: there exists $\ell > 0$ such that, for
any $x \in X$, any $i \in \{1,\dots,k-1\}$ and any unit vectors
$u \in E_i(x)$ and $v \in E_{i+1}(x)$, one has


$$\|DF^\ell(u)\| < \tfrac{1}{2}\,\|DF^\ell(v)\|$$


Dominated splittings are always continuous,
extend to the closure of $X$, and persist and vary
continuously under $C^1$-perturbation of $F$.
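
The linear-algebra question raised above can be explored numerically in the simplest possible setting. The following is a minimal sketch and is not taken from the text: the 2x2 matrices, the choice of a small rotation as the perturbation, and all function names are illustrative assumptions. It composes 50 factors, each perturbed by the same small rotation. When every factor is the identity (no domination), the rotations accumulate and the product acquires complex eigenvalues; when every factor is diag(2, 1/2), whose coordinate axes form a dominated splitting, the eigenvalues of the product stay real.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )

def rotation(angle):
    """Rotation of the plane by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return ((c, -s), (s, c))

def perturbed_product(factor, n, angle):
    """Compose n copies of (rotation(angle) * factor)."""
    step = matmul(rotation(angle), factor)
    prod = ((1.0, 0.0), (0.0, 1.0))
    for _ in range(n):
        prod = matmul(step, prod)
    return prod

def discriminant(m):
    """trace^2 - 4 det: negative exactly when the eigenvalues are non-real."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return tr * tr - 4.0 * det

# Illustrative parameters (assumptions, not from the text).
N_FACTORS = 50
EPS = math.pi / (2 * N_FACTORS)      # size of each individual perturbation
IDENTITY = ((1.0, 0.0), (0.0, 1.0))  # no domination between the two axes
HYPERBOLIC = ((2.0, 0.0), (0.0, 0.5))  # axes form a dominated splitting

# Negative discriminant: the small rotations add up to a quarter turn.
disc_identity = discriminant(perturbed_product(IDENTITY, N_FACTORS, EPS))
# Positive discriminant: the eigenvalues remain real.
disc_hyperbolic = discriminant(perturbed_product(HYPERBOLIC, N_FACTORS, EPS))
```

The design choice of using the same rotation for every factor keeps the example short; it is only one of many possible small perturbations.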


Dominated Splittings versus Wild Behavior 


Let $\{\gamma_i\}$ be a set of hyperbolic periodic orbits. On
$X = \bigcup \gamma_i$ one considers the natural splitting
$TM|_X = E^s \oplus E^u$ induced by the hyperbolicity of the
$\gamma_i$. Mañé (1982) proved: “if there is a $C^1$-neighborhood
of $F$ on which each $\gamma_i$ remains hyperbolic, then
the splitting $TM|_X = E^s \oplus E^u$ is dominated.”

A generalization of Mañé's result shows: “if a
homoclinic class $H(x)$ has no dominated splitting,
then for any $\varepsilon > 0$ there is a periodic orbit $\gamma$ in $H(x)$
whose derivative at the period can be turned into a
homothety, by an $\varepsilon$-small perturbation of the
derivative of $F$ along the points of $\gamma$”; in particular,
this periodic orbit can be turned into a sink or a
source. As a consequence, one gets: “for $C^1$-generic
diffeomorphisms $F$, any homoclinic class either has a
dominated splitting, or is contained in the closure of
the (infinite) set of sinks and sources.”

This argument has been used in two directions: 


• Tame systems must satisfy some hyperbolicity. In
fact, using the ergodic closing lemma, one proves
that the homoclinic classes $H(x)$ of tame
diffeomorphisms are volume hyperbolic, that is, there is
a dominated splitting $TM = E_1 \oplus \cdots \oplus E_k$ over
$H(x)$ such that $DF$ contracts uniformly the
volume in $E_1$ and expands uniformly the volume
in $E_k$.

• If $F$ admits a homoclinic class $H(x)$ which is
robustly without dominated splittings, then generic
diffeomorphisms in the neighborhood of $F$ are
wild: at present, this is the only known way to
get wild systems.


See also: Cellular Automata; Chaos and Attractors; 
Fractal Dimensions in Dynamics; Homeomorphisms and 
Diffeomorphisms of the Circle; Homoclinic Phenomena; 
Hyperbolic Dynamical Systems; Lyapunov Exponents 
and Strange Attractors; Polygonal Billiards; Singularity 
and Bifurcation Theory; Synchronization of Chaos.


Geometric analysis can be said to originate in the
nineteenth century work of Weierstrass, Riemann, 
Schwarz, and others on minimal surfaces, a problem 
whose history can be traced at least as far back as 
the work of Meusnier and Lagrange in the eight- 
eenth century. The experiments performed by 
Plateau in the mid-19th century, on soap films 
spanning wire contours, served as an important 
inspiration for this work, and led to the formulation 
of the Plateau problem, which concerns the existence
and regularity of area-minimizing surfaces in
$\mathbb{R}^3$ spanning a given boundary contour. The Plateau
problem for area-minimizing disks spanning a curve
in $\mathbb{R}^3$ was solved by J Douglas (who shared the first
Fields medal with Lars V Ahlfors) and T Radó in the
1930s. Generalizations of Plateau’s problem have
been an important driving force behind the devel- 
opment of modern geometric analysis. Geometric 
analysis can be viewed broadly as the study of 
partial differential equations arising in geometry, 
and includes many areas of the calculus of varia- 
tions, as well as the theory of geometric evolution 
equations. The Einstein equation, which is the 
central object of general relativity, is one of the 
most widely studied geometric partial differential 
equations, and plays an important role in its 
Riemannian as well as in its Lorentzian form, the 
Lorentzian being most relevant for general relativity. 


The Einstein equation is the Euler-Lagrange 
equation of a Lagrangian with gauge symmetry 
and thus in the Lorentzian case it, like the Yang- 
Mills equation, can be viewed as a system of 
evolution equations with constraints. After imposing 
suitable gauge conditions, the Einstein equation
becomes a hyperbolic system; in particular, using
spacetime harmonic coordinates (also known as
wave coordinates), the Einstein equation becomes a
quasilinear system of wave equations. The con- 
straint equations implied by the Einstein equations 
can be viewed as a system of elliptic equations in 
terms of suitably chosen variables. Thus, the 
Einstein equation leads to both elliptic and hyper- 
bolic problems, arising from the constraint equa- 
tions and the Cauchy problem, respectively. The 
groundwork for the mathematical study of the 
Einstein equation and the global nature of space- 
times was laid by, among others, Choquet-Bruhat, 
who proved local well-posedness for the Cauchy 
problem, Lichnerowicz, and later York who pro- 
vided the basic ideas for the analysis of the 
constraint equations, and Leray who formalized the 
notion of global hyperbolicity, which is essential for 
the global study of spacetimes. An important frame- 
work for the mathematical study of the Einstein 
equations has been provided by the singularity 
theorems of Penrose and Hawking, as well as the 
cosmic censorship conjectures of Penrose. 

Techniques and ideas from geometric analysis 
have played, and continue to play, a central role in 
recent mathematical progress on the problems posed 
by general relativity. Among the main results are the 


proof of the positive mass theorem using the 
minimal surface technique of Schoen and Yau, and 
the spinor-based approach of Witten, as well as the 
proofs of the (Riemannian) Penrose inequality by
Huisken and Ilmanen, and Bray. The proof of the
Yamabe theorem by Schoen has played an important 
role as a basis for constructing Cauchy data using 
the conformal method. 

The results just mentioned are all essentially 
Riemannian in nature, and do not involve study of 
the Cauchy problem for the Einstein equations. 
There has been great progress recently concerning 
global results on the Cauchy problem for the 
Einstein equations, and the cosmic censorship con- 
jectures of Penrose. The results available so far are 
either small data results (among these the nonlinear 
stability of Minkowski space proved by Christodoulou 
and Klainerman) or assume additional symmetries, 
such as the recent proof by Ringström of strong
cosmic censorship for the class of Gowdy space- 
times. However, recent progress concerning quasi- 
linear wave equations and the geometry of 
spacetimes with low regularity due to, among 
others, Klainerman and Rodnianski, and Tataru 
and Smith, appears to show the way towards an 
improved understanding of the Cauchy problem for 
the Einstein equations. 

Since the constraint equations, the Penrose 
inequality and the Cauchy problem are discussed 
in separate articles, the focus of this article will be 
on the role in general relativity of “critical” and 
other geometrically defined submanifolds and folia- 
tions, such as minimal surfaces, marginally trapped 
surfaces, constant mean curvature hypersurfaces 
and null hypersurfaces. In this context it would be 
natural also to discuss geometrically defined flows 
such as mean curvature flows, inverse mean 
curvature flow, and Ricci flow. However, this 
article restricts the discussion to mean curvature 
flows, since the inverse mean curvature flow 
appears naturally in the context of the Penrose 
inequality and the Ricci flow has so far mainly 
served as a source of inspiration for research on the 
Einstein equations rather than an important tool. 
Other topics which would fit well under the 
heading “General relativity and geometric analysis” 
are spin geometry (the Witten proof of the Positive 
mass theorem), the Yamabe theorem and related 
results concerning the Einstein constraint equa- 
tions, gluing and other techniques of “spacetime 
engineering." These are all discussed in other 
articles. Some techniques which have only recently 
come into use and for which applications in general 
relativity have not been much explored, such as 
Cheeger-Gromov compactness, are not discussed.


Minimal and Related Surfaces


Consider a hypersurface $N$ in Euclidean space $\mathbb{R}^n$
which is a graph $x^n = u(x^1,\dots,x^{n-1})$ with respect to
the function $u$. The area of $N$ is given by


$$\mathcal{A}(N) = \int \sqrt{1+|Du|^2}\,dx^1\cdots dx^{n-1}.$$


$N$ is stationary with respect to $\mathcal{A}$ if $u$ satisfies
the equation


$$\sum_i D_i\left(\frac{D_i u}{\sqrt{1+|Du|^2}}\right) = 0 \qquad [1]$$

A hypersurface N defined as a graph of u solving 
[1] minimizes area with respect to compactly sup- 
ported deformations, and hence is called a minimal 
surface. For $n \le 7$, a solution to eqn [1] defined on
all of $\mathbb{R}^{n-1}$ must be an affine function. This fact is
known as a Bernstein principle. Equation [1], and 
more generally, the prescribed mean curvature 
equation which will be discussed below, is a quasi- 
linear, uniformly elliptic second-order equation. The 
book by Gilbarg and Trudinger (1983) is an excellent 
general reference for such equations. 
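
As a brief sketch of why eqn [1] is the stationarity condition for the area, here is the standard first-variation computation. It assumes that $u$ is smooth and that the test function $\varphi$ is compactly supported; the symbols $\varphi$ and $t$ are introduced only for this illustration, and summation over $i$ is understood.

```latex
\begin{aligned}
\frac{d}{dt}\Big|_{t=0}\int \sqrt{1+|D(u+t\varphi)|^2}\,dx
  &= \int \frac{D_i u\, D_i\varphi}{\sqrt{1+|Du|^2}}\,dx \\
  &= -\int \varphi\, D_i\!\left(\frac{D_i u}{\sqrt{1+|Du|^2}}\right)dx .
\end{aligned}
```

The last expression vanishes for every such $\varphi$ exactly when eqn [1] holds.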

The theory of rectifiable currents, developed by 
Federer and Fleming, is a basic tool in the modern 
approach to the Plateau problem and related varia- 
tional problems. A rectifiable current is a countable 
union of Lipschitz submanifolds, counted with integer 
multiplicity, and satisfying certain regularity condi- 
tions. Hausdorff measure gives a notion of area for 
these objects. One may therefore approach the study of 
minimal surfaces via rectifiable currents which are 
stationary with respect to variations of area. Suitable 
generalizations of familiar notions from smooth 
differential geometry such as tangent plane, normal 
vector, extrinsic curvature can be introduced. The 
book by Federer (1969) is a classic treatise on the 
subject. Further information concerning minimal sur- 
faces and related variational problems can be found in 
Lawson, Jr. (1980) and Simon (1997). Note, however, 
that unless otherwise stated, all fields and manifolds 
considered in this article are assumed to be smooth. For 
the Plateau problem in a Riemannian ambient space, 
we have the following existence and regularity result. 


Theorem 1 (Existence of embedded solutions for
Plateau problem). Let $M$ be a complete Riemannian
manifold of dimension $n \le 7$ and let $\Gamma$ be a compact
$(n-2)$-dimensional submanifold in $M$ which bounds.
Then there is an $(n-1)$-dimensional area-minimizing
hypersurface $N$ with $\Gamma$ as its boundary. $N$ is a smooth,
embedded manifold in its interior.


If the dimension of the ambient space is $> 7$,
solutions to the Plateau problem will in general have
a singular set of dimension $n - 8$. Let $N$ be an
oriented hypersurface of a Riemannian manifold $M$
with covariant derivative $D$. Let $\eta$ be the unit
normal of $N$ and define the second fundamental
form and mean curvature of $N$ by $A_{ij} = \langle D_{e_i}\eta, e_j\rangle$
and $H = \mathrm{tr} A$. Define the action functional
$\mathcal{E}(N) = \mathcal{A}(N) - \int_{M;N} H_0$, where $H_0$ is a function
defined on $M$, and $\int_{M;N}$ denotes the integral over
the volume bounded by $N$ in $M$. The problem of
minimizing $\mathcal{E}$ is a useful generalization of the
minimization problem for $\mathcal{A}$.


Theorem 2 (Existence of minimizers in homology).
Let $M$ be a compact Riemannian manifold of
dimension $\le 7$, and let $\alpha$ be an integral homology
class on $M$ of codimension 1. Then there is a smooth
minimizer for $\mathcal{E}$ representing $\alpha$.


Again, in higher dimensions, the minimizers will 
in general have singularities. The general form of 
this result deals with elliptic functionals. For 
surfaces in 3-manifolds, the problem of minimizing 
area within homotopy classes has been studied. 
Results in this direction played a central role in the 
approach of Schoen and Yau to manifolds with non- 
negative scalar curvature. 

If M is not compact, it is in general necessary to use 
barriers to control the minimizers, or consider some 
version of the Plateau problem. Barriers can be used due 
to the strong maximum principle, which holds for the 
mean curvature operator since it is quasilinear elliptic. 
Consider two hypersurfaces $N_1$, $N_2$ which intersect at a
point $p$ and assume that $N_1$ lies on one side of $N_2$ with
the normal pointing towards $N_1$. If the mean curvatures
$H_1$, $H_2$ of the hypersurfaces, defined with respect to
consistently oriented normals, satisfy $H_1 \le \lambda \le H_2$ for
some constant $\lambda$, then $N_1$ and $N_2$ coincide near $p$ and
have mean curvatures equal to $\lambda$. This result requires
only mild regularity conditions on the hypersurfaces. 
Generalizations hold also for the case of spacelike or 
null hypersurfaces in a Lorentzian ambient space, see 
Andersson et al. (1998) and Galloway (2000). 

Let $\phi$ be a smooth compactly supported function
on $N$. The variation $\mathcal{E}' = \delta_{\phi\eta}\mathcal{E}$ of $\mathcal{E}$ under a
deformation $\phi\eta$ is


$$\mathcal{E}' = \int_N \phi\,(H - H_0)$$


Thus, $N$ is stationary with respect to $\mathcal{E}$ if and only if
$N$ solves the prescribed mean curvature equation
$H(x) = H_0(x)$ for $x \in N$. Supposing that $N$ is
stationary and $H_0$ is constant, the second variation
$\mathcal{E}'' = \delta_{\phi\eta}\mathcal{E}'$ of $\mathcal{E}$ is of the form


$$\mathcal{E}'' = \int_N \phi\,J\phi$$


where $J$ is the second-variation operator, a second-order
elliptic operator. A calculation, using the
Gauss equation and the second-variation equation,
shows


$$J\phi = -\Delta_N\phi - \tfrac12\left[(\mathrm{Scal}_M - \mathrm{Scal}_N) + H^2 + |A|^2\right]\phi \qquad [2]$$


where $\Delta_N$, $\mathrm{Scal}_M$, $\mathrm{Scal}_N$ denote the Laplace-Beltrami
operator of $N$, and the scalar curvatures of $M$ and
$N$, respectively. If $J$ is positive semidefinite, $N$ is
called stable.
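
The zeroth-order term of [2] can be checked in a few lines. The sketch below assumes the classical form of the second-variation operator, $J\phi = -\Delta_N\phi - (|A|^2 + \mathrm{Ric}_M(\eta,\eta))\phi$, and the twice-traced Gauss equation for a hypersurface; both are standard facts quoted here as assumptions rather than taken from the text.

```latex
\begin{aligned}
\mathrm{Scal}_N &= \mathrm{Scal}_M - 2\,\mathrm{Ric}_M(\eta,\eta) + H^2 - |A|^2
  && \text{(Gauss equation, traced twice)} \\
\mathrm{Ric}_M(\eta,\eta) &= \tfrac12\left(\mathrm{Scal}_M - \mathrm{Scal}_N + H^2 - |A|^2\right) \\
|A|^2 + \mathrm{Ric}_M(\eta,\eta) &= \tfrac12\left[(\mathrm{Scal}_M - \mathrm{Scal}_N) + H^2 + |A|^2\right]
\end{aligned}
```

Substituting the last line into the assumed form of $J$ gives the expression in [2].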

To set the context where we will apply the
above, let $(M, g_{ij})$ be a connected, asymptotically
Euclidean three-dimensional Riemannian manifold
with covariant derivative $\nabla$, and let $K_{ij}$ be a symmetric
tensor on $M$. Suppose $(M, g_{ij}, K_{ij})$ is imbedded
isometrically as a spacelike hypersurface in a spacetime
$(V, \gamma_{\alpha\beta})$ with $g_{ij}, K_{ij}$ the first and second
fundamental forms induced on $M$ from $V$; in
particular, $K_{ij} = \langle D_{e_i}T, e_j\rangle$, where $T$ is the timelike
normal of $M$ in the ambient spacetime $V$, and $D$ is
the ambient covariant derivative. We will refer to
$(M, g_{ij}, K_{ij})$ as a Cauchy data set for the Einstein
equations. Although many of the results which will
be discussed below generalize to the case of a
nonzero cosmological constant $\Lambda$, we will discuss
only the case $\Lambda = 0$ in this article. Let
$G_{\alpha\beta} = \mathrm{Ric}_{V\alpha\beta} - (1/2)\mathrm{Scal}_V\gamma_{\alpha\beta}$ be the Einstein tensor of $V$,
and let $\rho = G_{\alpha\beta}T^\alpha T^\beta$, $\mu_j = G_{j\alpha}T^\alpha$. Then the fields
$(g_{ij}, K_{ij})$ satisfy the Einstein constraint equations


$$R + (\mathrm{tr}K)^2 - |K|^2 = 2\rho \qquad [3]$$

$$\nabla_j \mathrm{tr}K - \nabla^i K_{ij} = \mu_j \qquad [4]$$


We assume that the dominant energy condition
(DEC)


$$\rho \ge \Bigl(\sum_i \mu_i\mu^i\Bigr)^{1/2} \qquad [5]$$


holds. We will sometimes make use of the null
energy condition (NEC), $G_{\alpha\beta}L^\alpha L^\beta \ge 0$ for null
vectors $L$, and the strong energy condition (SEC),
$\mathrm{Ric}_{V\alpha\beta}v^\alpha v^\beta \ge 0$ for causal vectors $v$. $M$ will be
assumed to satisfy the fall-off conditions


$$g_{ij} = \Bigl(1 + \frac{2m}{r}\Bigr)\delta_{ij} + O(1/r^2) \qquad [6a]$$


$$K_{ij} = O(1/r^2) \qquad [6b]$$


as well as suitable conditions for the fall-off of derivatives
of $g_{ij}, K_{ij}$. Here $m$ is the ADM (Arnowitt, Deser,
Misner) mass of $(M, g_{ij}, K_{ij})$.
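
As an illustration of the fall-off condition [6a], one can check numerically that the spatial Schwarzschild metric in isotropic coordinates, whose conformal factor is $(1 + m/2r)^4$, has the assumed leading behaviour $1 + 2m/r$ with a remainder of order $1/r^2$. This is a minimal sketch: the choice of Schwarzschild as the example, the sample radii and all function names are assumptions made here for illustration, and the form $(1 + 2m/r)\delta_{ij}$ of the leading term in [6a] is taken as given.

```python
def conformal_factor(m, r):
    """Coefficient of delta_ij for spatial Schwarzschild in isotropic coordinates."""
    return (1.0 + m / (2.0 * r)) ** 4

def leading_term(m, r):
    """Assumed leading behaviour in the fall-off condition [6a]."""
    return 1.0 + 2.0 * m / r

def scaled_remainder(m, r):
    """r^2 times the difference; stays bounded if the remainder is O(1/r^2)."""
    return r * r * (conformal_factor(m, r) - leading_term(m, r))

# Illustrative mass and radii (assumptions, not from the text).
M = 1.0
remainders = [scaled_remainder(M, r) for r in (1e1, 1e2, 1e3)]
# The values stay bounded and approach 3*M**2/2 as r grows.
```

The bounded values of `remainders` reflect the binomial expansion $(1+x)^4 = 1 + 4x + 6x^2 + \cdots$ with $x = m/2r$.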


Minimal Surfaces and Positive Mass 


Perhaps the most important application of the theory 
of minimal surfaces in general relativity is in the 


Schoen-Yau proof of the positive-mass theorem,
which states that $m \ge 0$, and $m = 0$ only if $(M, g, K)$
can be embedded as a hypersurface in Minkowski
space. Consider an asymptotically Euclidean manifold 
(M,g) with g satisfying [6a] and with non-negative 
scalar curvature. By using Jang's equation, see below, 
the general situation is reduced to the case of a time 
symmetric data set, with K = 0. In this case, the DEC 
implies that (M, g) has non-negative scalar curvature. 

Assuming $m < 0$ one may, after applying a
conformal deformation, assume that $\mathrm{Scal}_M > 0$ in
the complement of a compact set. Due to the
asymptotic conditions, level sets for sufficiently
large values of one of the coordinate functions, say
$x^3$, can be used as barriers for minimal surfaces in
$M$. By solving a sequence of Plateau problems with
boundaries tending to infinity, a stable entire
minimal surface $N$ homeomorphic to the plane is
constructed. Stability implies, using [2],


$$\int_N \left(\tfrac12\mathrm{Scal}_M - \kappa + \tfrac12|A|^2\right) \le 0$$


where $\kappa = (1/2)\mathrm{Scal}_N$ is the Gauss curvature of $N$.
Since by construction $\mathrm{Scal}_M \ge 0$, and $\mathrm{Scal}_M > 0$ outside a
compact set, this gives $\int_N \kappa > 0$. Next, one uses the
identity, related to the Cohn-Vossen inequality,


$$\int_N \kappa = 2\pi - \lim_i \frac{L_i^2}{2A_i}$$


where $A_i$, $L_i$ are the area and circumference of a
sequence of large discs. Estimates using the fact
that $M$ is asymptotically Euclidean show that
$\lim_i (L_i^2/2A_i) \ge 2\pi$, which gives a contradiction and
shows that the minimal surface constructed cannot
exist. It follows that $m \ge 0$. It remains to show that the
case $m = 0$ is rigid. To do this, one proves that for an
asymptotically Euclidean metric with non-negative
scalar curvature, which is positive near infinity, there
is a conformally related metric with vanishing scalar
curvature and strictly smaller mass. Applying this
argument in case $m = 0$ gives a contradiction to the
fact that $m \ge 0$. Therefore, $m = 0$ only if the scalar
curvature vanishes identically. Suppose now that $(M, g)$
has vanishing scalar curvature but nonvanishing Ricci
curvature $\mathrm{Ric}_M$. Then using a deformation of $g$ in the
direction of $\mathrm{Ric}_M$, one constructs a metric close to $g$
with negative mass, which leads to a contradiction.

This technique generalizes to Cauchy surfaces of
dimension $n \le 7$. The proof involves induction on
dimension. For $n > 7$ minimal hypersurfaces are
singular in general and this approach runs into
problems. The Witten proof using spinor techniques
does not suffer from this limitation but instead
requires that $M$ be spin.


Marginally Trapped Surfaces


Consider a Cauchy data set $(M, g_{ij}, K_{ij})$ as above and
let $N$ be a compact surface in $M$ with normal $\eta$,
second fundamental form $A$ and mean curvature $H$.
Then considering $N$ as a surface in an ambient
Lorentzian space $V$ containing $M$, $N$ has two null
normal fields which after a rescaling can be taken to
be $L_\pm = T \pm \eta$. Here, $T$ is the future-directed timelike
unit normal of $M$ in $V$. The null mean
curvatures (or null expansions) corresponding to
$L_\pm$ can be defined in terms of the variation of the
area element $\mu_N$ of $N$ as $\delta_{L_\pm}\mu_N = \theta_\pm\mu_N$ or


$$\theta_\pm = \mathrm{tr}_N K \pm H$$


where $\mathrm{tr}_N K$ denotes the trace of the projection of
$K_{ij}$ to $N$. Suppose $L_+$ is the outgoing null normal.
$N$ is called outer trapped (marginally trapped,
untrapped) if $\theta_+ < 0$ ($\theta_+ = 0$, $\theta_+ > 0$). An asymptotically
flat spacetime which contains a trapped surface
with $\theta_- < 0$, $\theta_+ < 0$ is causally incomplete. In the
following we will for simplicity drop the word outer
from our terminology.

Consider a Cauchy surface $M$. The boundary of
the region in $M$ containing trapped surfaces is, if it is
sufficiently smooth, a marginally trapped surface.
The equation $\theta_+ = 0$ is an equation analogous to the
prescribed mean curvature equation; in particular, it is a
quasilinear elliptic equation of second order. Marginally
trapped surfaces are not variational in the
same sense as minimal surfaces. Nevertheless, they
are stationary with respect to variations of area
within the outgoing light cone. The second variation
of area along the outgoing null cone is given, in view
of the Raychaudhuri equation, by


$$\delta_{\phi L_+}\theta_+ = -(G_{++} + |\sigma_+|^2)\phi \qquad [7]$$


for a function $\phi$ on $N$. Here $G_{++} = G_{\alpha\beta}L_+^\alpha L_+^\beta$, and $\sigma_+$
denotes the shear of $N$ with respect to $L_+$, that is, the
tracefree part of the null second fundamental form
with respect to $L_+$. Equation [7] shows that the
stability operator in the direction $L_+$ is not elliptic.

In the case of time-symmetric data, $K_{ij} = 0$, the
DEC implies $\mathrm{Scal}_M \ge 0$ and marginally trapped
surfaces are simply minimal surfaces. A stable
compact minimal 2-surface $N$ in a 3-manifold $M$
with non-negative scalar curvature must satisfy


$$2\pi\chi(N) = \int_N \kappa \ge \tfrac12\int_N \left(\mathrm{Scal}_M + |A|^2\right) \ge 0$$


and hence by the Gauss-Bonnet theorem, $N$ is
diffeomorphic to a sphere or a torus. In case $N$ is a
stable minimal torus, the induced geometry is flat
and the ambient curvature vanishes at $N$. If, in
addition, $N$ minimizes, then $M$ is flat.

For a compact marginally trapped surface $N$ in $M$,
analogous results can be proved by studying the
stability operator defined with respect to the
direction $\eta$. Let $J$ be the operator defined in terms of a
variation of $\theta_+$ by $J\phi = \delta_{\phi\eta}\theta_+$. Then


]ó ^ — Ano + 2s" DAÓ 


+ t Scaly — s4s^ + Das” -3 一 G+ o 
Here, $s_A = -(1/2)\langle L_-, D_A L_+\rangle$ and $G_{+-}$ is the Einstein
tensor evaluated on $L_+, L_-$. We may call $N$ stable if
the real part of the spectrum of $J$ is non-negative. A
sufficient condition for $N$ to be stable is that $N$ is
locally outermost. This can be formulated, for
example, by requiring that a neighborhood of $N$ in $M$
contains no trapped surfaces exterior to $N$. In this case,
assuming that the DEC holds, $N$ is a sphere or a torus,
and if the real part of the spectrum of $J$ is positive then
$N$ is a sphere. If $N$ is a torus, then the ambient
curvature and shear vanish at $N$, $s_A$ is a gradient, and
$N$ is flat. One expects that in addition, global rigidity
should hold, in analogy with the minimal surface case.
This is an open problem. If $N$ satisfies the stronger
condition of strict stability, which corresponds to the
spectrum of $J$ having positive real part, then $N$ is in the
interior of a hypersurface $\mathcal{H}$ of the ambient spacetime,
with the property that it is foliated by marginally
trapped surfaces (Andersson et al. 2005). If the NEC
holds and $N$ has nonvanishing shear, then $\mathcal{H}$ is
spacelike at $N$. A hypersurface $\mathcal{H}$ with these properties
is known as a dynamical horizon.


Jang's Equation 


Consider a Cauchy data set $(M, g_{ij}, K_{ij})$. Extend $K_{ij}$
to a tensor field on $M \times \mathbb{R}$, constant in the vertical
direction. Then the equation for a graph


$$N = \{(x, t) : t = f(x)\}$$


such that $N$ has mean curvature equal to the trace of
the projection of $K_{ij}$ to $N$ with respect to the induced
metric on $N$, is given by


$$\sum_{i,j}\left(K^{ij} - \frac{\nabla^i\nabla^j f}{\sqrt{1+|\nabla f|^2}}\right)\left(g_{ij} - \frac{\nabla_i f\,\nabla_j f}{1+|\nabla f|^2}\right) = 0 \qquad [8]$$


an equation closely related to the equation
$\theta_+ = 0$. Equation [8] was introduced by P S Jang
(Jang 1978) as part of an attempt to generalize the 
inverse mean curvature flow method of Geroch from 
time-symmetric to general Cauchy data. 
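
As a consistency check, one can see how Jang's equation relates to the minimal surface equation when $K_{ij} = 0$. The sketch assumes the usual form of [8], namely $\sum_{i,j}(K^{ij} - \nabla^i\nabla^j f/W)(g_{ij} - \nabla_i f\,\nabla_j f/W^2) = 0$, with the abbreviation $W = \sqrt{1+|\nabla f|^2}$; the symbol $W$ is introduced here for illustration only.

```latex
\begin{aligned}
\nabla_i W &= \frac{\nabla^j f\,\nabla_i\nabla_j f}{W}, \\
\nabla_i\!\left(\frac{\nabla^i f}{W}\right)
  &= \frac{\Delta f}{W} - \frac{\nabla^i f\,\nabla^j f\,\nabla_i\nabla_j f}{W^3}
   = \left(g^{ij} - \frac{\nabla^i f\,\nabla^j f}{W^2}\right)\frac{\nabla_i\nabla_j f}{W}.
\end{aligned}
```

Hence for $K_{ij} = 0$ the assumed form of [8] reduces to $\nabla_i(\nabla^i f/W) = 0$, which for flat $M$ is the minimal surface equation [1] for the graph of $f$.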

Existence and regularity for Jang's equation were 
proved by Schoen and Yau (1981) and used to 


generalize their proof of the positive-mass theorem 
from the case of maximal slices to the general case. 
The solution to Jang's equation is constructed as the 
limit of the solution to a sequence of regularized 
problems. The limit consists of a collection $N$ of
submanifolds of $M \times \mathbb{R}$. In particular, the component
near infinity is a graph and has the same mass as $M$.
N may contain vertical components which project 
onto marginally trapped surfaces in M, and in fact 
these constitute the only possibilities for blow-up of 
the sequence of graphs used to construct $N$. If the
DEC is valid, the metric on $N$ has non-negative
scalar curvature in the weak sense that


$$\int_N \mathrm{Scal}_N\phi^2 + 2|\nabla\phi|^2 \ge 0$$


for smooth compactly supported functions $\phi$. If the
DEC holds strictly, the strict inequality holds and in
this case the metric on $N$ is conformal to a metric
with vanishing scalar curvature.

Jang's equation can be applied to prove existence
of marginally trapped surfaces, given barriers. Let
$(M, g_{ij}, K_{ij})$ be a Cauchy data set containing two
compact surfaces $N_1$, $N_2$ which together bound a
compact region $M'$ in $M$. Suppose that $\theta_+ < 0$ on $N_1$
and $\theta_+ > 0$ on $N_2$.
Schoen recently proved the following result.


Theorem 3 (Existence of marginally trapped
surfaces). Let $M'$, $N_1$, $N_2$ be as above. Then there is a
finite collection of compact, marginally trapped
surfaces $\{\Sigma_a\}$ contained in the interior of $M'$, such
that $\bigcup_a \Sigma_a$ is homologous to $N_1$. If the DEC holds,
then $\{\Sigma_a\}$ is a collection of spheres and tori.


The proof proceeds by solving a sequence of
Dirichlet boundary-value problems for Jang's equation
with boundary values on $N_1$, $N_2$ tending to $-\infty$ and $\infty$,
respectively. The assumption on $\theta_+$ is used to show the
existence of barriers for Jang's equation. Let $f_k$ be the
sequence of solutions to the Dirichlet problems. Jang's
equation is invariant under renormalization $f_k \to f_k + c_k$
for some sequence $c_k$ of real numbers. A Harnack
inequality for the gradient of the solutions to Jang's
equation is used to show that the sequence of solutions
$f_k$, possibly after a renormalization, has a subsequence
converging to a vertical submanifold of $M' \times \mathbb{R}$, which
projects to a collection $\{\Sigma_a\}$ of marginally trapped
surfaces. By construction, the zero sets of the $f_k$
are homologous to $N_1$ and $N_2$. The estimates on
the sequence $\{f_k\}$ show that this holds also in the limit
$k \to \infty$. The statement about the topology of the $\Sigma_a$
follows by showing, using the above-mentioned
inequality for $\mathrm{Scal}_N$, that if DEC holds, the total
Gauss curvature of each surface $\Sigma_a$ is non-negative.


Center of Mass


Since by the positive-mass theorem m > 0 unless the 
ambient spacetime is flat, it makes sense to consider 
the problem of finding an appropriate notion of 
center of mass. This problem was solved by Huisken 
and Yau who showed that under the asymptotic 
conditions [6] the isoperimetric problem has a unique 
solution if one considers sufficiently large spheres. 


Theorem 4 (Huisken and Yau 1996). There is an
$H_0 > 0$ and a compact region $B_{H_0}$ such that for each
$H \in (0, H_0)$ there is a unique constant mean curvature
sphere $S_H$ with mean curvature $H$ contained in
$M \setminus B_{H_0}$. The spheres form a foliation.


The proof involves a study of the evolution
equation


$$\frac{dx}{ds} = (\bar H - H)\eta \qquad [9]$$


where $\bar H$ is the average mean curvature. This is the
gradient flow for the isoperimetric problem of
minimizing area keeping the enclosed volume constant.
The solutions in Euclidean space are standard
spheres. Equation [9] defines a parabolic system; in
particular, we have


$$\frac{d}{ds}H = \Delta H + (\mathrm{Ric}(\eta,\eta) + |A|^2)(H - \bar H)$$
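
Two properties of the flow [9] can be read off from the first-variation formulas. The sketch below assumes the standard formulas $\frac{d}{ds}\mathrm{Vol} = \int_N f$ and $\frac{d}{ds}\mathcal{A}(N) = \int_N H f$ for a normal variation with speed $f$ along $\eta$, applied with $f = \bar H - H$ and $\bar H = \mathcal{A}(N)^{-1}\int_N H$; the symbol $f$ is introduced here only for illustration.

```latex
\begin{aligned}
\frac{d}{ds}\,\mathrm{Vol} &= \int_N (\bar H - H) = \bar H\,\mathcal{A}(N) - \int_N H = 0, \\
\frac{d}{ds}\,\mathcal{A}(N) &= \int_N H(\bar H - H)
   = -\int_N (H - \bar H)^2 \le 0 .
\end{aligned}
```

So, under these assumptions, the enclosed volume is preserved while the area is non-increasing, with equality only when $H$ is constant.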

It follows from the fall-off conditions [6] that the
spheres of the foliation constructed in Theorem 4 are
untrapped surfaces. They can therefore be used as
outer barriers in the existence result for marginally
trapped surfaces (Theorem 3).

The mean curvature flow for a spatial hypersur- 
face in a Lorentz manifold is also parabolic. This 
flow has been applied to construct constant mean 
curvature Cauchy hypersurfaces in spacetimes. 


Maximal and Related Surfaces 


Let $N$ be the hypersurface $x_0 = u(x_1,\dots,x_n)$ in
Minkowski space $\mathbb{R}^{1+n}$ with line element
$-dx_0^2 + dx_1^2 + \cdots + dx_n^2$. Assume $|\nabla u| < 1$ so that $N$ is
spacelike. Then $N$ is stationary with respect to
variations of area if $u$ solves the equation


$$\sum_i \nabla_i\left(\frac{\nabla_i u}{\sqrt{1-|\nabla u|^2}}\right) = 0 \qquad [10]$$


$N$ maximizes area with respect to compactly
supported variations, and hence is called a maximal
surface. As in the case of the minimal surface
equation, eqn [10], and more generally the
Lorentzian prescribed mean curvature equation, is
quasilinear elliptic, but it is not uniformly elliptic,
which makes the regularity theory more subtle.

A Bernstein principle analogous to the one for the
minimal surface equation holds for the maximal
surface equation [10]. Suppose that $u$ is a solution to
[10] which is defined on all of $\mathbb{R}^n$. Then $u$ is an
affine function (Cheng and Yau 1976). An important
tool used in the proof is a Bochner type identity,
originally due to Calabi, for the norm of the second
fundamental form. For a hypersurface in a flat
ambient space, the Codazzi equation states
$\nabla_i A_{jk} - \nabla_j A_{ik} = 0$. This gives the identity


$$\Delta A_{ij} = \nabla_i\nabla_j H + A_{km}R^{m\ k}{}_{i\ j} + A_{mi}\mathrm{Ric}^{m}{}_{j} \qquad [11]$$


The curvature terms can be rewritten in terms of $A_{ij}$
if the ambient space is flat. Using [11] to compute
$\Delta|A|^2$ gives an expression which is quadratic in $\nabla A$,
and fourth order in $|A|$, and which allows one to
perform maximum principle estimates on $|A|$.
Generalizations of this technique for hypersurfaces in
general ambient spaces play an important role in the
proof of regularity of minimal surfaces, and in the
proof of existence for Jang’s equation as well as in
the analysis of the mean curvature flow used to
prove existence of round spheres. The generalization
of eqn [11] is known as a Simons identity.

For maximal hypersurfaces of Minkowski space,
further maximum principle estimates show that
such a hypersurface is convex; in particular, it has
nonpositive Ricci curvature. Generalizations of this
technique allow one to analyze entire constant mean
curvature hypersurfaces of Minkowski space.

Consider a globally hyperbolic Lorentzian manifold
$(V, \gamma)$. A $C^0$ hypersurface is said to be weakly
spacelike if timelike curves intersect it in at most one
point. Call a codimension-2 submanifold $\Gamma \subset V$ a
weakly spacelike boundary if it bounds a weakly
spacelike hypersurface $N_0$.


Theorem 5 (Existence for Plateau problem for
maximal surfaces (Bartnik 1988)). Let V be a
globally hyperbolic spacetime and assume that the
causal structure of V is such that the domain of
dependence of any compact domain in V is compact.
Given a weakly spacelike boundary $\Gamma$ in V, there is a
weakly spacelike maximal hypersurface N with $\Gamma$ as
its boundary. N is smooth except possibly on null
geodesics connecting points of $\Gamma$.


Here, maximal hypersurface is understood in a 
weak sense, referring to stationarity with respect to 
variations. Due to the nonuniform ellipticity for the 
maximal surface equation, the interior regularity
which holds for minimal surfaces fails to hold in
general for the maximal surface equation. 

A time-oriented spacetime is said to have a crushing
singularity to the past (future) if there is a sequence $\Sigma_n$
of Cauchy surfaces so that the mean curvature
function $H_n$ of $\Sigma_n$ diverges uniformly to $-\infty$ ($\infty$).


Theorem 6 (Gerhardt 1983). Suppose that $(V, \gamma)$ is
globally hyperbolic with compact Cauchy surfaces
and satisfies the SEC. Then if $(V, \gamma)$ has crushing
singularities to the past and future it is globally
foliated by constant mean curvature hypersurfaces.
The mean curvature $\tau$ of these Cauchy surfaces is a
global time function.


The proof involves an application of results from
geometric measure theory to an action $\mathcal{E}$ of the form
discussed earlier. A barrier argument is used to control
the maximizers. Bartnik (1984, theorem 4.1) gave a
direct proof of existence of a constant mean curvature
(CMC) hypersurface, given barriers. If the spacetime
$(V, \gamma)$ is symmetric, so that a compact Lie group acts
on V by isometries, then CMC hypersurfaces in V
inherit the symmetry. Theorem 6 gives a condition
under which a spacetime is globally foliated by CMC
hypersurfaces. In general, if the SEC holds in a
spatially compact spacetime, then for each $\tau \neq 0$,
there is at most one constant mean curvature Cauchy
surface with mean curvature $\tau$. In case V is vacuum,
$\mathrm{Ric}_\gamma = 0$, and (3+1)-dimensional, then each point $x \in V$
is on at most one hypersurface of constant mean
curvature unless V is flat and splits as a metric product.

There are vacuum spacetimes with compact Cauchy 
surface which contain no CMC hypersurface 
(Chruściel et al. 2004). The proof is carried out by
constructing Cauchy data, using a gluing argument, on
the connected sum of two tori, such that the resulting
Cauchy data set $(M, g_{ij}, K_{ij})$ has an involution which
reverses the sign of $K_{ij}$. The involution extends to the
maximal vacuum development V of the Cauchy data 
set. Existence of a CMC surface in V gives, in view of 
the involution, barriers which allow one to construct a 
maximal Cauchy surface homeomorphic to M. This 
leads to a contradiction, since the connected sum of 
two tori does not carry a metric of positive scalar 
curvature, and therefore, in view of the constraint 
equations, cannot be imbedded as a maximal Cauchy 
surface in a vacuum spacetime. The maximal vacuum 
development V is causally geodesically incomplete. 
However, in view of the existence proof for CMC 
Cauchy surfaces (cf. Theorem 6), these spacetimes 
cannot have a crushing singularity. It would be 
interesting to settle the open question whether there 
are stable examples of this type. 

In the case of a spacetime V which has an 
expanding end, one does not expect in general that 


the spacetime is globally foliated by CMC hyper- 
surfaces even if V is vacuum and contains a CMC 
Cauchy surface. This expectation is based on the 
phenomenon known as the collapse of the lapse; for 
example, the Schwarzschild spacetime does not 
contain a global foliation by maximal Cauchy 
surfaces (Beig and Murchadha 1998). However, no
counterexample is known in the spatially compact 
case. In spite of these caveats, many examples of 
spacetimes with global CMC foliations are known, 
and the CMC condition, or more generally pre- 
scribed mean curvature, is an important gauge 
condition for general relativity. 

Some examples of situations where global 
constant or prescribed mean curvature foliations 
are known to exist in vacuum or with some types of 
matter are spatially homogeneous spacetimes, 
and spacetimes with two commuting Killing fields. 
Small data global existence for the Einstein equations
with CMC time gauge has been proved for
spacetimes with one Killing field, with Cauchy
surface a circle bundle over a surface of genus >1, 
by Choquet-Bruhat and Moncrief. Further, for 
(3+ 1)-dimensional spacetimes with Cauchy 
surface admitting a hyperbolic metric, small data 
global existence in the expanding direction has been 
proved by Andersson and Moncrief. See Andersson 
(2004) and Rendall (2002) for surveys on the 
Cauchy problem in general relativity. 


Null Hypersurfaces 


Consider an asymptotically flat spacetime contain- 
ing a black hole, that is, a region B such that future 
causal curves starting in B cannot reach observers at 
infinity. The boundary of the trapped region is 
called the event horizon H. This is a null hypersur- 
face, which under reasonable conditions on causality 
has null generators which are complete to the future. 
Due to the completeness, assuming that H is
smooth, one can use the Raychaudhuri equation
[7] to show that the null expansion $\theta_+$ of a spatial
cross section of H must satisfy $\theta_+ \geq 0$, and hence
that the area of cross sections of H grows
monotonically to the future. A related statement is that
null generators can enter H but may not leave it.
This was first proved by Hawking for the case of
smooth horizons, using essentially the Raychaudhuri
equation. In general H can fail to be smooth.
However, from the definition of H as the boundary
of the trapped region it follows that it has support
hypersurfaces, which are past light cones. This
property allows one to prove that H is Lipschitz
and hence smooth almost everywhere. At smooth
points of H, the calculations in the proof of
Hawking apply, and the monotonicity of the area of
cross sections follows.


Theorem 7 (Area theorem (Chruściel et al. 2001)).
Let H be a black hole event horizon in a smooth
spacetime (M, g). Suppose that the generators are
future complete and the NEC holds on H. Let
$S_a$, $a = 1, 2$, be two spacelike cross sections of H and
suppose that $S_2$ is to the future of $S_1$. Then
$A(S_2) \geq A(S_1)$.


The eikonal equation $\nabla^\alpha u \nabla_\alpha u = 0$ plays a central
role in geometric optics. Level sets of a solution u are 
null hypersurfaces which correspond to wave fronts. 
Much of the recent progress on rough solutions to the 
Cauchy problem for quasilinear wave equations is 
based on understanding the influence of the geometry 
of these wave fronts on the evolution of high-frequency
modes in the background spacetime. In
this analysis many objects familiar from general 
relativity, such as the structure equations for null 
hypersurfaces, the Raychaudhuri equation, and the 
Bianchi identities play an important role, together 
with novel techniques of geometric analysis used to 
control the geometry of cross sections of the wave 
fronts and to estimate the connection coefficients in a 
rough spacetime geometry. These techniques show 
great promise and can be expected to have a 
significant impact on our understanding of the 
Einstein equations and general relativity. 
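
As a small illustration of the eikonal equation, the following sketch checks numerically that the phase u = t - |x| of an outgoing spherical wave front in Minkowski space satisfies it. The choice of u, the signature (-,+,+,+), the sample point, and the helper names are assumptions made for this example only; they are not taken from the text above.

```python
import math

def u(t, x, y, z):
    """Phase of an outgoing spherical wave front (illustrative choice): u = t - |x|."""
    return t - math.sqrt(x * x + y * y + z * z)

def eikonal_residual(t, x, y, z, h=1e-5):
    """Central-difference estimate of -(du/dt)^2 + |grad u|^2 in Minkowski space."""
    ut = (u(t + h, x, y, z) - u(t - h, x, y, z)) / (2 * h)
    ux = (u(t, x + h, y, z) - u(t, x - h, y, z)) / (2 * h)
    uy = (u(t, x, y + h, z) - u(t, x, y - h, z)) / (2 * h)
    uz = (u(t, x, y, z + h) - u(t, x, y, z - h)) / (2 * h)
    return -ut * ut + ux * ux + uy * uy + uz * uz

# The level sets of u are the light cones t - |x| = const, which are null hypersurfaces.
residual = eikonal_residual(0.3, 1.0, -2.0, 0.5)
print(residual)
```

The residual vanishes up to finite-difference error at any point away from the spatial origin, where this particular u is not differentiable.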
Introduction


In a paper, R Penrose (1973) made a physical 
argument that the total mass of a spacetime which 
contains black holes with event horizons of total area 
A should be at least $\sqrt{A/16\pi}$. An important special
case of this physical statement translates into a very 
beautiful mathematical inequality in Riemannian 
geometry known as the Riemannian Penrose inequal- 
ity. The Riemannian Penrose inequality was first 
proved by Huisken and Ilmanen (1997) for a single 
black hole and then by the author in 1999 for any 
number of black holes. The two approaches use two 
different geometric flow techniques. The most general 
version of the Penrose inequality is still open. 

A natural interpretation of the Penrose inequality
is that the mass contributed by a collection of
black holes is (at least) $\sqrt{A/16\pi}$. More generally,
the question “How much matter is in a given region
of a spacetime?” is still very much an open problem
(Christodoulou and Yau 1988). In this paper, we
will discuss some of the qualitative aspects of mass
in general relativity, look at examples which are
informative, and describe the two very geometric
proofs of the Riemannian Penrose inequality.


Total Mass in General Relativity 


Two notions of mass which are well understood in 
general relativity are local energy density at a point 
and the total mass of an asymptotically flat space- 
time. However, defining the mass of a region larger 
than a point but smaller than the entire universe is 
not very well understood at all. 

Suppose $(M^3, g)$ is a Riemannian 3-manifold
isometrically embedded in a (3+1)-dimensional
Lorentzian spacetime $N^4$. Suppose that $M^3$ has zero-second
fundamental form in the spacetime. This is a
simplifying assumption which allows us to think of
$(M^3, g)$ as a “$t = 0$” slice of the spacetime. (Recall that
the second fundamental form is a measure of how
much $M^3$ curves inside $N^4$. $M^3$ is also sometimes
called “totally geodesic” since geodesics of $N^4$ which
are tangent to $M^3$ at a point stay inside $M^3$ forever.)
The Penrose inequality (which allows for $M^3$ to have
general second fundamental form) is known as the
Riemannian Penrose inequality when the second
fundamental form is set to zero.

We also want to only consider $(M^3, g)$ that are
asymptotically flat at infinity, which means that for
some compact set K, the “end” $M^3 \setminus K$ is diffeomorphic
to $\mathbb{R}^3 \setminus B_1(0)$, where the metric g is
asymptotically approaching (with certain decay
conditions) the standard flat metric $\delta_{ij}$ on $\mathbb{R}^3$ at
infinity. The simplest example of an asymptotically
flat manifold is $(\mathbb{R}^3, \delta_{ij})$ itself. Other good examples
are the conformal metrics $(\mathbb{R}^3, u(x)^4 \delta_{ij})$, where u(x)
approaches a constant sufficiently rapidly at infinity.
(Also, sometimes it is convenient to allow $(M^3, g)$ to
have multiple asymptotically flat ends, in which
case each connected component of $M^3 \setminus K$ must
have the property described above.) A qualitative
picture of an asymptotically flat 3-manifold is shown
in Figure 1.

The purpose of these assumptions on the asymptotic
behavior of $(M^3, g)$ at infinity is that they imply
the existence of the limit

$$m = \frac{1}{16\pi} \lim_{\sigma \to \infty} \int_{S_\sigma} \sum_{i,j} (g_{ij,i} \nu_j - g_{ii,j} \nu_j) \, d\mu$$

where $S_\sigma$ is the coordinate sphere of radius $\sigma$, $\nu$ is the
unit normal to $S_\sigma$, and $d\mu$ is the area element of $S_\sigma$ in the
coordinate chart. The quantity m is called the “total
mass” (or ADM mass) of $(M^3, g)$ and does not depend
on the choice of asymptotically flat coordinate chart.

The above equation is where many people would 
stop reading an article like this. But before you do, 
we will promise not to use this definition of the total 
mass in this paper. In fact, it turns out that total mass 
can be quite well understood with an example. Going 
back to the example $(\mathbb{R}^3, u(x)^4 \delta_{ij})$, if we suppose that
u(x) > 0 has the asymptotics at infinity

$$u(x) = a + b/|x| + \mathcal{O}(1/|x|^2)$$  [1]

(and derivatives of the $\mathcal{O}(1/|x|^2)$ term are $\mathcal{O}(1/|x|^3)$),
then the total mass of $(M^3, g)$ is

$$m = 2ab$$  [2]

Figure 1 A qualitative picture of an asymptotically flat
3-manifold.


Furthermore, suppose $(M^3, g)$ is any metric whose
“end” is isometric to $(\mathbb{R}^3 \setminus K, u(x)^4 \delta_{ij})$, where u(x) is
harmonic in the coordinate chart of the end
$(\mathbb{R}^3 \setminus K, \delta_{ij})$ and goes to a constant at infinity. Then
expanding u(x) in terms of spherical harmonics
demonstrates that u(x) satisfies condition [1].
We will call these Riemannian manifolds $(M^3, g)$
“harmonically flat at infinity,” and we note that the
total mass of these manifolds is also given by eqn [2].
A very nice lemma by Schoen and Yau is that,
given any $\epsilon > 0$, it is always possible to perturb an
asymptotically flat manifold to become harmonically
flat at infinity such that the total mass changes
less than $\epsilon$ and the metric changes less than
$\epsilon$ pointwise, all while maintaining non-negative
scalar curvature. Hence, it happens that to prove
the theorems in this paper, we only need to consider
harmonically flat manifolds! Thus, we can use eqn
[2] as our definition of total mass. As an example,
note that $(\mathbb{R}^3, \delta_{ij})$ has zero total mass. Also, note
that, qualitatively, the total mass of an asymptotically
flat or harmonically flat manifold is the 1/r
rate at which the metric becomes flat at infinity.
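
To make the formula m = 2ab concrete, here is a minimal sketch that reads off a and b from samples of a conformal factor at two large radii and then evaluates the total mass. The conformal factor used, u = 1 + m/(2|x|), is the spatial Schwarzschild factor that appears later in this article; the two-point fit and all function names are assumptions made for illustration.

```python
def conformal_factor(r, m):
    """Illustrative conformal factor u(r) = 1 + m / (2 r)."""
    return 1.0 + m / (2.0 * r)

def fit_a_b(u, r1, r2):
    """Estimate a and b in u(r) ~ a + b / r from samples at two large radii."""
    b = (u(r1) - u(r2)) / (1.0 / r1 - 1.0 / r2)
    a = u(r1) - b / r1
    return a, b

def total_mass(a, b):
    """Total mass of a metric with conformal factor a + b/|x| + ..., from the formula m = 2ab."""
    return 2.0 * a * b

m_true = 1.5
a, b = fit_a_b(lambda r: conformal_factor(r, m_true), 1.0e3, 2.0e3)
m_est = total_mass(a, b)
print(a, b, m_est)
```

For this conformal factor the fit is exact up to rounding (a = 1, b = m/2), so the estimated mass agrees with the parameter m.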


The Phenomenon of Gravitational Attraction 


What do the above definitions of total mass have to 
do with anything physical? That is, if the total mass 
is the 1/r rate at which the metric becomes flat at 
infinity, what does this have to do with our real- 
world intuitive idea of mass? 

The answer to this question is very nice. Given a
Schwarzschild spacetime metric

$$\left( \mathbb{R}^4, \left(1 + \frac{m}{2|x|}\right)^4 (dx_1^2 + dx_2^2 + dx_3^2) - \left( \frac{1 - m/2|x|}{1 + m/2|x|} \right)^2 dt^2 \right),$$

$|x| > m/2$, for example, note that the t = 0 slice
(which has zero-second fundamental form) is the
spacelike Schwarzschild metric

$$\left( \mathbb{R}^3 \setminus B_{m/2}(0), \left(1 + \frac{m}{2|x|}\right)^4 \delta_{ij} \right)$$

(discussed more later). Note that according to eqn
[2], the parameter m is in fact the total mass of this
3-manifold.

On the other hand, suppose we were to release a
small test particle, initially at rest, a large distance r
from the center of the Schwarzschild spacetime. If
this particle is not acted upon by external forces,
then it should follow a geodesic in the spacetime. It
turns out that with respect to the asymptotically flat
coordinate chart, these geodesics “accelerate”
towards the middle of the Schwarzschild metric
proportional to $m/r^2$ (in the limit as r goes to
infinity). Thus, our Newtonian notion of mass also
suggests that the total mass of the spacetime is m.
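
The $m/r^2$ behavior can be checked directly. The sketch below assumes the Schwarzschild spacetime metric in the isotropic form written above, with lapse N = (1 - m/2r)/(1 + m/2r) and conformal factor psi = 1 + m/2r. For a particle momentarily at rest, the geodesic equation gives the coordinate acceleration $-\Gamma^r_{tt} = -N N'/\psi^4$; this step and the function names are our own additions for illustration.

```python
def lapse(r, m):
    """N(r) = (1 - m/2r) / (1 + m/2r), so that g_tt = -N^2."""
    return (1.0 - m / (2.0 * r)) / (1.0 + m / (2.0 * r))

def lapse_prime(r, m):
    """Derivative of N = (2r - m)/(2r + m), namely 4m / (2r + m)^2."""
    return 4.0 * m / (2.0 * r + m) ** 2

def coordinate_acceleration(r, m):
    """d^2 r / dt^2 for a test particle at rest: -Gamma^r_tt = -N N' / psi^4."""
    psi = 1.0 + m / (2.0 * r)
    return -lapse(r, m) * lapse_prime(r, m) / psi ** 4

m = 1.0
# Ratio of the coordinate acceleration to the Newtonian value m / r^2.
ratios = [coordinate_acceleration(r, m) * r * r / m for r in (1.0e2, 1.0e4, 1.0e6)]
print(ratios)
```

The ratios approach -1 as r grows, which is the inward Newtonian acceleration of a body of mass m.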


Local Energy Density 


Another quantification of mass which is well understood
is local energy density. In fact, in this setting,
the local energy density at each point is

$$\mu = \frac{1}{16\pi} R$$

where R is the scalar curvature of the 3-manifold
(which has zero-second fundamental form in the
spacetime) at each point. Note that $(\mathbb{R}^3, \delta_{ij})$ has zero
energy density at each point as well as zero total mass.
This is appropriate since $(\mathbb{R}^3, \delta_{ij})$ is in fact a “$t = 0$”
slice of Minkowski spacetime, which represents a
vacuum. Classically, physicists consider $\mu \geq 0$ to be a
physical assumption. Hence, from this point on, we
will not only assume that $(M^3, g)$ is asymptotically flat,
but also that it has non-negative scalar curvature,

$$R \geq 0$$


This notion of energy density also helps us 
understand total mass better. After all, we can take 
any asymptotically flat manifold and then change 
the metric to be perfectly flat outside a large 
compact set, thereby giving the new metric zero 
total mass. However, if we introduce the physical 
condition that both metrics have non-negative scalar 
curvature, then it is a beautiful theorem that this is 
in fact not possible, unless the original metric was 
already $(\mathbb{R}^3, \delta_{ij})$! (This theorem is actually a corollary
to the positive mass theorem discussed below.) 
Thus, the curvature obstruction of having non- 
negative scalar curvature at each point is a very 
interesting condition. 

Also, notice the indirect connection between the 
total mass and local energy density. At this point, 
there does not seem to be much of a connection at 
all. The total mass is the 1/r rate at which the metric 
becomes flat at infinity, and local energy density is 
the scalar curvature at each point. Furthermore, if a 
metric is changed in a compact set, local energy 
density is changed, but the total mass is unaffected. 

The reason for this is that the total mass is “not” 
the integral of the local energy density over the 
manifold. In fact, this integral fails to take potential 
energy into account (which would be expected to
contribute a negative energy) as well as gravitational
energy. Hence, it is not initially clear what we should 
expect the relationship between total mass and local 
energy density to be, so let us begin with an example. 


Example Using Superharmonic Functions in $\mathbb{R}^3$


Once again, let us return to the $(\mathbb{R}^3, u(x)^4 \delta_{ij})$
example. The formula for the scalar curvature is

$$R = -8 u(x)^{-5} \Delta u(x)$$

Hence, since the physical assumption of non-negative
energy density implies non-negative scalar
curvature, we see that u(x) > 0 must be superharmonic
($\Delta u \leq 0$). For simplicity, let us also
assume that u(x) is harmonic outside a bounded set
so that we can expand u(x) at infinity using
spherical harmonics. Hence, u(x) has the asymptotics
of eqn [1]. By the maximum principle, it follows
that the minimum value for u(x) must be a, referring
to eqn [1]. Hence, $b \geq 0$, which implies that $m \geq 0$!
Thus, we see that the assumption of non-negative
energy density at each point of $(\mathbb{R}^3, u(x)^4 \delta_{ij})$ implies
that the total mass is also non-negative, which is
what one would hope.
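
The argument above can be tried on a concrete radial example. The sketch below uses the conformal factor $u = 1 + (1 + r^2)^{-1/2}$, which is our own choice: it is superharmonic everywhere, with $\Delta u = -3(1 + r^2)^{-5/2}$, although, unlike the simplified setting of the text, it is not harmonic outside a bounded set. The code evaluates $R = -8u^{-5}\Delta u$ by finite differences and compares with this closed form; since u = 1 + 1/r + ... at infinity, we have a = b = 1 and total mass 2ab = 2.

```python
import math

def u(r):
    """Illustrative superharmonic conformal factor: u = 1 + (1 + r^2)^(-1/2)."""
    return 1.0 + 1.0 / math.sqrt(1.0 + r * r)

def radial_laplacian(f, r, h=1e-4):
    """Flat Laplacian of a radial function, f'' + (2/r) f', by central differences."""
    d1 = (f(r + h) - f(r - h)) / (2.0 * h)
    d2 = (f(r + h) - 2.0 * f(r) + f(r - h)) / (h * h)
    return d2 + 2.0 * d1 / r

def scalar_curvature(f, r):
    """Scalar curvature of the conformal metric f^4 delta_ij: R = -8 f^(-5) Laplacian(f)."""
    return -8.0 * f(r) ** (-5) * radial_laplacian(f, r)

radii = [0.5, 1.0, 2.0, 5.0]
curvatures = [scalar_curvature(u, r) for r in radii]
# Closed form for this particular u: R = 24 u^(-5) (1 + r^2)^(-5/2).
exact = [24.0 * u(r) ** (-5) * (1.0 + r * r) ** (-2.5) for r in radii]

a, b = 1.0, 1.0          # asymptotics u = 1 + 1/r + O(1/r^3)
mass = 2.0 * a * b       # total mass from eqn [2]
print(curvatures, mass)
```

Here the scalar curvature is positive at every sampled radius and the total mass is positive, as the maximum principle argument predicts.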


The Positive Mass Theorem 


Why would one hope this? What would be the 
difference if the total mass were negative? This 
would mean that a gravitational system of positive 
energy density could collectively act as a net 
negative total mass. This phenomenon has not 
been observed experimentally, and so it is not a 
property that we would hope to find in general 
relativity. 

More generally, suppose we have any asymptotically 
flat manifold with non-negative scalar curvature, is it 
true that the total mass is also non-negative? The 
answer is yes, and this fact is known as the positive mass
theorem, first proved by Schoen and Yau (1979) using 
minimal surface techniques and then by Witten (1981) 
using spinors. In the zero-second fundamental form 
case, the positive mass theorem is known as the 
Riemannian positive mass theorem and is stated below. 


Theorem 1 (Schoen, Yau). Let $(M^3, g)$ be any
asymptotically flat, complete Riemannian manifold
with non-negative scalar curvature. Then the total
mass $m \geq 0$, with equality if and only if $(M^3, g)$ is
isometric to $(\mathbb{R}^3, \delta_{ij})$.


Gravitational Energy 


The previous example neglects to illustrate some of 
the subtleties of the positive mass theorem. For 
example, it is easy to construct asymptotically flat 


manifolds $(M^3, g)$ (not conformal to $\mathbb{R}^3$) which have
zero scalar curvature everywhere and yet have 
“nonzero” total mass. By the positive mass theorem, 
the mass of these manifolds is positive. Physically, 
this corresponds to a spacetime with zero energy 
density everywhere which still has positive total 
mass. From where did this mass come? How can a 
vacuum have positive total mass? 

Physicists refer to this extra energy as gravita- 
tional energy. There is no known local definition of 
the energy density of a gravitational field, and 
presumably such a definition does not exist. The 
curious phenomenon, then, is that for some reason, 
gravitational energy always makes a non-negative 
contribution to the total mass of the system. 


Black Holes 


Another very interesting and natural phenomenon in 
general relativity is the existence of black holes. 
Instead of thinking of black holes as singularities in 
a spacetime, we will think of black holes in terms of 
their horizons. For example, suppose we are explor- 
ing the universe in a spacecraft capable of traveling 
at any speed less than the speed of light. If we are 
investigating a black hole, we would want to make 
sure that we don't get too close and get trapped by 
the “gravitational forces” of the black hole. In fact,
we could imagine a “sphere of no return” beyond
which it is impossible to escape from the black hole. 
This is called the event horizon of a black hole. 

However, one limitation of the notion of an event 
horizon is that it is very hard to determine its location. 
One way is to let daredevil spacecraft see how close 
they can get to the black hole and still escape from it 
eventually. The only problem with this approach 
(besides the cost in spacecraft) is that it is hard to 
know when to stop waiting for a daredevil spacecraft 
to return. Even if it has been 50 years, it could be that 
this particular daredevil was not trapped by the black 
hole but got so close that it will take it 1000 or more 
years to return. Thus, to define the location of an event 
horizon even mathematically, we need to know the 
entire evolution of the spacetime. Hence, event 
horizons cannot be computed based only on the
local geometry of the spacetime. 

This problem is solved (at least for the mathema- 
tician) with the notion of apparent horizons of black 
holes. Given a surface in a spacetime, suppose that it 
emits an outward shell of light. If the surface area of 
this shell of light is decreasing everywhere on the 
surface, then this is called a trapped surface. The 
outermost boundary of these trapped surfaces is 
called the apparent horizon of the black hole. 
Apparent horizons can be computed based on their 


local geometry, and an apparent horizon always 
implies the existence of an event horizon outside of 
it (Hawking and Ellis 1973). 

Now let us return to the case we are considering
in this paper where $(M^3, g)$ is a “$t = 0$” slice of a
spacetime with zero-second fundamental form. Then
it is a very nice geometric fact that apparent
horizons of black holes intersected with $M^3$ correspond
to the connected components of the outermost
minimal surface $\Sigma_0$ of $(M^3, g)$.

All of the surfaces we are considering in this paper
will be required to be smooth boundaries of open
bounded regions, so that outermost is well defined
with respect to a chosen end of the manifold.
A minimal surface in $(M^3, g)$ is a surface which is a
critical point of the area function with respect to any
smooth variation of the surface. The first variational
calculation implies that minimal surfaces have zero
mean curvature. The surface $\Sigma_0$ of $(M^3, g)$ is defined
as the boundary of the union of the open regions
bounded by all of the minimal surfaces in $(M^3, g)$. It
turns out that $\Sigma_0$ also has to be a minimal surface,
so we call $\Sigma_0$ the “outermost minimal surface.”
A qualitative sketch of an outermost minimal
surface of a 3-manifold is shown in Figure 2.

We will also define a surface to be “(strictly) outer 
minimizing” if every surface which encloses it has 
(strictly) greater area. Note that outermost minimal 
surfaces are strictly outer minimizing. Also, we define 
a “horizon” in our context to be any minimal surface 
which is the boundary of a bounded open region. 

Figure 2 A qualitative sketch of an outermost minimal surface
of a 3-manifold.

It also follows from a stability argument (using
the Gauss-Bonnet theorem interestingly) that each
component of an outermost minimal surface (in a
3-manifold with non-negative scalar curvature) must
have the topology of a sphere. Furthermore, there is
a physical argument, based on Penrose (1973),
which suggests that the mass contributed by the
black holes (thought of as the connected components
of $\Sigma_0$) should be defined to be $\sqrt{A_0/16\pi}$,
where $A_0$ is the area of $\Sigma_0$. Hence, the physical
argument that the total mass should be greater than
or equal to the mass contributed by the black holes
yields the following geometric statement.


The Riemannian Penrose Inequality Let $(M^3, g)$ be
a complete, smooth, 3-manifold with non-negative
scalar curvature which is harmonically flat at
infinity with total mass m and which has an
outermost minimal surface $\Sigma_0$ of area $A_0$. Then,

$$m \geq \sqrt{\frac{A_0}{16\pi}}$$  [3]

with equality if and only if $(M^3, g)$ is isometric to the
Schwarzschild metric

$$\left( \mathbb{R}^3 \setminus \{0\}, \left(1 + \frac{m}{2|x|}\right)^4 \delta_{ij} \right)$$

outside their respective outermost minimal surfaces.

The above statement has been proved by the 
present author, and Huisken and Ilmanen proved it 
when $A_0$ is defined instead to be the area of the
largest connected component of $\Sigma_0$. We will discuss
both approaches in this paper, which are very 
different, although they both involve flowing sur- 
faces and/or metrics. 

We also clarify that the above statement is with 
respect to a chosen end of $(M^3, g)$, since both the
total mass and the definition of outermost refer to a 
particular end. In fact, nothing very important is 
gained by considering manifolds with more than one 
end, since extra ends can always be compactified by 
connect summing them (around a neighborhood of 
infinity) with large spheres while still preserving non- 
negative scalar curvature, for example. Hence, we 
will typically consider manifolds with just one end. In 
the case that the manifold has multiple ends, we will 
require every surface (which could have multiple 
connected components) in this paper to enclose all of 
the ends of the manifold except the chosen end. 


The Schwarzschild Metric 


The Schwarzschild metric

$$\left( \mathbb{R}^3 \setminus \{0\}, \left(1 + \frac{m}{2|x|}\right)^4 \delta_{ij} \right),$$

referred to in the above statement of the Riemannian
Penrose inequality, is a particularly important
example to consider, and corresponds to a zero-second
fundamental form, spacelike slice of the
usual (3+1)-dimensional Schwarzschild metric
(which represents a spherically symmetric static
black hole in vacuum). The three-dimensional
Schwarzschild metrics have total mass m > 0 and
are characterized by being the only spherically
symmetric, geodesically complete, zero scalar curvature
3-metrics, other than $(\mathbb{R}^3, \delta_{ij})$. They can also be
embedded in four-dimensional Euclidean space
(x, y, z, w) as the set of points satisfying

$$|(x, y, z)| = \frac{w^2}{8m} + 2m$$

which is a parabola rotated around an $S^2$. This last
picture allows us to see that the Schwarzschild
metric, which has two ends, has a $\mathbb{Z}_2$ symmetry
which fixes the sphere with w = 0 and $|(x, y, z)| = 2m$,
which is clearly minimal. Furthermore, the area
of this sphere is $4\pi(2m)^2$, giving equality in the
Riemannian Penrose inequality.
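
The equality case can be verified numerically in the conformally flat coordinates used above. In those coordinates the coordinate sphere |x| = r has area $4\pi r^2 (1 + m/2r)^4$, and the fixed sphere of the $\mathbb{Z}_2$ symmetry sits at |x| = m/2 (areal radius 2m). The function names and the sample value of m in this sketch are assumptions for illustration.

```python
import math

def sphere_area(r, m):
    """Area of the coordinate sphere |x| = r in the metric (1 + m/2r)^4 delta_ij."""
    psi = 1.0 + m / (2.0 * r)
    return 4.0 * math.pi * r * r * psi ** 4

m = 0.7
horizon_radius = m / 2.0
horizon_area = sphere_area(horizon_radius, m)

# Right-hand side of the Riemannian Penrose inequality for the horizon sphere.
penrose_bound = math.sqrt(horizon_area / (16.0 * math.pi))

# Coordinate spheres on either side of the horizon have strictly larger area.
nearby_areas = [sphere_area(horizon_radius * s, m) for s in (0.5, 0.9, 1.1, 2.0)]
print(horizon_area, penrose_bound, nearby_areas)
```

The horizon area equals $16\pi m^2 = 4\pi(2m)^2$, so the bound $\sqrt{A_0/16\pi}$ equals m, and the areas at scale factors 0.5 and 2.0 coincide, reflecting the symmetry between the two ends.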


A Brief History of the Problem 


The Riemannian Penrose inequality has a rich 
history spanning nearly three decades and has 
motivated much interesting mathematics and phy- 
sics. In 1973, R Penrose in effect conjectured an 
even more general version of inequality [3] using a 
very clever physical argument, which we will not 
have room to repeat here (Penrose 1973). His 
observation was that a counterexample to inequality 
[3] would yield Cauchy data for solving the Einstein 
equations, the solution to which would likely violate 
the cosmic censor conjecture (which says that 
singularities generically do not form in a spacetime 
unless they are inside a black hole). 

Jang and Wald (1977), extending ideas of Geroch,
gave a heuristic proof of inequality [3] by defining a
flow of 2-surfaces in $(M^3, g)$ in which the surfaces
flow in the outward normal direction at a rate equal
to the inverse of their mean curvatures at each point.
The Hawking mass of a surface (which is supposed
to estimate the total amount of energy inside the
surface) is defined to be

$$m_{\text{Hawking}}(\Sigma) = \sqrt{\frac{|\Sigma|}{16\pi}} \left( 1 - \frac{1}{16\pi} \int_\Sigma H^2 \right)$$

(where $|\Sigma|$ is the area of $\Sigma$ and H is the mean
curvature of $\Sigma$ in $(M^3, g)$) and, amazingly, is
nondecreasing under this “inverse mean curvature
flow.” This is seen by the fact that under inverse
mean curvature flow, it follows from the Gauss
equation and the second variation formula that

$$\frac{d}{dt} m_{\text{Hawking}}(\Sigma) = \sqrt{\frac{|\Sigma|}{16\pi}} \left[ \frac{1}{2} + \frac{1}{16\pi} \int_\Sigma \left( 2 \frac{|\nabla_\Sigma H|^2}{H^2} + R - 2K + \frac{1}{2} (\lambda_1 - \lambda_2)^2 \right) \right]$$

when the flow is smooth, where R is the scalar
curvature of $(M^3, g)$, K is the Gauss curvature of the
surface $\Sigma$, and $\lambda_1$ and $\lambda_2$ are the eigenvalues of the
second fundamental form of $\Sigma$, or principal curvatures.
Hence,

$$R \geq 0$$

and

$$\int_\Sigma K \leq 4\pi$$  [4]

(which is true for any connected surface by the
Gauss-Bonnet theorem) imply

$$\frac{d}{dt} m_{\text{Hawking}}(\Sigma) \geq 0$$  [5]

Furthermore,

$$m_{\text{Hawking}}(\Sigma_0) = \sqrt{\frac{|\Sigma_0|}{16\pi}}$$

since $\Sigma_0$ is a minimal surface and has zero mean
curvature. In addition, the Hawking mass of sufficiently
round spheres at infinity in the asymptotically
flat end of $(M^3, g)$ approaches the total mass m. Hence,
if inverse mean curvature flow beginning with $\Sigma_0$
eventually flows to sufficiently round spheres at
infinity, inequality [3] follows from inequality [5].
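
As a sanity check on the definition, the sketch below evaluates the Hawking mass of the coordinate spheres |x| = r in the Schwarzschild metric $(1 + m/2|x|)^4 \delta_{ij}$. It assumes the standard formula for the mean curvature of a surface under a conformal change $g = \psi^4 \delta$, namely $H = \psi^{-2}(H_0 + 4 \partial_\nu \psi / \psi)$ with $H_0 = 2/r$ for a round sphere; this formula and the function names are not taken from the text.

```python
import math

def hawking_mass_round_sphere(r, m):
    """Hawking mass of the sphere |x| = r in the metric psi^4 delta_ij, psi = 1 + m/2r."""
    psi = 1.0 + m / (2.0 * r)
    dpsi = -m / (2.0 * r * r)                 # radial derivative of psi
    area = 4.0 * math.pi * r * r * psi ** 4   # |Sigma|
    H = (2.0 / r + 4.0 * dpsi / psi) / psi ** 2
    willmore = H * H * area                   # integral of H^2 (H is constant on the sphere)
    return math.sqrt(area / (16.0 * math.pi)) * (1.0 - willmore / (16.0 * math.pi))

m = 1.0
# r = m/2 is the horizon (H = 0 there); the other radii lie outside it.
masses = [hawking_mass_round_sphere(r, m) for r in (0.5, 1.0, 10.0, 1000.0)]
print(masses)
```

Every sphere returns the value m: at the horizon because H = 0, and at infinity because the Hawking mass tends to the total mass. In this example the monotone quantity is constant, in line with Schwarzschild being the equality case of inequality [3].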

As noted by Jang and Wald, this argument only 
works when inverse mean curvature flow exists and 
is smooth, which is generally not expected to be the 
case. In fact, it is not hard to construct manifolds 
which do not admit a smooth inverse mean 
curvature flow. The problem is that if the mean 
curvature of the evolving surface becomes zero or is 
negative, it is not clear how to define the flow. 
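
In a spherically symmetric example the flow is smooth and reduces to an ordinary differential equation, which makes its behavior easy to inspect. The following sketch flows a coordinate sphere in the Schwarzschild metric $(1 + m/2|x|)^4 \delta_{ij}$ by inverse mean curvature, starting outside the horizon where H > 0. It assumes the conformal mean curvature formula $H = \psi^{-2}(2/r + 4\psi'/\psi)$ and uses a classical Runge-Kutta step; the starting radius, step count, and function names are illustrative choices. Since the first variation of area under normal speed 1/H is $\int H \cdot (1/H) = |\Sigma|$, the area should grow like $e^t$.

```python
import math

def psi(r, m):
    return 1.0 + m / (2.0 * r)

def area(r, m):
    return 4.0 * math.pi * r * r * psi(r, m) ** 4

def mean_curvature(r, m):
    dpsi = -m / (2.0 * r * r)
    return (2.0 / r + 4.0 * dpsi / psi(r, m)) / psi(r, m) ** 2

def radial_speed(r, m):
    """dr/dt when the surface moves with unit-normal speed 1/H (unit normal = psi^-2 d/dr)."""
    return 1.0 / (psi(r, m) ** 2 * mean_curvature(r, m))

def flow_radius(r0, m, t_final, steps=2000):
    """Integrate dr/dt = radial_speed with the classical fourth-order Runge-Kutta method."""
    dt = t_final / steps
    r = r0
    for _ in range(steps):
        k1 = radial_speed(r, m)
        k2 = radial_speed(r + 0.5 * dt * k1, m)
        k3 = radial_speed(r + 0.5 * dt * k2, m)
        k4 = radial_speed(r + dt * k3, m)
        r += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return r

m, r0, t_final = 1.0, 1.0, 2.0      # r0 = m lies outside the horizon r = m/2
r1 = flow_radius(r0, m, t_final)
area_ratio = area(r1, m) / area(r0, m)
print(r1, area_ratio, math.exp(t_final))
```

Starting exactly on the horizon would fail here because H = 0 there, which is a simple instance of the difficulty described above.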

For 20 years, this heuristic argument lay dormant 
until the work of Huisken and Ilmanen in 1997. With 
a very clever new approach, Huisken and Ilmanen 
discovered how to reformulate inverse mean curvature 
flow using an energy minimization principle in such a 
way that the new generalized inverse mean curvature 
flow always exists. The added twist is that the surface 
sometimes jumps outward. However, when the flow is 
smooth, it equals the original inverse mean curvature 
flow, and the Hawking mass is still monotone. Hence, 
as will be described in the next section, their new flow 
produced the first complete proof of inequality [3] for 
a single black hole. 

Coincidentally, the author found another proof of 
inequality [3], submitted in 1999, which works for any 
number of black holes. The approach involves flowing 
the original metric to a Schwarzschild metric (outside 
the horizon) in such a way that the area of the 
outermost minimal surface does not change and the 


total mass is nonincreasing. Then, since the Schwarzs- 
child metric gives equality in inequality [3], the 
inequality follows for the original metric. 

Fortunately, the flow of metrics which is defined 
is relatively simple, and in fact stays inside the 
conformal class of the original metric. The outer- 
most minimal surface flows outwards in this 
conformal flow of metrics, and encloses any 
compact set (and hence all of the topology of the 
original metric) in a finite amount of time. Further- 
more, this conformal flow of metrics preserves non- 
negative scalar curvature. We will describe this 
approach later in the paper. 

Other contributions on the Penrose conjecture 
have also been made by Herzlich using the Dirac 
operator which Witten used to prove the positive 
mass theorem, by Gibbons in the special case of 
collapsing shells, by Tod, by Bartnik for quasi- 
spherical metrics, and by the present author using 
isoperimetric surfaces. There is also some interesting
work of Ludvigsen and Vickers using spinors and 
Bergqvist, both concerning the Penrose inequality 
for null slices of a spacetime. 


Inverse Mean Curvature Flow 


Geometrically, Huisken and Ilmanen's idea can be 
described as follows. Let X(t) be the surface 
resulting from inverse mean curvature flow for 
time £ beginning with the minimal surface po. 
Define X(t) to be the outermost minimal area 
enclosure of X(t). Typically, X(z) — X(t) in the flow, 
but in the case that the two surfaces are not equal, 
immediately replace S(t) with X(z) and then con- 
tinue flowing by inverse mean curvature. 

An immediate consequence of this modified flow is 
that the mean curvature of $\tilde\Sigma(t)$ is always non-negative 
by the first variation formula, since otherwise $\tilde\Sigma(t)$ 
would be enclosed by a surface with less area. This is 
because if we flow a surface $\Sigma$ in the outward 
direction with speed $\eta$, the first variation of the area 
is $\int_\Sigma H\eta$, where $H$ is the mean curvature of $\Sigma$. 

Furthermore, by stability, it follows that in the 
regions where $\tilde\Sigma(t)$ has zero mean curvature, it is 
always possible to flow the surface out slightly to 
have positive mean curvature, allowing inverse mean 
curvature flow to be defined, at least heuristically at 
this point. 

Furthermore, the Hawking mass is still monotone 
under this new modified flow. Notice that when $\Sigma(t)$ 
jumps outwards to $\tilde\Sigma(t)$, 

$$\int_{\tilde\Sigma(t)} H^2 \le \int_{\Sigma(t)} H^2$$

since $\tilde\Sigma(t)$ has zero mean curvature where the two 
surfaces do not touch. Furthermore, 

$$|\tilde\Sigma(t)| = |\Sigma(t)|$$

since (this is a neat argument) $|\tilde\Sigma(t)| \le |\Sigma(t)|$ (since 
$\tilde\Sigma(t)$ is a minimal area enclosure of $\Sigma(t)$) and we 
cannot have $|\tilde\Sigma(t)| < |\Sigma(t)|$, since otherwise $\Sigma(t)$ would have 
jumped outwards at some earlier time. This is only a 
heuristic argument, but we can then see that the 
Hawking mass is nondecreasing during a jump by 
the above two equations. 

This new flow can be rigorously defined, always 
exists, and the Hawking mass is monotone. Huisken 
and Ilmanen define $\Sigma(t)$ to be the level sets of a 
scalar valued function $u(x)$ defined on $(M^3, g)$ such 
that $u(x) = 0$ on the original surface $\Sigma_0$ and satisfies 

$$\operatorname{div}\left(\frac{\nabla u}{|\nabla u|}\right) = |\nabla u|$$    [6]

in an appropriate weak sense. Since the left-hand 
side of the above equation is the mean curvature of 
the level sets of $u(x)$ and the right-hand side is the 
reciprocal of the flow rate, the above equation 
implies inverse mean curvature flow for the level sets 
of $u(x)$ when $|\nabla u(x)| \ne 0$. 

Huisken and Ilmanen use an energy minimization 
principle to define weak solutions to eqn [6]. 
Equation [6] is said to be weakly satisfied in $\Omega$ by 
the locally Lipschitz function $u$ if for all locally 
Lipschitz $v$ with $\{v \ne u\} \subset\subset \Omega$, 

$$J_u(u) \le J_u(v)$$

where 

$$J_u(v) := \int_\Omega |\nabla v| + v\,|\nabla u|$$

It can then be seen that the Euler-Lagrange equation 
of the above energy functional yields eqn [6]. 

In order to prove that a solution $u$ exists to the above 
two equations, Huisken and Ilmanen regularize the 
degenerate elliptic equation [6] to the elliptic equation 

$$\operatorname{div}\left(\frac{\nabla u}{\sqrt{|\nabla u|^2 + \epsilon^2}}\right) = \sqrt{|\nabla u|^2 + \epsilon^2}$$

Solutions to the above equation are then shown to 
exist using the existence of a subsolution, and then 
taking the limit as $\epsilon$ goes to zero yields a weak 
solution to eqn [6]. There are many details which we 
are skipping here, but these are the main ideas. 

As it turns out, weak solutions $u(x)$ to eqn [6] 
often have flat regions where $u(x)$ equals a 
constant. Hence, the level sets $\Sigma(t)$ of $u(x)$ will be 
discontinuous in $t$ in this case, which corresponds 
to the “jumping out” phenomenon referred to at 
the beginning of this section. 

We also note that since the Hawking mass of the 
level sets of u(x) is monotone, this inverse mean 
curvature flow technique not only proves the 
Riemannian Penrose inequality, but also gives a 
new proof of the positive mass theorem in dimen- 
sion 3. This is seen by letting the initial surface be a 
very small, round sphere (which will have approxi- 
mately zero Hawking mass) and then flowing by 
inverse mean curvature, thereby proving $m \ge 0$. 
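
The claim that a round sphere has essentially zero Hawking mass can be illustrated with a short computation. This is a sketch under stated assumptions, not the article's own calculation: it takes the Hawking mass to be the standard expression $\sqrt{|\Sigma|/16\pi}\,\bigl(1 - \tfrac{1}{16\pi}\int_\Sigma H^2\bigr)$, uses the isotropic Schwarzschild metric $(1 + m/(2r))^4\,\delta$ as a test case, and the function names are hypothetical.

```python
import math

def hawking_mass(area, int_H2):
    """Hawking mass from the area and the integral of H^2 (standard formula, assumed)."""
    return math.sqrt(area / (16.0 * math.pi)) * (1.0 - int_H2 / (16.0 * math.pi))

def flat_sphere(r):
    """(area, integral of H^2) for a round sphere of radius r in flat R^3, where H = 2/r."""
    area = 4.0 * math.pi * r**2
    H = 2.0 / r
    return area, H**2 * area

def schwarzschild_sphere(r, m):
    """(area, integral of H^2) for the coordinate sphere |x| = r in (1 + m/(2r))^4 * delta.

    The mean curvature follows from the conformal change formula
    H = u^-2 * (H_flat + 4 * u'/u) with u = 1 + m/(2r).
    """
    u = 1.0 + m / (2.0 * r)
    area = 4.0 * math.pi * r**2 * u**4
    H = 2.0 * (r - m / 2.0) / (r**2 * u**3)
    return area, H**2 * area

m = 2.0
mass_flat = hawking_mass(*flat_sphere(0.01))
masses_schw = [hawking_mass(*schwarzschild_sphere(r, m)) for r in (1.0, 3.0, 50.0)]
```

In flat space the value vanishes for every radius, and on the coordinate spheres of the Schwarzschild test metric it equals $m$ for every radius, consistent with monotonicity holding with equality there.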

The Huisken and Ilmanen inverse mean curvature 
flow also seems ideally suited for proving Penrose 
inequalities for 3-manifolds which have $R \ge -6$ and 
which are asymptotically hyperbolic. This situation 
occurs if $(M^3, g)$ is chosen to be a constant mean 
curvature slice of the spacetime or if the spacetime is 
defined to solve the Einstein equation with nonzero 
cosmological constant. In these cases, there exists a 
modified Hawking mass which is monotone under 
inverse mean curvature flow, namely the usual 
Hawking mass plus $4(|\Sigma|/16\pi)^{3/2}$. However, because 
the monotonicity of the Hawking mass relies on the 
Gauss-Bonnet theorem, these arguments do not work 
in higher dimensions, at least so far. Also, because 
of the need for eqn [4], inverse mean curvature 
flow only proves the Riemannian Penrose inequality 
for a single black hole. In the next section, we 
present a technique which proves the Riemannian 
Penrose inequality for any number of black holes, 
and which can likely be generalized to higher 
dimensions. 


The Conformal Flow of Metrics 


Given any initial Riemannian manifold $(M^3, g_0)$ 
which has non-negative scalar curvature and which 
is harmonically flat at infinity, we will define a 
continuous, one-parameter family of metrics $(M^3, g_t)$, 
$0 \le t < \infty$. This family of metrics will converge to a 
three-dimensional Schwarzschild metric and will have 
other special properties which will allow us to prove 
the Riemannian Penrose inequality for the original 
metric $(M^3, g_0)$. 

In particular, let $\Sigma_0$ be the outermost minimal 
surface of $(M^3, g_0)$ with area $A_0$. Then, we will also 
define a family of surfaces $\Sigma(t)$ with $\Sigma(0) = \Sigma_0$ such 
that $\Sigma(t)$ is minimal in $(M^3, g_t)$. This is natural since 
as the metric $g_t$ changes, we expect that the location 
of the horizon $\Sigma(t)$ will also change. Then, the 
interesting quantities to keep track of in this flow are 
$A(t)$, the total area of the horizon $\Sigma(t)$ in $(M^3, g_t)$, 
and $m(t)$, the total mass of $(M^3, g_t)$ in the chosen end. 


In addition to all of the metrics $g_t$ having 
non-negative scalar curvature, we will also have the very 
nice properties that 

$$A'(t) = 0, \qquad m'(t) \le 0$$

for all $t \ge 0$. Then, since $(M^3, g_t)$ converges to a 
Schwarzschild metric (in an appropriate sense) 
which gives equality in the Riemannian Penrose 
inequality as described in the introduction, 

$$m(0) \ge m(\infty) = \sqrt{\frac{A(\infty)}{16\pi}} = \sqrt{\frac{A(0)}{16\pi}}$$    [7]

which proves the Riemannian Penrose inequality for 
the original metric $(M^3, g_0)$. The hard part, then, is 
to find a flow of metrics which preserves 
non-negative scalar curvature and the area of the 
horizon, decreases total mass, and converges to a 
Schwarzschild metric as $t$ goes to infinity. 


The Definition of the Flow 


In fact, the metrics $g_t$ will all be conformal to $g_0$. 
This conformal flow of metrics can be thought of as 
the solution to a first-order ODE in $t$ defined by 
eqns [8]-[11]. Let 

$$g_t = u_t(x)^4 g_0$$    [8]

and $u_0(x) \equiv 1$. Given the metric $g_t$, define 

$$\Sigma(t) = \text{the outermost minimal area enclosure of } \Sigma_0 \text{ in } (M^3, g_t)$$    [9]

where $\Sigma_0$ is the original outer minimizing horizon in 
$(M^3, g_0)$. In the cases in which we are interested, $\Sigma(t)$ 
will not touch $\Sigma_0$, from which it follows that $\Sigma(t)$ is 
actually a strictly outer minimizing horizon of $(M^3, g_t)$. 
Then given the horizon $\Sigma(t)$, define $v_t(x)$ such that 

$$\begin{cases} \Delta_{g_0} v_t(x) \equiv 0 & \text{outside } \Sigma(t) \\ v_t(x) = 0 & \text{on } \Sigma(t) \\ \lim_{x \to \infty} v_t(x) = -e^{-t} \end{cases}$$    [10]

and $v_t(x) \equiv 0$ inside $\Sigma(t)$. Finally, given $v_t(x)$, define 

$$u_t(x) = 1 + \int_0^t v_s(x)\, ds$$    [11]

so that $u_t(x)$ is continuous in $t$ and has $u_0(x) \equiv 1$. 
Note that eqn [11] implies that the first-order rate 
of change of $u_t(x)$ is given by $v_t(x)$. Hence, the 
first-order rate of change of $g_t$ is a function of itself, $g_0$, 
and $v_t(x)$, which is a function of $g_0$, $t$, and $\Sigma(t)$, which 
is in turn a function of $g_t$ and $\Sigma_0$. Thus, the first-order 
rate of change of $g_t$ is a function of $t$, $g_t$, $g_0$, and $\Sigma_0$. 


Theorem 2 Taken together, eqns [8]-[11] define a 
first-order ODE in $t$ for $u_t(x)$ which has a solution 
which is Lipschitz in the $t$ variable, $C^1$ in the $x$ 
variable everywhere, and smooth in the $x$ variable 
outside $\Sigma(t)$. Furthermore, $\Sigma(t)$ is a smooth, strictly 
outer minimizing horizon in $(M^3, g_t)$ for all $t \ge 0$, 
and $\Sigma(t_2)$ encloses but does not touch $\Sigma(t_1)$ for all 
$t_2 > t_1 \ge 0$. 


Since $v_t(x)$ is a superharmonic function in $(M^3, g_0)$ 
(harmonic everywhere except on $\Sigma(t)$, where it is 
weakly superharmonic), it follows that $u_t(x)$ is 
superharmonic as well. Thus, from eqn [11] we see that 
$\lim_{x \to \infty} u_t(x) = e^{-t}$ and consequently that $u_t(x) > 0$ 
for all $t$ by the maximum principle. Then, since 

$$R(g_t) = u_t(x)^{-5}\left(-8\Delta_{g_0} + R(g_0)\right) u_t(x)$$    [12]

it follows that $(M^3, g_t)$ is an asymptotically flat 
manifold with non-negative scalar curvature. 

Even so, it still may not seem like $g_t$ is particularly 
naturally defined since the rate of change of $g_t$ appears 
to depend on $t$ and the original metric $g_0$ in eqn [10]. 
We would prefer a flow where the rate of change of $g_t$ 
can be defined purely as a function of $g_t$ (and $\Sigma_0$ 
perhaps), and interestingly enough this actually does 
turn out to be the case! The present author has proved 
this very important fact and defined a new equivalence 
class of metrics called the harmonic conformal class. 
Then, once we decide to find a flow of metrics which 
stays inside the harmonic conformal class of the 
original metric (outside the horizon) and keeps the 
area of the horizon $\Sigma(t)$ constant, we are basically 
forced to choose the particular conformal flow of 
metrics defined above. 


Theorem 3 The function $A(t)$ is constant in $t$ and 
$m(t)$ is nonincreasing in $t$, for all $t \ge 0$. 


The fact that $A'(t) = 0$ follows from the fact that 
to first order the metric is not changing on $\Sigma(t)$ 
(since $v_t(x) = 0$ there) and from the fact that to first 
order the area of $\Sigma(t)$ does not change as it moves 
outward since $\Sigma(t)$ is a critical point for area in 
$(M^3, g_t)$. Hence, the interesting part of Theorem 3 is 
proving that $m'(t) \le 0$. Curiously, this follows from 
a nice trick using the Riemannian positive mass 
theorem, which we describe later. 

Another important aspect of this conformal flow of 
the metric is that outside the horizon $\Sigma(t)$, the manifold 
$(M^3, g_t)$ becomes more and more spherically 
symmetric and “approaches” a Schwarzschild manifold 
$(\mathbb{R}^3 \setminus \{0\}, s)$ in the limit as $t$ goes to $\infty$. More precisely, 


Theorem 4 For sufficiently large $t$, there exists a 
diffeomorphism $\phi_t$ between $(M^3, g_t)$ outside the 
horizon $\Sigma(t)$ and a fixed Schwarzschild manifold 
$(\mathbb{R}^3 \setminus \{0\}, s)$ outside its horizon. Furthermore, for all 
$\epsilon > 0$, there exists a $T$ such that for all $t > T$, the 
metrics $g_t$ and $\phi_t^*(s)$ (when determining the lengths 
of unit vectors of $(M^3, g_t)$) are within $\epsilon$ of each other 
and the total masses of the two manifolds are within 
$\epsilon$ of each other. Hence, 

$$\lim_{t \to \infty} \frac{m(t)}{\sqrt{A(t)}} = \sqrt{\frac{1}{16\pi}}$$


Theorem 4 is not that surprising, although a 
careful proof is reasonably long. However, if one is 
willing to believe that the flow of metrics converges 
to a spherically symmetric metric outside the 
horizon, then Theorem 4 follows from two facts. 
The first fact is that the scalar curvature of $(M^3, g_t)$ 
eventually becomes identically zero outside the 
horizon $\Sigma(t)$ (assuming $(M^3, g_0)$ is harmonically 
flat). This follows from the facts that $\Sigma(t)$ encloses 
any compact set in a finite amount of time, that 
harmonically flat manifolds have zero scalar 
curvature outside a compact set, that $u_t(x)$ is harmonic 
outside $\Sigma(t)$, and eqn [12]. The second fact is that 
the Schwarzschild metrics are the only complete, 
spherically symmetric 3-manifolds with zero scalar 
curvature (except for the flat metric on $\mathbb{R}^3$). 

The Riemannian Penrose inequality, inequality 
[3], then follows from eqn [7] using Theorems 2-4, 
for harmonically flat manifolds. Since asymptoti- 
cally flat manifolds can be approximated arbitrarily 
well by harmonically flat manifolds while changing 
the relevant quantities arbitrarily little, the asymp- 
totically flat case also follows. Finally, the case of 
equality of the Penrose inequality follows from a 
more careful analysis of these same arguments. 




Qualitative Discussion 


Figures 3 and 4 are meant to help illustrate some of the 
properties of the conformal flow of the metric. Figure 3 
is the original metric which has a strictly outer 
minimizing horizon $\Sigma_0$. As $t$ increases, $\Sigma(t)$ moves 
outwards, but never inwards. In Figure 4, we can 
observe one of the consequences of the fact that 
$A(t) = A_0$ is constant in $t$. Since the metric is not 
changing inside $\Sigma(t)$, all of the horizons $\Sigma(s)$, $0 \le s \le t$, 
have area $A_0$ in $(M^3, g_t)$. Hence, inside $\Sigma(t)$, the 
manifold $(M^3, g_t)$ becomes cylinder-like in the sense 
that it is laminated (i.e., foliated but with some gaps 
allowed) by all of the previous horizons which all have 
the same area $A_0$ with respect to the metric $g_t$. 


Figure 3 Original metric $(M^3, g_0)$ having a strictly outer 
minimizing horizon $\Sigma_0$. 


Figure 4 Metric $(M^3, g_t)$ after time $t$, with horizon $\Sigma(t)$. 

Now let us suppose that the original horizon $\Sigma_0$ 
of $(M^3, g_0)$ had two components, for example. Then 
each of the components of the horizon will move 
outwards as $t$ increases, and at some point before 
they touch they will suddenly jump outwards to 
form a horizon with a single component enclosing 
the previous horizon with two components. Even 
horizons with only one component will sometimes 
jump outwards, but no more than a countable 
number of times. It is interesting that this phenom- 
enon of surfaces jumping is also found in the 
Huisken-Ilmanen approach to the Penrose conjec- 
ture using their generalized 1/H flow. 


Proof that $m'(t) \le 0$ 


The most surprising aspect of the flow defined 
earlier is that $m'(t) \le 0$. As mentioned in that 
section, this important fact follows from a nice 
trick using the Riemannian positive mass theorem. 

The first step is to realize that while the rate of 
change of $g_t$ appears to depend on $t$ and $g_0$, this is in 
fact an illusion. As described in detail by Bray, the 
rate of change of $g_t$ can be described purely in terms 
of $g_t$ (and $\Sigma_0$). It is also true that the rate of change 
of $g_t$ depends only on $g_t$ and $\Sigma(t)$. Hence, there is no 
special value of $t$, so proving $m'(t) \le 0$ is equivalent 
to proving $m'(0) \le 0$. Thus, without loss of 
generality, we take $t = 0$ for convenience. 

Now expand the harmonic function $v_0(x)$, defined 
in eqn [10], using spherical harmonics at infinity, to get 

$$v_0(x) = -1 + \frac{c}{|x|} + O\!\left(\frac{1}{|x|^2}\right)$$    [13]

for some constant $c$. Since the rate of change of the 
metric $g_t$ at $t = 0$ is given by $v_0(x)$ and since the total 
mass $m(t)$ depends on the $1/r$ rate at which the 
metric $g_t$ becomes flat at infinity (see eqn [2]), it is 
not surprising that direct calculation gives us that 

$$m'(0) = 2(c - m(0))$$

Hence, to show that $m'(0) \le 0$, we need to show 
that 

$$c \le m(0)$$    [14]


In fact, counterexamples to eqn [14] can be found 
if we remove either of the requirements that $\Sigma(0)$ 
(which is used in the definition of $v_0(x)$) be a 
minimal surface or that $(M^3, g_0)$ have non-negative 
scalar curvature. Hence, we quickly see that eqn 
[14] is a fairly deep conjecture which says something 
quite interesting about manifolds with non-negative 
scalar curvature. Well, the Riemannian positive 
mass theorem is also a deep statement which says 
something quite interesting about manifolds with 
non-negative scalar curvature. Hence, it is natural to 
try to use the Riemannian positive mass theorem to 
prove eqn [14]. 

Thus, we want to create a manifold whose total 
mass depends on $c$ from eqn [13]. The idea is to use 
a reflection trick similar to one used by Bunting and 
Masood-ul-Alam (1987) for another purpose. First, 
remove the region of $M^3$ inside $\Sigma(0)$ and then reflect 
the remainder of $(M^3, g_0)$ through $\Sigma(0)$. Define the 
resulting Riemannian manifold to be $(\bar M^3, \bar g_0)$, which 
has two asymptotically flat ends since $(M^3, g_0)$ has 
exactly one asymptotically flat end not contained by 
$\Sigma(0)$. Note that $(\bar M^3, \bar g_0)$ has non-negative scalar 
curvature everywhere except on $\Sigma(0)$ where the 
metric has corners. In fact, the fact that $\Sigma(0)$ has 
zero mean curvature (since it is a minimal surface) 
implies that $(\bar M^3, \bar g_0)$ has “distributional” 
non-negative scalar curvature everywhere, even on $\Sigma(0)$. 
This notion is made rigorous by Bray. Thus, we have 
used the fact that $\Sigma(0)$ is minimal in a critical way. 

Recall from eqn [10] that $v_0(x)$ was defined to be 
the harmonic function equal to zero on $\Sigma(0)$ which 
goes to $-1$ at infinity. We want to reflect $v_0(x)$ to be 
defined on all of $(\bar M^3, \bar g_0)$. The trick here is to define 
$v_0(x)$ on $(\bar M^3, \bar g_0)$ to be the harmonic function which 
goes to $-1$ at infinity in the original end and goes to 
$1$ at infinity in the reflected end. By symmetry, $v_0(x)$ 
equals $0$ on $\Sigma(0)$ and so agrees with its original 
definition on $(M^3, g_0)$. 

The next step is to compactify one end of $(\bar M^3, \bar g_0)$. 
By the maximum principle, we know that $v_0(x) > -1$ 
and $c > 0$, so the new Riemannian manifold 
$(\bar M^3, (v_0(x) + 1)^4 \bar g_0)$ does the job quite nicely and 
compactifies the original end to a point. In fact, the 
compactified point at infinity and the metric there 
can be filled in smoothly (using the fact that $(M^3, g_0)$ is 
harmonically flat). It then follows from eqn [12] that 
this new compactified manifold has non-negative 
scalar curvature since $v_0(x) + 1$ is harmonic. 

The last step is simply to apply the Riemannian 
positive mass theorem to $(\bar M^3, (v_0(x) + 1)^4 \bar g_0)$. It is 
not surprising that the total mass $\tilde m(0)$ of this 
manifold involves $c$, but it is quite lucky that direct 
calculation yields 

$$\tilde m(0) = -4(c - m(0))$$

which must be non-negative by the Riemannian positive 
mass theorem. Thus, we have that 

$$m'(0) = 2(c - m(0)) = -\tfrac{1}{2}\,\tilde m(0) \le 0$$


Open Questions and Applications 


Now that the Riemannian Penrose conjecture has been 
proved, what are the next interesting directions? What 
applications can be found? Is this subject only of 
physical interest, or are there possibly broader 
applications to other problems in mathematics? 

Clearly, the most natural open problem is to find a 
way to prove the general Penrose inequality in which 
$M^3$ is allowed to have any second fundamental form in 
the spacetime. There is good reason to think that this 
may follow from the Riemannian Penrose inequality, 
although this is a bit delicate. On the other hand, the 
general positive mass theorem followed from the 
Riemannian positive mass theorem as was originally 
shown by Schoen and Yau using an idea due to Jang. 
For physicists, this problem is definitely a top priority 
since most spacetimes do not even admit zero-second 
fundamental form spacelike slices. 

Another interesting question is to ask these same 
questions in higher dimensions. The author is currently 
working on a paper to prove the Riemannian Penrose 
inequality in dimensions less than 8. Dimensions 8 and higher 
are harder because of the surprising fact that minimal 
hypersurfaces (and hence apparent horizons of black 
holes) can have codimension 7 singularities (points 
where the hypersurface is not smooth). This curious 
technicality is also the reason that the positive mass 
conjecture is still open in dimensions 8 and higher for 
manifolds which are not spin. 

Naturally, it is harder to tell what the applications 
of these techniques might be to other problems, but 
already there have been some. One application is to 
the famous Yamabe problem: given a compact 
3-manifold $M^3$, define $E(g) = \int_{M^3} R_g \, dV_g$, where $g$ is 
scaled so that the total volume of $(M^3, g)$ is 1, $R_g$ is 
the scalar curvature at each point, and $dV_g$ is the 
volume form. An idea due to Yamabe was to try to 
construct canonical metrics on $M^3$ by finding critical 
points of this energy functional on the space of metrics. 
Define $C(g)$ to be the infimum of $E(\bar g)$ over all metrics $\bar g$ 
conformal to $g$. Then the (topological) Yamabe 
invariant of $M^3$, denoted here as $Y(M^3)$, is defined to 
be the supremum of $C(g)$ over all metrics $g$. 
$Y(S^3) = 6 \cdot (2\pi^2)^{2/3} \equiv Y_1$ is known to be the largest possible value 
for Yamabe invariants of 3-manifolds. It is also known 
that $Y(T^3) = 0$ and $Y(S^2 \times S^1) = Y_1 = Y(S^2 \tilde\times S^1)$, 
where $S^2 \tilde\times S^1$ is the nonorientable $S^2$ bundle over $S^1$. 

The author, working with Andre Neves on a 
problem suggested by Richard Schoen, recently was 
able to compute the Yamabe invariant of $\mathbb{RP}^3$ using 
inverse mean curvature flow techniques and found 
that $Y(\mathbb{RP}^3) = Y_1 / 2^{2/3} \equiv Y_2$. A corollary is 
$Y(\mathbb{RP}^2 \times S^1) = Y_2$ as well. These techniques also yield the 
surprisingly strong result that the only prime 
3-manifolds with Yamabe invariant larger than that of $\mathbb{RP}^3$ are 
$S^3$, $S^2 \times S^1$, and $S^2 \tilde\times S^1$. The Poincaré conjecture for 
3-manifolds with Yamabe invariant greater than that of $\mathbb{RP}^3$ 
is therefore a corollary. Furthermore, the problem of 
classifying 3-manifolds is known to reduce to the 
problem of classifying prime 3-manifolds. The 
Yamabe approach then would be to make a list of 
prime 3-manifolds ordered by $Y$. The first five prime 
3-manifolds on this list are therefore $S^3$, $S^2 \times S^1$, 
$S^2 \tilde\times S^1$, $\mathbb{RP}^3$, and $\mathbb{RP}^2 \times S^1$. 


Introduction 


The aim of these pages is to give a brief, self- 
contained introduction to that part of geometric 
measure theory which is more directly related to the 
calculus of variations, namely the theory of currents 
and its applications to the solution of the Plateau 
problem. (The theory of finite-perimeter sets, which 
is closely related to currents and to the Plateau 
problem, is treated in the article Free Interfaces and 
Free Discontinuities: Variational Problems in the 
Encyclopedia.) 

Named after the Belgian physicist JAF Plateau 
(1801-1883), this problem was originally formulated 
as follows: find the surface of minimal area spanning 
a given curve in space. Nowadays, it is mostly 
understood as the problem of developing a mathematical 
framework where the existence of k-dimensional 
surfaces of minimal volume that span a prescribed 
boundary can be rigorously proved. Indeed, several 
solutions have been proposed in the last century, 
none of which is completely satisfactory. 

One difficulty is that the infimum of the area 
among all smooth surfaces with a certain boundary 
may not be attained. More precisely, it may happen 
that all minimizing sequences (i.e., sequences of 
smooth surfaces whose area approaches the infimum) 
converge to a singular surface. Therefore, one is 
forced to consider a larger class of admissible surfaces 
than just smooth ones (in fact, one might want to do 
this also for modeling reasons - this is indeed the case 
with soap films, soap bubbles, and other capillarity 
problems). But what does it mean that a set “spans” a 
given curve? And what should we mean by the area of a 
set which is not a smooth surface? 

The theory of integral currents developed by 
Federer and Fleming (1960) provides a class of 
generalized (oriented) surfaces with well-defined 
notions of boundary and area (called mass) where 
the existence of minimizers can be proved by direct 
methods. More precisely, this class is large enough 
to have good compactness properties with respect 
to a topology that makes the mass a lower- 
semicontinuous functional. This approach turned 
out to be quite powerful and flexible, and in the 
last decades the theory of currents has found 
applications in several different areas, from dyna- 
mical systems (in particular, Mather theory) to the 
theory of foliations, to optimal transport problems. 


Hausdorff Measures, Dimension, 
and Rectifiability 


The volume of a smooth $d$-dimensional surface in 
$\mathbb{R}^n$ is usually defined using parametrizations by 
subsets of $\mathbb{R}^d$. The notion of Hausdorff measure 
allows one to compute the $d$-dimensional volume using 
coverings instead of parametrizations, and, what is 
more important, applies to all sets in $\mathbb{R}^n$, and makes 
sense even if $d$ is not an integer. Attached to 
Hausdorff measure is the notion of Hausdorff 
dimension. Again, it can be defined for all sets in 
$\mathbb{R}^n$ and is not necessarily an integer. The last 
fundamental notion is rectifiability: $k$-rectifiable 
sets can be roughly understood as the largest class 
of $k$-dimensional sets for which it is still possible to 
define a $k$-dimensional tangent bundle, even if only 
in a very weak sense. They are essential to the 
construction of integral currents. 


Hausdorff Measure 


Let $d \ge 0$ be a non-negative real number. Given a set $E$ in 
$\mathbb{R}^n$, for every $\delta > 0$ we set 

$$\mathcal{H}^d_\delta(E) := \frac{\omega_d}{2^d} \inf \Big\{ \sum_j (\operatorname{diam}(E_j))^d \Big\}$$    [1]

where $\omega_d$ is the $d$-dimensional volume of the unit 
ball in $\mathbb{R}^d$ whenever $d$ is an integer (there is no 
canonical choice for $\omega_d$ when $d$ is not an integer; 
a convenient one is $\omega_d = 2^d$), and the infimum is 
taken over all countable families of sets $\{E_j\}$ that 
cover $E$ and whose diameters satisfy $\operatorname{diam}(E_j) \le \delta$. 
The $d$-dimensional Hausdorff measure of $E$ is 

$$\mathcal{H}^d(E) := \lim_{\delta \to 0} \mathcal{H}^d_\delta(E)$$    [2]

(the limit exists because $\mathcal{H}^d_\delta(E)$ is decreasing in $\delta$). 


Remarks 


(i) $\mathcal{H}^d$ is called $d$-dimensional because of its 
scaling behavior: if $E_\lambda$ is a copy of $E$ scaled 
homothetically by a factor $\lambda$, then 

$$\mathcal{H}^d(E_\lambda) = \lambda^d \mathcal{H}^d(E)$$

Thus, $\mathcal{H}^1$ scales like the length, $\mathcal{H}^2$ scales like the 
area, and so on. 

(ii) The measure $\mathcal{H}^d$ is clearly invariant under 
rigid motions (translations and rotations). This 
implies that $\mathcal{H}^d$ agrees on $\mathbb{R}^d$ with the Lebesgue 
measure up to some constant factor; the 
renormalization constant $\omega_d / 2^d$ in [1] makes this factor 
equal to 1. Thus, $\mathcal{H}^d(E)$ agrees with the usual 
$d$-dimensional volume for every set $E$ in $\mathbb{R}^d$, and 
the area formula shows that the same is true if 
$E$ is (a subset of) a $d$-dimensional surface of class $C^1$ 
in $\mathbb{R}^n$. 

(iii) Besides the Hausdorff measure, there are 
several other, less popular notions of $d$-dimensional 
measure: all of them are invariant under rigid motion, 
scale in the expected way, and agree with $\mathcal{H}^d$ for sets 
contained in $\mathbb{R}^d$ or in a $d$-dimensional surface of class 
$C^1$, and yet they differ for other sets (for further 
details, see Federer (1996, section 2.10)). 

(iv) The definition of $\mathcal{H}^d(E)$ uses only the notion 
of diameter, and therefore makes sense when $E$ is a 
subset of an arbitrary metric space. Note that $\mathcal{H}^d(E)$ 
depends only on the restriction of the metric to $E$, 
and not on the ambient space. 

(v) The measure $\mathcal{H}^d$ is countably additive on 
the $\sigma$-algebra of Borel sets in $\mathbb{R}^n$, but not on all sets; 
to avoid pathological situations, we shall always 
assume that sets and maps are Borel measurable. 


Hausdorff Dimension 


According to intuition, the length of a surface 
should be infinite, while the area of a curve should 
be null. These are indeed particular cases of the 
following implications: 

$$\mathcal{H}^d(E) > 0 \;\Rightarrow\; \mathcal{H}^{d'}(E) = \infty \quad \text{for } d' < d$$
$$\mathcal{H}^d(E) < \infty \;\Rightarrow\; \mathcal{H}^{d'}(E) = 0 \quad \text{for } d' > d$$

Hence, the infimum of all $d$ such that $\mathcal{H}^d(E) = 0$ and 
the supremum of all $d$ such that $\mathcal{H}^d(E) = \infty$ 
coincide. This number is called the Hausdorff dimension 
of $E$, and denoted by $\dim_H(E)$. For surfaces of class 
$C^1$, the notion of Hausdorff dimension agrees with 
the usual one. Examples of sets with nonintegral 
dimension are described in the next subsection. 


Remarks 


(i) Note that $\mathcal{H}^d(E)$ may be $0$ or $\infty$ even for 
$d = \dim_H(E)$. 

(ii) The Hausdorff dimension of a set E is strictly 
related to the metric on E, and not just to the 
topology. Indeed, it is preserved under diffeomorph- 
isms but not under homeomorphisms, and it does 
not always agree with the topological dimension. 
For instance, the Hausdorff dimension of the graph 
of a continuous function $f: \mathbb{R} \to \mathbb{R}$ can be any 
number between 1 and 2 (included). 

(iii) For nonsmooth sets, the Hausdorff dimension 
does not always conform to intuition: for example, 
the dimension of a Cartesian product E x F of 
compact sets does not agree in general with the sum 
of the dimensions of E and F. 

(iv) There are many other notions of dimension 
besides Hausdorff and topological ones. Among 
these, packing dimension and box-counting dimen- 
sion have interesting applications (see Falconer 
(2003, chapters 3 and 4)). 


Self-Similar Fractals 


Interesting examples of sets with nonintegral 
dimension are self-similar fractals. We present here a 
simplified version of a construction due to Hutchinson 
(Falconer 2003, chapter 9). Let $\{\Psi_i\}$ be a finite set of 
similitudes of $\mathbb{R}^n$ with scaling factor $\lambda_i < 1$, and 
assume that there exists a bounded open set $V$ such 
that the sets $V_i := \Psi_i(V)$ are pairwise disjoint and 
contained in $V$. The self-similar fractal associated with 
the system $\{\Psi_i\}$ is the compact set $C$ that satisfies 

$$C = \bigcup_i \Psi_i(C)$$    [3]

The term “self-similar” follows from the fact that $C$ 
can be written as a union of scaled copies of itself. 


522 Geometric Measure Theory 


The existence (and uniqueness) of such a $C$ follows 
from a standard fixed-point argument applied to the 
map $C \mapsto \bigcup_i \Psi_i(C)$. The dimension $d$ of $C$ is the 
unique solution of the equation 

$$\sum_i \lambda_i^d = 1$$    [4]

Formula [4] can be easily justified: if the sets $\Psi_i(C)$ 
are disjoint (and the assumption on $V$ implies that 
this is almost the case) then [3] implies 
$\mathcal{H}^d(C) = \sum_i \mathcal{H}^d(\Psi_i(C)) = \sum_i \lambda_i^d \, \mathcal{H}^d(C)$, 
and therefore $\mathcal{H}^d(C)$ can 
be positive and finite if and only if $d$ satisfies [4]. 
An example of this construction is the usual 
Cantor set in $\mathbb{R}$, which is given by the similitudes 

$$\Psi_1(x) := \tfrac{1}{3} x \quad \text{and} \quad \Psi_2(x) := \tfrac{2}{3} + \tfrac{1}{3} x$$

By [4], its dimension is $d = \log 2 / \log 3$. Other 
examples are described in Figures 1-3. 
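
Equation [4] rarely has a closed-form solution when the scaling factors differ, but it is easy to solve numerically. The following is a minimal sketch, not part of the original text: the function name `similarity_dimension` and the bisection tolerance are illustrative choices. It uses the fact that $d \mapsto \sum_i \lambda_i^d$ is strictly decreasing when every $\lambda_i$ lies in $(0, 1)$.

```python
import math

def similarity_dimension(ratios, tol=1e-12):
    """Solve sum(r**d for r in ratios) == 1 for d by bisection.

    `ratios` are the scaling factors lambda_i, each assumed to lie in (0, 1).
    """
    if not ratios or any(not (0.0 < r < 1.0) for r in ratios):
        raise ValueError("scaling factors must lie strictly between 0 and 1")

    def excess(d):
        return sum(r**d for r in ratios) - 1.0

    lo, hi = 0.0, 1.0
    while excess(hi) > 0.0:        # enlarge the bracket until the sum drops below 1
        hi *= 2.0
    while hi - lo > tol:           # excess is decreasing in d, so bisection applies
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

cantor = similarity_dimension([1/3, 1/3])      # two maps of ratio 1/3
koch = similarity_dimension([1/3] * 4)         # four maps of ratio 1/3
mixed = similarity_dimension([1/2, 1/4])       # unequal ratios, no simple log formula
```

For equal ratios the result reduces to $\log N / (-\log \lambda)$, which recovers $\log 2 / \log 3$ for the Cantor set and $\log 4 / \log 3$ for the von Koch curve.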


Rectifiable Sets 


Given an integer $k = 1, \ldots, n$, we say that a set $E$ in 
$\mathbb{R}^n$ is $k$-rectifiable if it can be covered by a countable 
family of sets $\{S_j\}$ such that $S_0$ is $\mathcal{H}^k$-negligible (i.e., 
$\mathcal{H}^k(S_0) = 0$) and $S_j$ is a $k$-dimensional surface of class 
$C^1$ for $j = 1, 2, \ldots$ Note that $\dim_H(E) \le k$ because 
each $S_j$ has dimension $k$. 

A k-rectifiable set E bears little resemblance to 
smooth surfaces (it can be everywhere dense!), but it 
still admits a suitably weak notion of tangent bundle. 


Figure 1 The maps $\Psi_i$, $i = 1, \ldots, 4$, take the square $V$ into the 
squares $V_i$ at the corners of $V$. The scaling factor is $\lambda$ for all $i$, 
hence $\dim_H(C) = \log 4 / (-\log \lambda)$. Note that $\dim_H(C)$ can be any 
number between 0 and 2, including 1. 


Figure 2 A self-similar fractal with more complicated topology. 
The scaling factor is 1/4 for all twelve similitudes, hence 
$\dim_H(C) = \log 12 / \log 4$. 


Figure 3 The von Koch curve (or snowflake). The scaling 
factor is 1/3 for all four similitudes, hence $\dim_H(C) = \log 4 / \log 3$. 


More precisely, it is possible to associate with every 
$x \in E$ a $k$-dimensional subspace of $\mathbb{R}^n$, denoted by 
$\operatorname{Tan}(E, x)$, so that for every $k$-dimensional surface $S$ 
of class $C^1$ in $\mathbb{R}^n$ there holds 

$$\operatorname{Tan}(E, x) = \operatorname{Tan}(S, x) \quad \text{for } \mathcal{H}^k\text{-a.e. } x \in E \cap S$$    [5]

where $\operatorname{Tan}(S, x)$ is the tangent space to $S$ at $x$ 
according to the usual definition. 

It is not difficult to see that $\operatorname{Tan}(E, x)$ is uniquely 
determined by [5] up to an $\mathcal{H}^k$-negligible amount of 
points $x \in E$, and if $E$ is a surface of class $C^1$, then it 
agrees with the usual tangent space for $\mathcal{H}^k$-almost 
all points of $E$. 


Remarks 


(i) In the original definition of rectifiability, the 
sets $S_j$ with $j > 0$ are Lipschitz images of $\mathbb{R}^k$, that is, 
$S_j := f_j(\mathbb{R}^k)$, where $f_j: \mathbb{R}^k \to \mathbb{R}^n$ is a Lipschitz map. 
It can be shown that this definition is equivalent to 
the one above. 

(ii) The construction of the tangent bundle is 
straightforward: Let $\{S_j\}$ be a covering of $E$ as earlier, 
and set $\operatorname{Tan}(E, x) := \operatorname{Tan}(S_j, x)$, where $j$ is the smallest 
positive integer such that $x \in S_j$. Then [5] is an 
immediate corollary of the following lemma: if $S$ and 
$S'$ are $k$-dimensional surfaces of class $C^1$ in $\mathbb{R}^n$, then 
$\operatorname{Tan}(S, x) = \operatorname{Tan}(S', x)$ for $\mathcal{H}^k$-almost every $x \in S \cap S'$. 

(iii) A set $E$ in $\mathbb{R}^n$ is called purely $k$-unrectifiable 
if it contains no $k$-rectifiable subset with 
positive $k$-dimensional measure, or, equivalently, if 
$\mathcal{H}^k(E \cap S) = 0$ for every $k$-dimensional surface $S$ of 
class $C^1$. For instance, every product $E := E_1 \times E_2$, 
where $E_1$ and $E_2$ are $\mathcal{H}^1$-negligible sets in $\mathbb{R}$, is a 
purely 1-unrectifiable set in $\mathbb{R}^2$ (it suffices to show 
that $\mathcal{H}^1(E \cap S) = 0$ whenever $S$ is the graph of a 
function $f: \mathbb{R} \to \mathbb{R}$ of class $C^1$, and this follows by 
the usual formula for the length of the graph). Note 
that the Hausdorff dimension of such product sets 
can be any number between 0 and 2, hence 
rectifiability is not related to dimension. The 
self-similar fractals described in Figures 1 and 3 are both 
purely 1-unrectifiable. 


Rectifiable Sets with Finite Measure 


If E is a k-rectifiable set with finite (or locally finite) 
k-dimensional measure, then Tan(E,x) can be 
related to the behavior of E close to the point x. 

Let $B(x, r)$ be the open ball in $\mathbb{R}^n$ with center $x$ 
and radius $r$, and let $C(x, T, a)$ be the cone with 
center $x$, axis $T$ (a $k$-dimensional subspace of $\mathbb{R}^n$) 
and amplitude $\alpha = \arcsin a$, that is, 

$$C(x, T, a) := \{ x' \in \mathbb{R}^n : \operatorname{dist}(x' - x, T) \le a\,|x' - x| \}$$


Figure 4 A rectifiable set E close to a point x of approximate 
tangency. The part of E contained in the ball B(x, r) but not in the 
cone C(x, T, a) is not empty, but only small in measure. 


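For $\mathscr{H}^k$-almost every $x \in E$, the measure of
$E \cap B(x,r)$ is asymptotically equivalent, as $r \to 0$, to the
measure of a flat disk of radius $r$, that is,

$$\mathscr{H}^k(E \cap B(x,r)) \sim \omega_k r^k$$

where $\omega_k$ is the volume of the unit ball in $\mathbb{R}^k$.
Moreover, the part of $E$ contained in $B(x,r)$ is
mostly located close to the tangent plane $\mathrm{Tan}(E,x)$,
that is,

$$\mathscr{H}^k\big(E \cap B(x,r) \cap C(x, \mathrm{Tan}(E,x), a)\big) \sim \omega_k r^k \quad \text{for every } a > 0$$

When this condition holds, $\mathrm{Tan}(E,x)$ is called the
approximate tangent space to $E$ at $x$ (see Figure 4).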


The Area Formula 


The area formula allows one to compute the measure
$\mathscr{H}^k(\Phi(E))$ of the image of a set $E$ in $\mathbb{R}^k$ as the
integral over $E$ of a suitably defined Jacobian
determinant of $\Phi$. When $\Phi$ is injective and takes
values in $\mathbb{R}^k$, we recover the usual change of
variable formula for multiple integrals.

We consider first the linear case. If $L$ is a linear
map from $\mathbb{R}^k$ to $\mathbb{R}^m$ with $m \ge k$, the volume ratio
$\rho := \mathscr{H}^k(L(E))/\mathscr{H}^k(E)$ does not depend on $E$, and
agrees with $|\det(PL)|$, where $P$ is any linear
isometry from the image of $L$ into $\mathbb{R}^k$, and $\det(PL)$
is the determinant of the $k \times k$ matrix associated
with $PL$. The volume ratio $\rho$ can be computed using
one of the following identities:

$$\rho = \sqrt{\det(L^* L)} = \sqrt{\sum (\det M)^2} \qquad [6]$$

where $L^*$ is the adjoint of $L$ (thus, $L^* L$ is a linear
map from $\mathbb{R}^k$ into $\mathbb{R}^k$), and the sum in the last term
is taken over all $k \times k$ minors $M$ of the matrix
associated with $L$.
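
The identities in [6] are easy to check numerically. The following is a minimal sketch, assuming NumPy and a randomly generated $5 \times 3$ matrix standing in for $L$; the function names are illustrative and not part of the text. It computes $\rho$ from the Gram determinant, from the sum of squared $k \times k$ minors, and as $|\det(PL)|$, where the isometry $P$ is taken from a reduced QR factorization.

```python
import itertools
import numpy as np

def volume_ratio_gram(L):
    """rho = sqrt(det(L^* L)) for a real m x k matrix L with m >= k."""
    return float(np.sqrt(np.linalg.det(L.T @ L)))

def volume_ratio_minors(L):
    """rho = sqrt(sum of the squared k x k minors of L)."""
    m, k = L.shape
    total = 0.0
    for rows in itertools.combinations(range(m), k):
        total += np.linalg.det(L[list(rows), :]) ** 2
    return float(np.sqrt(total))

def volume_ratio_isometry(L):
    """rho = |det(P L)|, with P = Q^T taken from the reduced QR factorization L = Q R."""
    Q, R = np.linalg.qr(L)          # Q has orthonormal columns spanning the image of L
    return float(abs(np.linalg.det(Q.T @ L)))

rng = np.random.default_rng(0)
L = rng.standard_normal((5, 3))     # illustrative linear map from R^3 to R^5
rho_gram = volume_ratio_gram(L)
rho_minors = volume_ratio_minors(L)
rho_isometry = volume_ratio_isometry(L)
print(rho_gram, rho_minors, rho_isometry)
```

The three values agree up to rounding; the equality of the first two is the Cauchy-Binet formula.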

Let $\Phi : \mathbb{R}^k \to \mathbb{R}^m$ be a map of class $C^1$ with $m \ge k$,
and $E$ a set in $\mathbb{R}^k$. Then

$$\int_{\Phi(E)} \#\big(\Phi^{-1}(y) \cap E\big)\, d\mathscr{H}^k(y) = \int_E J(x)\, d\mathscr{H}^k(x) \qquad [7]$$

where $\#A$ stands for the number of elements of $A$,
and the Jacobian $J$ is

$$J(x) := \sqrt{\det\big(\nabla\Phi(x)^* \, \nabla\Phi(x)\big)} \qquad [8]$$

Note that the left-hand side of [7] is $\mathscr{H}^k(\Phi(E))$ when
$\Phi$ is injective.
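
As an illustration of [7] and [8], the sketch below evaluates the right-hand side of the area formula for the standard parametrization of the unit sphere in $\mathbb{R}^3$ by spherical angles, which is injective up to a negligible set, so the result should approximate $4\pi$. The parametrization, the finite-difference step, and the midpoint quadrature are assumptions made for this example only.

```python
import numpy as np

def sphere(theta, phi):
    """Illustrative map Phi: (0, pi) x (0, 2 pi) -> R^3 onto the unit sphere."""
    return np.array([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)])

def jacobian(Phi, u, v, h=1e-6):
    """J = sqrt(det(DPhi^T DPhi)) as in [8], with DPhi from central differences."""
    du = (Phi(u + h, v) - Phi(u - h, v)) / (2 * h)
    dv = (Phi(u, v + h) - Phi(u, v - h)) / (2 * h)
    D = np.column_stack([du, dv])               # 3 x 2 matrix of partial derivatives
    return np.sqrt(max(np.linalg.det(D.T @ D), 0.0))

def area(Phi, u_max, v_max, n=100):
    """Midpoint-rule approximation of the integral of J over (0, u_max) x (0, v_max)."""
    us = (np.arange(n) + 0.5) * u_max / n
    vs = (np.arange(n) + 0.5) * v_max / n
    total = sum(jacobian(Phi, u, v) for u in us for v in vs)
    return total * (u_max / n) * (v_max / n)

sphere_area = area(sphere, np.pi, 2 * np.pi)
print(sphere_area, 4 * np.pi)
```

Here $J(\theta,\phi) = \sin\theta$, so the quadrature reproduces the classical surface-area integral.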


Remark 

Formula [7] holds even if $E$ is a $k$-rectifiable set in $\mathbb{R}^n$.
In this case, the gradient $\nabla\Phi(x)$ in [8] should be
replaced by the tangential derivative of $\Phi$ at $x$ (viewed
as a linear map from $\mathrm{Tan}(E,x)$ into $\mathbb{R}^m$). No version of
formula [7] is available when $E$ is not rectifiable.


Vectors, Covectors, and Differential 
Forms 


In this section, we review some basic notions of
multilinear algebra. We have chosen a definition of
$k$-vectors and $k$-covectors in $\mathbb{R}^n$, and of the
corresponding exterior products, which is quite convenient
for computations, even though not as satisfactory from
the formal viewpoint. The main drawback is that it
depends on the choice of a standard basis of $\mathbb{R}^n$, and
therefore cannot be used to define forms (and currents)
when the ambient space is a general manifold.


k-Vectors and Exterior Product 


Let $\{e_1,\dots,e_n\}$ be the standard basis of $\mathbb{R}^n$. Given an
integer $k \le n$, $I(n,k)$ is the set of all multi-indices
$i = (i_1,\dots,i_k)$ with $1 \le i_1 < i_2 < \dots < i_k \le n$, and
for every $i \in I(n,k)$ we introduce the expression

$$e_i := e_{i_1} \wedge e_{i_2} \wedge \dots \wedge e_{i_k}$$

A $k$-vector in $\mathbb{R}^n$ is any formal linear combination
$\sum \alpha_i e_i$ with $\alpha_i \in \mathbb{R}$ for every $i \in I(n,k)$. The space of
$k$-vectors is denoted by $\Lambda_k(\mathbb{R}^n)$; in particular,
$\Lambda_1(\mathbb{R}^n) = \mathbb{R}^n$. For reasons of formal convenience,
we set $\Lambda_0(\mathbb{R}^n) := \mathbb{R}$ and $\Lambda_k(\mathbb{R}^n) := \{0\}$ for $k > n$.

We denote by $|\cdot|$ the Euclidean norm on $\Lambda_k(\mathbb{R}^n)$.

The exterior product $v \wedge w \in \Lambda_{k+h}(\mathbb{R}^n)$ is defined
for every $v \in \Lambda_k(\mathbb{R}^n)$ and $w \in \Lambda_h(\mathbb{R}^n)$, and is
completely determined by the following properties:
(1) associativity, (2) linearity in both arguments, and
(3) $e_i \wedge e_j = -e_j \wedge e_i$ for every $i \ne j$ and $e_i \wedge e_i = 0$ for
every $i$.
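
The three properties above translate directly into a small computation on coefficients. The following is a minimal sketch, not a library implementation: a $k$-vector is stored as a dictionary mapping increasing multi-indices to coefficients (a representation chosen here for illustration), and the sign of each product of basis elements is obtained by counting the transpositions needed to sort the concatenated indices.

```python
from itertools import product

def sort_with_sign(indices):
    """Sort a tuple of indices; return (permutation sign, sorted tuple),
    or (0, None) if an index is repeated (then the wedge product vanishes)."""
    idx = list(indices)
    if len(set(idx)) < len(idx):
        return 0, None
    sign = 1
    for i in range(len(idx)):                 # bubble sort, counting swaps
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    return sign, tuple(idx)

def wedge(v, w):
    """Exterior product of multivectors stored as {increasing multi-index: coefficient}."""
    out = {}
    for (i, a), (j, b) in product(v.items(), w.items()):
        sign, key = sort_with_sign(i + j)
        if sign:
            out[key] = out.get(key, 0) + sign * a * b
    return {k: c for k, c in out.items() if c != 0}

def e(*idx):
    """Basis k-vector e_{i_1} ^ ... ^ e_{i_k}, indices given in increasing order."""
    return {tuple(idx): 1}

print(wedge(e(2), e(1)))                                   # minus e_1 ^ e_2
print(wedge({(1,): 1, (2,): 2}, {(1,): 3, (3,): 1}))
```

Bilinearity is built into the double loop, and associativity can be checked on basis elements.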


Simple Vectors and Orientation 


A simple $k$-vector is any $v$ in $\Lambda_k(\mathbb{R}^n)$ that can be
written as a product of 1-vectors, that is,

$$v = v_1 \wedge v_2 \wedge \dots \wedge v_k$$

It can be shown that $v$ is null if and only if the
vectors $\{v_i\}$ are linearly dependent. If $v$ is not null,
then it is uniquely determined by the following
objects: (1) the $k$-dimensional space $M$ spanned
by $\{v_i\}$; (2) the orientation of $M$ associated with the
basis $\{v_i\}$; (3) the Euclidean norm $|v|$. In particular,
$M$ does not depend on the choice of the vectors $v_i$.
Note that $|v|$ is equal to the $k$-dimensional volume of
the parallelogram spanned by $\{v_i\}$.

Hence, the map $v \mapsto M$ is a one-to-one correspondence
between the class of simple $k$-vectors with
norm $|v| = 1$ and the Grassmann manifold of
oriented $k$-dimensional subspaces of $\mathbb{R}^n$.

This remark paves the way to the following definition:
if $S$ is a $k$-dimensional surface of class $C^1$ in $\mathbb{R}^n$, possibly
with boundary, an orientation of $S$ is a continuous map
$\tau_S : S \to \Lambda_k(\mathbb{R}^n)$ such that $\tau_S(x)$ is a simple $k$-vector with
norm 1 that spans $\mathrm{Tan}(S,x)$ for every $x$. With every
orientation of $S$ (if any exists) is canonically associated
the orientation of the boundary $\partial S$ that satisfies

$$\tau_S(x) = \eta(x) \wedge \tau_{\partial S}(x) \quad \text{for every } x \in \partial S \qquad [9]$$


where $\eta(x)$ is the outer normal to $\partial S$ at $x$ (with this choice, Stokes theorem holds without a sign change).


k-Covectors 


The standard basis of the dual of $\mathbb{R}^n$ is
$\{dx_1,\dots,dx_n\}$, where $dx_i : \mathbb{R}^n \to \mathbb{R}$ is the linear
functional that takes every $x = (x_1,\dots,x_n)$ into the
$i$th component $x_i$. For every $i \in I(n,k)$ we set

$$dx_i := dx_{i_1} \wedge dx_{i_2} \wedge \dots \wedge dx_{i_k}$$

and the space $\Lambda^k(\mathbb{R}^n)$ of $k$-covectors consists of all
formal linear combinations $\sum \alpha_i\, dx_i$. The exterior
product of covectors is defined as that for vectors.
The space $\Lambda^k(\mathbb{R}^n)$ is dual to $\Lambda_k(\mathbb{R}^n)$ via the duality
pairing $\langle\,\cdot\,;\,\cdot\,\rangle$ defined by the relations $\langle dx_i ; e_j \rangle := \delta_{ij}$
(that is, 1 if $i = j$ and 0 otherwise).


Differential Forms and Stokes Theorem 


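A differential form of order $k$ on $\mathbb{R}^n$ is a map
$\omega : \mathbb{R}^n \to \Lambda^k(\mathbb{R}^n)$. Using the canonical basis of
$\Lambda^k(\mathbb{R}^n)$, we can write $\omega$ as

$$\omega(x) = \sum_{i \in I(n,k)} \omega_i(x)\, dx_i$$

where the coordinates $\omega_i$ are real functions on $\mathbb{R}^n$.
The exterior derivative of a $k$-form $\omega$ of class $C^1$ is
the $(k+1)$-form

$$d\omega(x) := \sum_{i \in I(n,k)} d\omega_i(x) \wedge dx_i$$

where, for every scalar function $f$, $df$ is the 1-form

$$df(x) := \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(x)\, dx_i$$

If $S$ is a $k$-dimensional oriented surface, the
integral of a $k$-form $\omega$ on $S$ is naturally defined by

$$\int_S \omega := \int_S \langle \omega(x) ; \tau_S(x) \rangle \, d\mathscr{H}^k(x)$$

Stokes theorem states that for every $(k-1)$-form $\omega$
of class $C^1$ there holds

$$\int_{\partial S} \omega = \int_S d\omega \qquad [10]$$

provided that $\partial S$ is endowed with the orientation $\tau_{\partial S}$
that satisfies [9].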
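
A quick numerical check of [10] in the simplest nontrivial case ($k = 2$, $n = 2$) may be helpful. The sketch below is an illustration under assumptions of our own choosing: $S$ is the unit disk oriented by $e_1 \wedge e_2$, its boundary circle is traversed counterclockwise, and the 1-form is $\omega = -y^3\,dx + x^3\,dy$, so that $d\omega = 3(x^2 + y^2)\,dx \wedge dy$. Both sides should approximate $3\pi/2$.

```python
import numpy as np

def P(x, y):
    """Coefficient of dx in the illustrative 1-form omega = P dx + Q dy."""
    return -y**3

def Q(x, y):
    """Coefficient of dy in omega."""
    return x**3

def d_omega(x, y):
    """Coefficient of dx ^ dy in d(omega): dQ/dx - dP/dy."""
    return 3 * x**2 + 3 * y**2

def boundary_integral(n=2000):
    """Integral of omega over the unit circle, traversed counterclockwise."""
    t = (np.arange(n) + 0.5) * 2 * np.pi / n
    x, y = np.cos(t), np.sin(t)
    dx, dy = -np.sin(t), np.cos(t)            # velocity of the parametrization
    return float(np.sum(P(x, y) * dx + Q(x, y) * dy) * 2 * np.pi / n)

def surface_integral(n=400):
    """Integral of d(omega) over the unit disk, midpoint rule in polar coordinates."""
    r = (np.arange(n) + 0.5) / n
    t = (np.arange(n) + 0.5) * 2 * np.pi / n
    R, T = np.meshgrid(r, t, indexing="ij")
    values = d_omega(R * np.cos(T), R * np.sin(T)) * R    # area element r dr dt
    return float(np.sum(values) * (1.0 / n) * (2 * np.pi / n))

lhs = boundary_integral()
rhs = surface_integral()
print(lhs, rhs, 1.5 * np.pi)
```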


Currents 


The definition of k-dimensional currents closely 
resembles that of distributions: they are the dual of 
smooth k-forms with compact support. Since every 
oriented k-dimensional surface defines by integration 
a linear functional on forms, currents can be regarded 
as generalized oriented surfaces. As every distribution 
admits a derivative, so every current admits a 
boundary. Indeed, many other basic notions of 
homology theory can be naturally extended to 
currents — this was actually one of the motivations 
behind the introduction of currents, due to de Rham. 
For the applications to variational problems, 
smaller classes of currents are usually considered; 
the most relevant to the Plateau problem is that of 
integral currents. Note that the definitions of the 
spaces of normal, rectifiable, and integral currents 
and the symbols used to denote them vary, some- 
times more than slightly, depending on the author. 


Currents, Boundary, and Mass 


Let $n, k$ be integers with $n \ge k$. The space of
$k$-dimensional currents on $\mathbb{R}^n$, denoted by $\mathscr{D}_k(\mathbb{R}^n)$,
is the dual of the space $\mathscr{D}^k(\mathbb{R}^n)$ of smooth $k$-forms
with compact support in $\mathbb{R}^n$. For $k \ge 1$, the
boundary of a $k$-current $T$ is the $(k-1)$-current $\partial T$
defined by

$$\langle \partial T ; \omega \rangle := \langle T ; d\omega \rangle \quad \text{for every } \omega \in \mathscr{D}^{k-1}(\mathbb{R}^n) \qquad [11]$$

while the boundary of a 0-current is set equal to 0.
The mass of $T$ is the number

$$M(T) := \sup\{ \langle T ; \omega \rangle : \omega \in \mathscr{D}^k(\mathbb{R}^n),\ |\omega| \le 1 \} \qquad [12]$$

Fundamental examples of $k$-currents are oriented
$k$-dimensional surfaces: with each oriented surface
$S$ of class $C^1$ is canonically associated the current
$\langle T ; \omega \rangle := \int_S \omega$ (in fact, $S$ is completely determined
by the action on forms, i.e., by the associated
current). By Stokes theorem, the boundary of $T$ is
the current associated with the boundary of $S$; thus,
the notion of boundary for currents is compatible
with the classical one for oriented surfaces.
A simple computation shows that $M(T) = \mathscr{H}^k(S)$;
therefore, the mass provides a natural extension
of the notion of $k$-dimensional volume to
$k$-currents.


Remarks 


(i) Not all $k$-currents look like $k$-dimensional
surfaces. For example, every $k$-vectorfield $v : \mathbb{R}^n \to \Lambda_k(\mathbb{R}^n)$
defines by duality the $k$-current

$$\langle T ; \omega \rangle := \int \langle \omega(x) ; v(x) \rangle \, d\mathscr{L}^n(x)$$

The mass of $T$ is $\int |v| \, d\mathscr{L}^n$, and the boundary is
represented by a similar integral formula involving
the partial derivatives of $v$ (in particular, for
1-vectorfields, the boundary is the 0-current
associated with the divergence of $v$). Note that the
dimension of such $T$ is $k$ because $k$-vectorfields act
on $k$-forms, and there is no relation with the
dimension of the support of $T$, which is $n$.

(ii) To be precise, $\mathscr{D}^k(\mathbb{R}^n)$ is a locally convex
topological vector space, and $\mathscr{D}_k(\mathbb{R}^n)$ is its
topological dual. As such, $\mathscr{D}_k(\mathbb{R}^n)$ is endowed with a dual
(or weak*) topology. We say that a sequence of
$k$-currents $(T_j)$ converges to $T$ if it converges in the
dual topology, that is,

$$\langle T_j ; \omega \rangle \to \langle T ; \omega \rangle \quad \text{for every } \omega \in \mathscr{D}^k(\mathbb{R}^n) \qquad [13]$$

Recalling the definition of mass, it is easy to show
that it is lower-semicontinuous with respect to the dual
topology, and in particular

$$\liminf_{j} M(T_j) \ge M(T) \qquad [14]$$


Currents with Finite Mass 


By definition, a $k$-current $T$ with finite mass is a
linear functional on $k$-forms which is bounded with
respect to the supremum norm, and by the Riesz
theorem it can be represented as a bounded measure
with values in $\Lambda_k(\mathbb{R}^n)$. In other words, there exist a
finite positive measure $\mu$ on $\mathbb{R}^n$ and a density
function $\tau : \mathbb{R}^n \to \Lambda_k(\mathbb{R}^n)$ such that $|\tau(x)| = 1$ for
every $x$ and

$$\langle T ; \omega \rangle = \int \langle \omega(x) ; \tau(x) \rangle \, d\mu(x)$$

The fact that currents are the dual of a separable
space yields the following compactness result: a
sequence of $k$-currents $(T_j)$ with uniformly bounded
masses $M(T_j)$ admits a subsequence that converges
to a current with finite mass.


Normal Currents 


A $k$-current $T$ is called normal if both $T$ and $\partial T$
have finite mass. The compactness result stated in
the previous paragraph implies the following
compactness theorem for normal currents: a sequence of
normal currents $(T_j)$ with $M(T_j)$ and $M(\partial T_j)$
uniformly bounded admits a subsequence that
converges to a normal current.


Rectifiable Currents 


A $k$-current $T$ is called rectifiable if it can be
represented as

$$\langle T ; \omega \rangle := \int_E \langle \omega(x) ; \tau(x) \rangle \, \theta(x) \, d\mathscr{H}^k(x)$$

where $E$ is a $k$-rectifiable set, $\tau$ is an orientation of
$E$ (that is, $\tau(x)$ is a simple unit $k$-vector that spans
$\mathrm{Tan}(E,x)$ for $\mathscr{H}^k$-almost every $x \in E$), and $\theta$ is a real
function, called multiplicity, such that $\int_E |\theta| \, d\mathscr{H}^k$ is finite.
Such $T$ is denoted by $T = [E, \tau, \theta]$. In
particular, a rectifiable 0-current can be written as
$\langle T ; \omega \rangle = \sum_j \theta_j\, \omega(x_j)$, where $E = \{x_j\}$ is a countable set in
$\mathbb{R}^n$ and $\{\theta_j\}$ is a sequence of real numbers with
$\sum_j |\theta_j| < +\infty$.


Integral Currents 


If $T$ is a rectifiable current and the multiplicity $\theta$
takes integral values, $T$ is called an integer
multiplicity rectifiable current. If both $T$ and $\partial T$ are
integer multiplicity rectifiable currents, then $T$ is an
integral current.

The first nontrivial result is the boundary
rectifiability theorem: if $T$ is an integer multiplicity
rectifiable current and $\partial T$ has finite mass, then $\partial T$
is an integer multiplicity rectifiable current, too, and
therefore $T$ is an integral current.

The second fundamental result is the compactness
theorem for integral currents: a sequence of integral
currents $(T_j)$ with $M(T_j)$ and $M(\partial T_j)$ uniformly
bounded admits a subsequence that converges to
an integral current.


Remarks 


(i) The point of the compactness theorem for
integral currents is not the existence of a converging
subsequence, which is already established by the
compactness theorem for normal currents, but the
fact that the limit is an integral current. In fact, this
result is often referred to as a "closure theorem"
rather than a "compactness theorem."
Figure 5 $T$ is the normal 1-current on $\mathbb{R}^2$ associated with the
vectorfield equal to the unit vector $e$ on the unit square $Q$, and
equal to 0 outside. $T_j$ are the rectifiable currents associated with
the sets $E_j$ (middle) and the constant multiplicity $1/j$, and then
$M(T_j) = 1$, $M(\partial T_j) = 2$. $T_j'$ are the integral currents associated
with the sets $E_j'$ (left) and the constant multiplicity 1, and then
$M(T_j') = 1$, $M(\partial T_j') = 2j^2$. Both $(T_j)$ and $(T_j')$ converge to $T$.


(ii) The following observations may clarify the
role of the assumptions in the compactness theorem:
(1) A sequence of integral currents $(T_j)$ with $M(T_j)$
uniformly bounded, but not $M(\partial T_j)$, may
converge to any current with finite mass, not
necessarily a rectifiable one.
(2) A sequence of rectifiable currents $(T_j)$ with
rectifiable boundaries and $M(T_j)$, $M(\partial T_j)$ uniformly
bounded may converge to any normal current,
not necessarily a rectifiable one. Examples of both
situations are described in Figure 5.


Application to the Plateau Problem 


The compactness result for integral currents implies
the existence of currents with minimal mass: if $\Gamma$ is
the boundary of an integral $k$-current in $\mathbb{R}^n$, $1 \le k \le n$,
then there exists an integral current $T$ of minimal mass
among those that satisfy $\partial T = \Gamma$.

The proof of this existence result is a typical
example of the direct method: let $m$ be the infimum
of $M(T)$ among all integral currents with boundary
$\Gamma$, and let $(T_j)$ be a minimizing sequence (i.e., a
sequence of integral currents with boundary $\Gamma$ such
that $M(T_j)$ converges to $m$). Since $M(T_j)$ is bounded
and $M(\partial T_j) = M(\Gamma)$ is constant, we can apply the
compactness theorem for integral currents and
extract a subsequence of $(T_j)$ that converges to an
integral current $T$. By the continuity of the boundary
operator, $\partial T = \lim \partial T_j = \Gamma$, and by the
semicontinuity of the mass, $M(T) \le \lim M(T_j) = m$ (cf. [14]).
Since $T$ is itself admissible, $M(T) \ge m$ as well.
Thus, $T$ is the desired minimal current.


Remarks 


(i) Every integral $(k-1)$-current $\Gamma$ with null
boundary and compact support in $\mathbb{R}^n$ is the boundary
of an integral current, and therefore is an admissible
datum for the previous existence result.

(ii) A mass-minimizing integral current $T$ is more
regular than a general integral current. For $k = n - 1$,
there exists a closed singular set $S$ with $\dim_H(S) \le k - 7$
such that $T$ agrees with a smooth surface in the
complement of $S$ and of the support of the boundary.
In particular, $T$ is smooth away from the boundary
for $n \le 7$. For general $k$, it can only be proved
that $\dim_H(S) \le k - 2$. Both results are optimal: in
$\mathbb{R}^4 \times \mathbb{R}^4$, the minimal 7-current with boundary
$\Gamma := \{|x| = |y| = 1\}$ (a product of two 3-spheres) is the
cone $T := \{|x| = |y| \le 1\}$, and is singular at the origin.
In $\mathbb{R}^2 \times \mathbb{R}^2$, the minimal 2-current with boundary
$\Gamma := \{x = 0, |y| = 1\} \cup \{y = 0, |x| = 1\}$ (a union of
two disjoint circles) is the union of the disks
$\{x = 0, |y| \le 1\} \cup \{y = 0, |x| \le 1\}$, and is singular at
the origin.

(iii) In certain cases, the mass-minimizing current
$T$ may not agree with the solution of the Plateau
problem suggested by intuition. The first reason is
that currents do not include nonorientable surfaces,
which sometimes may be more convenient (Figure 6).
Another reason is that the mass of an integral
current $T$ associated with a $k$-rectifiable set $E$ does
not agree with the measure $\mathscr{H}^k(E)$, called the size of $T$,
because multiplicity must be taken into account,
and for certain $\Gamma$ the mass-minimizing current may
not be size-minimizing (Figure 7). Unfortunately,
proving the existence of size-minimizing currents is
much more complicated, due to the lack of suitable
compactness theorems.

(iv) For $k = 2$, the classical approach to the
Plateau problem consists in parametrizing surfaces
in $\mathbb{R}^n$ by maps $f$ from a given two-dimensional
domain $D$ into $\mathbb{R}^n$, and looking for minimizers of
the area functional

$$\int_D \sqrt{\det(\nabla f^* \, \nabla f)}$$

(recall the area formula, discussed earlier) under the
constraint $f(\partial D) = \Gamma$. In this framework, the choice
of the domain $D$ prescribes the topological type of
admissible surfaces, and therefore the minimizer
may differ substantially from the mass-minimizing
current with boundary $\Gamma$ (Figure 8).

Figure 6 The surface with minimal area spanning the
(oriented) curve $\Gamma$ is the Möbius strip $\Sigma$. However, $\Sigma$ is not
orientable, and cannot be viewed as a current. The
mass-minimizing current with boundary $\Gamma$ is $\Sigma'$.

Figure 7 The boundary $\Gamma$ is a 0-current associated with four
oriented points. The size (length) of $T$ is smaller than that of $T'$.
However, $\partial T = \Gamma$ implies that the multiplicity of $T$ must be 2 on
the central segment and 1 on the others; thus the mass of $T$ is
larger than its size. The size-minimizing current with boundary $\Gamma$
is $T$, while the mass-minimizing one is $T'$.

Figure 8 The surface $\Sigma$ minimizes the area among surfaces
parametrized by the disk with boundary $\Gamma$. The mass-minimizing
current $\Sigma'$ can only be parametrized by a disk with a handle.
Note that $\Sigma$ is a singular surface, while $\Sigma'$ is not.

Figure 9 Two possible soap films spanning the wire $\Gamma$: unlike
$\Sigma$, $\Sigma'$ cannot be viewed as a current with multiplicity 1 and
boundary $\Gamma$.

(v) For some modeling problems, for instance,
those related to soap films and soap bubbles, currents
do not provide the right framework (Figure 9). A
possible alternative is given by integral varifolds (cf. Almgren
2001). However, it should be pointed out that this
framework does not allow for an "easy" application of
the direct method, and the existence of minimal
varifolds is in general quite difficult to prove.


Miscellaneous Results and Useful Tools 


(i) An important issue, related to the use of currents
for solving variational problems, concerns the extent
to which integral currents can be approximated by
regular objects. For many reasons, the "right" regular
class to consider is not that of smooth surfaces, but that of integral
polyhedral currents, that is, linear combinations with
integral coefficients of oriented simplexes. The
following approximation theorem holds: for every integral
current $T$ in $\mathbb{R}^n$ there exists a sequence of integral
polyhedral currents $(T_j)$ such that

$$T_j \to T, \quad \partial T_j \to \partial T, \quad M(T_j) \to M(T), \quad M(\partial T_j) \to M(\partial T)$$

The proof is based on a quite useful tool, called
polyhedral deformation.

(ii) Many geometric operations for surfaces have an
equivalent for currents. For instance, it is possible to
define the image of a current in $\mathbb{R}^n$ via a smooth proper
map $f : \mathbb{R}^n \to \mathbb{R}^m$. Indeed, with every $k$-form $\omega$ on $\mathbb{R}^m$
is canonically associated a $k$-form $f^{\#}\omega$ on $\mathbb{R}^n$, called the
pullback of $\omega$ according to $f$. The adjoint of the
pullback is an operator, called push-forward, that
takes every $k$-current $T$ in $\mathbb{R}^n$ into a $k$-current $f_{\#}T$ in
$\mathbb{R}^m$. If $T$ is the rectifiable current associated with a
rectifiable set $E$ and a multiplicity $\theta$, the push-forward
$f_{\#}T$ is the rectifiable current associated with $f(E)$ and
a multiplicity $\theta'(y)$ which is computed by adding up
with the right sign all $\theta(x)$ with $x \in f^{-1}(y)$. As one
might expect, the boundary of the push-forward is the
push-forward of the boundary.

(iii) In general, it is not possible to give a meaning
to the intersection of two currents, and not even of
a current and a smooth surface. However, it is
possible to define the intersection of a normal
$k$-current $T$ and a level surface $f^{-1}(y)$ of a smooth
map $f : \mathbb{R}^n \to \mathbb{R}^h$ (with $h \le k \le n$) for almost every
$y$, resulting in a current $T_y$ with the expected
dimension $k - h$. This operation is called slicing.

(iv) When working with currents, a quite useful
notion is that of flat norm:

$$F(T) := \inf\{ M(R) + M(S) : T = R + \partial S \}$$

where $T$ and $R$ are $k$-currents, and $S$ is a $(k+1)$-current.
The relevance of this notion lies in the fact
that a sequence $(T_j)$ that converges with respect to
the flat norm converges also in the dual topology,
and the converse holds if the masses $M(T_j)$ and
$M(\partial T_j)$ are uniformly bounded. Hence, the flat
norm metrizes the dual topology of currents (at
least on sets of currents where the mass and the
mass of the boundary are bounded).

Since $F(T)$ can be explicitly estimated from above, it
can be quite useful in proving that a sequence of
currents converges to a certain limit. Finally, the flat
norm gives a (geometrically significant) measure of how
far apart two currents are: for instance, given the
0-currents $\delta_x$ and $\delta_y$ (the Dirac masses at $x$ and $y$,
respectively), $F(\delta_x - \delta_y)$ is exactly the distance
between $x$ and $y$ as long as this distance does not exceed 2
(in general, it equals $\min\{|x - y|, 2\}$, since the choice
$R = \delta_x - \delta_y$, $S = 0$ is also admissible).


See also: Free Interfaces and Free Discontinuities:
Variational Problems; Γ-Convergence and
Homogenization; Geometric Phases; Image Processing:
Mathematics; Minimal Submanifolds; Mirror Symmetry:
A Geometric Survey; Moduli Spaces: An Introduction.
Introduction


We invite the reader to perform the following simple 
experiment. Put your arm out in front of you keeping 
your thumb pointing up perpendicular to your arm. 
Move your arm up over your head, then bring it down 
to your side, and at last bring the arm back in front of 
you again. In this experiment an object (your thumb) 
was taken along a closed path traced by another object 
(your arm) in a way that a simple local law of transport 
was applied. In this case the local law consisted of two 
ingredients: (1) preserve the orthogonality of your 
thumb with respect to your arm and (2) do not rotate 
the thumb about its instantaneous axis (i.e., your arm).
Performing the experiment in this way, you will 
manage to avoid rotations of your thumb locally; 
however, in the end you will experience a rotation of 
90° globally. 

The experiment above can be regarded as the 
archetypical example of the phenomenon called 
anholonomy by physicists and holonomy by math- 
ematicians. In this article, we consider the manifes- 
tation of this phenomenon in the realm of quantum 
theory. The objects to be transported along closed 
paths in suitable manifolds will be wave functions 
representing quantum systems. After applying local 
laws dictated by inputs coming from physics, one 
ends up with a new wave function that has picked 
up a complex phase factor. Phases of this kind are 
called geometric phases, with the famous Berry 
phase being a special case. 


The Space of Rays 


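Let us consider a quantum system with physical
states represented by elements $|\psi\rangle$ of some Hilbert
space $\mathcal{H}$ with scalar product $\langle\,\cdot\,|\,\cdot\,\rangle : \mathcal{H} \times \mathcal{H} \to \mathbb{C}$. For
simplicity, we assume that $\mathcal{H}$ is finite dimensional,
$\mathcal{H} \simeq \mathbb{C}^{n+1}$ with $n \ge 1$. The infinite-dimensional case
can be studied by taking the inductive limit $n \to \infty$.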




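Let us denote the complex amplitudes characterizing
the state $|\psi\rangle$ by $Z^\alpha$, $\alpha = 0, 1, \dots, n$. For a normalized
state,

$$\|\psi\|^2 = \langle \psi | \psi \rangle = \delta_{\alpha\beta}\, \bar{Z}^\alpha Z^\beta \equiv \bar{Z}_\alpha Z^\alpha = 1 \qquad [1]$$

where summation over repeated indices is understood,
indices are raised and lowered by $\delta^{\alpha\beta}$ and $\delta_{\alpha\beta}$, respectively,
and the overbar refers to complex conjugation. A
normalized state lies on the unit sphere $\mathcal{S} \simeq S^{2n+1}$ in
$\mathbb{C}^{n+1}$. Two nonzero states $|\psi\rangle$ and $|\varphi\rangle$ are equivalent,
$|\psi\rangle \sim |\varphi\rangle$, iff they are related as $|\psi\rangle = \lambda |\varphi\rangle$ for some
nonzero complex number $\lambda$. For equivalent states,
physically meaningful quantities such as

$$\frac{\langle \psi | A | \psi \rangle}{\langle \psi | \psi \rangle}, \qquad \frac{|\langle \psi | \varphi \rangle|^2}{\|\psi\|^2 \|\varphi\|^2} \qquad [2]$$

(mean value of a physical quantity represented by a
Hermitian operator $A$, transition probability from a
physical state represented by $|\psi\rangle$ to one represented
by $|\varphi\rangle$) are invariant. Hence, the real space of states
representing the physical states of a quantum system
unambiguously is the set of equivalence classes
$\mathcal{P} = \mathcal{H}/\!\sim$. $\mathcal{P}$ is called the "space of rays." For $\mathcal{H} \simeq \mathbb{C}^{n+1}$,
we have $\mathcal{P} \simeq \mathbb{CP}^n$, where $\mathbb{CP}^n$ is the $n$-dimensional
complex projective space. For normalized states, $|\psi\rangle$
and $|\varphi\rangle$ are equivalent iff $|\psi\rangle = \lambda |\varphi\rangle$, where $|\lambda| = 1$,
that is, $\lambda \in U(1)$. Thus, two normalized states are
equivalent iff they differ merely in a complex phase.
It is well known that $\mathcal{S}$ can be regarded as the total
space of a principal bundle over $\mathcal{P}$ with structure
group $U(1)$. This means that we have the projection

$$\pi : |\psi\rangle \in \mathcal{S} \subset \mathcal{H} \mapsto |\psi\rangle\langle\psi| \in \mathcal{P} \qquad [3]$$

where the rank-1 projector $|\psi\rangle\langle\psi|$ represents the
equivalence class of $|\psi\rangle$. Since we will use this bundle
frequently in this article, we call it $\eta_1$ (the meaning of
the subscript 1 will be clarified later). Then, we have

$$\eta_1 : U(1) \hookrightarrow \mathcal{S} \xrightarrow{\ \pi\ } \mathcal{P} \qquad [4]$$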


For $Z^0 \ne 0$ the space of rays $\mathcal{P}$ can be given local
coordinates

$$w^j = Z^j / Z^0, \quad j = 1, \dots, n \qquad [5]$$

The $w^j$ are inhomogeneous coordinates for $\mathbb{CP}^n$ on
the coordinate patch $U_0$ defined by the condition
$Z^0 \ne 0$.

$\mathcal{P}$ is a compact complex manifold with a natural
Riemannian metric $g$. This metric $g$ is induced from
the scalar product on $\mathcal{H}$. Let us consider the
construction of $g$ by using the physical input
provided by the invariance of the transition
probability of [2]. For this we define a distance $\delta(\psi, \varphi)$ between
$|\psi\rangle\langle\psi|$ and $|\varphi\rangle\langle\varphi|$ in $\mathcal{P}$ as follows:

$$\cos^2\big(\delta(\psi, \varphi)/2\big) = \frac{|\langle \psi | \varphi \rangle|^2}{\|\psi\|^2 \|\varphi\|^2} \qquad [6]$$

This definition makes sense since, due to the
Cauchy-Schwarz inequality, the right-hand side of
[6] is non-negative and $\le 1$. It is equal to 1 iff $|\psi\rangle$ is
a nonzero complex multiple of $|\varphi\rangle$, that is, iff they
define the same point in $\mathcal{P}$. Hence in this case,
$\delta(\psi, \varphi) = 0$ as expected.

Suppose now that $|\psi\rangle$ and $|\varphi\rangle$ are separated by an
infinitesimal distance $ds = \delta(\psi, \varphi)$. Putting this into
the definition [6], using the local coordinates $w^j$ of
[5] for $|\psi\rangle$ and $w^j + dw^j$ for $|\varphi\rangle$, and expanding both
sides in Taylor series, one gets

$$ds^2 = 4 g_{j\bar{k}}\, dw^j \, d\bar{w}^k, \quad j, k = 1, 2, \dots, n \qquad [7]$$

where

$$g_{j\bar{k}} := \frac{(1 + \bar{w}_l w^l)\,\delta_{jk} - \bar{w}_j w_k}{(1 + \bar{w}_l w^l)^2} \qquad [8]$$

with $d\bar{w}^k \equiv \overline{dw^k}$. The line element [7] defines the
Fubini-Study metric for $\mathcal{P}$.
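
The passage from [6] to [7] and [8] can be checked numerically. The sketch below is a minimal illustration with a randomly chosen point $w \in \mathbb{C}^3$ and a small random displacement $dw$ (both assumptions of this example): it computes the distance of [6] between the rays with coordinates $w$ and $w + dw$ and compares its square with the line element built from [8].

```python
import numpy as np

def state(w):
    """Normalized representative (1, w^1, ..., w^n)/sqrt(1 + |w|^2) of the ray with coordinates w."""
    z = np.concatenate(([1.0 + 0j], w))
    return z / np.linalg.norm(z)

def distance(w1, w2):
    """delta of [6] for normalized states, computed from sin^2(delta/2) = 1 - |<psi|phi>|^2."""
    p = abs(np.vdot(state(w1), state(w2))) ** 2
    return 2.0 * np.arcsin(np.sqrt(max(0.0, 1.0 - p)))

def fubini_study_ds2(w, dw):
    """ds^2 = 4 g_{j kbar} dw^j dwbar^k with g taken from [8]."""
    s = 1.0 + np.vdot(w, w).real                      # 1 + wbar_l w^l
    quad = s * np.vdot(dw, dw).real - abs(np.vdot(w, dw)) ** 2
    return 4.0 * quad / s**2

rng = np.random.default_rng(1)
w = rng.standard_normal(3) + 1j * rng.standard_normal(3)
dw = 1e-4 * (rng.standard_normal(3) + 1j * rng.standard_normal(3))
ratio = distance(w, w + dw) ** 2 / fubini_study_ds2(w, dw)
print(ratio)
```

The ratio tends to 1 as $dw \to 0$, which is the content of [7].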


The Pancharatnam Connection 


Having defined the basic entity, the space of rays $\mathcal{P}$,
and the principal $U(1)$ bundle $\eta_1$, now we define a
connection giving rise to a local law of parallel
transport. This approach gives rise to a very general
definition of the geometric phase. In the
mathematical literature, the connection defined below is
called the "canonical connection" on the principal
bundle. However, since the motivation is coming
from physics, we are going to rediscover this
construction using merely physical information
provided by quantum theory alone.

The information needed is an adaptation of
Pancharatnam's study of polarized light to quantum
mechanics. Let us consider two normalized states $|\psi\rangle$
and $|\varphi\rangle$. When these states belong to the same ray, then
we have $|\psi\rangle = e^{i\phi}|\varphi\rangle$ for some phase factor $e^{i\phi}$; hence,
the phase difference between them can be defined to be
just $\phi$. How should one define the phase difference between $|\psi\rangle$
and $|\varphi\rangle$ (not orthogonal) when these states belong to
different rays? To compare the phases of
nonorthogonal states belonging to different rays,
Pancharatnam employed the following simple rule:
two states are "in phase" iff their interference is
maximal. In order to find the state $|\varphi\rangle = e^{i\phi}|\varphi'\rangle$ from
the ray spanned by the representative $|\varphi'\rangle$ which is "in
phase" with $|\psi\rangle$, we have to find a $\phi$ modulo $2\pi$ for
which the interference term in

$$\|\psi + e^{i\phi}\varphi'\|^2 = 2\big(1 + \mathrm{Re}(e^{i\phi}\langle \psi | \varphi' \rangle)\big) \qquad [9]$$

is maximal. Obviously the interference is maximal
iff $e^{i\phi}\langle \psi | \varphi' \rangle$ is a real positive number, that is,

$$e^{i\phi} = \frac{\langle \varphi' | \psi \rangle}{|\langle \varphi' | \psi \rangle|} \qquad [10]$$

Hence for the state $|\varphi\rangle$ "in phase" with $|\psi\rangle$, one has

$$\langle \psi | \varphi \rangle = e^{i\phi} \langle \psi | \varphi' \rangle \in \mathbb{R}^{+} \qquad [11]$$
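
Pancharatnam's rule is simple to implement. The following sketch, assuming NumPy and two randomly generated normalized states in $\mathbb{C}^3$ (illustrative data only), rotates a representative $|\varphi'\rangle$ by the phase factor of [10] and checks both the condition [11] and the maximality of the interference in [9] against a brute-force scan over phases.

```python
import numpy as np

def in_phase(psi, phi_prime):
    """Return e^{i phi} |phi'> with e^{i phi} = <phi'|psi> / |<phi'|psi>|, as in [10]."""
    overlap = np.vdot(phi_prime, psi)             # <phi'|psi>; vdot conjugates its first argument
    if abs(overlap) == 0:
        raise ValueError("orthogonal states have no well-defined relative phase")
    return (overlap / abs(overlap)) * phi_prime

def random_state(rng, dim):
    z = rng.standard_normal(dim) + 1j * rng.standard_normal(dim)
    return z / np.linalg.norm(z)

rng = np.random.default_rng(2)
psi = random_state(rng, 3)
phi_prime = random_state(rng, 3)
phi = in_phase(psi, phi_prime)

overlap_in_phase = np.vdot(psi, phi)              # should be real and positive, cf. [11]
angles = np.linspace(0.0, 2 * np.pi, 3600, endpoint=False)
scan = [np.linalg.norm(psi + np.exp(1j * a) * phi_prime) ** 2 for a in angles]
best_on_grid = max(scan)
attained = np.linalg.norm(psi + phi) ** 2
print(overlap_in_phase, attained, best_on_grid)
```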


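When such $|\psi\rangle$ and $|\varphi\rangle = |\psi + d\psi\rangle$ are
infinitesimally separated, from [11] it follows that

$$\mathrm{Im}\langle \psi | d\psi \rangle = \frac{1}{2i}\big(\bar{Z}_\alpha \, dZ^\alpha - d\bar{Z}_\alpha \, Z^\alpha\big) = 0 \qquad [12]$$

where $\bar{Z}_\alpha Z^\alpha = \bar{Z}_0 Z^0 (1 + \bar{w}_j w^j) = 1$ due to
normalization. Writing $Z^0 = |Z^0| e^{i\phi}$ and using [5], one obtains

$$\mathrm{Im}\langle \psi | d\psi \rangle = d\phi + A = 0, \qquad A \equiv \mathrm{Im}\, \frac{\bar{w}_j \, dw^j}{1 + \bar{w}_k w^k} \qquad [13]$$

In order to clarify the meaning of the 1-form $A$,
notice that the choice

$$|\psi'\rangle \equiv \frac{1}{\sqrt{1 + \bar{w}_j w^j}} \begin{pmatrix} 1 \\ w^k \end{pmatrix} \qquad [14]$$

defines a local section of the bundle $\eta_1$. In terms of
this section, the state $|\psi\rangle$ can be expressed as

$$|\psi\rangle = \begin{pmatrix} Z^0 \\ Z^j \end{pmatrix} = \frac{e^{i\phi}}{\sqrt{1 + \bar{w}_k w^k}} \begin{pmatrix} 1 \\ w^j \end{pmatrix} = e^{i\phi} |\psi'\rangle \qquad [15]$$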


For a path $w^j(t)$ lying entirely in $U_0 \subset \mathcal{P}$,
$|\psi(t)\rangle = e^{i\phi(t)} |\psi'(t)\rangle$ defines a path in $\mathcal{S}$ with
$\phi(t)$ satisfying the equation $\dot{\phi} + A = 0$ (with $A$ evaluated
along the path). For a closed
path $C$, the equation above defines a (generically)
open path $\Gamma$ projecting onto $C$ by the projection $\pi$.
It must be clear by now that the process described is
that of parallel transport with respect to a
connection with a connection 1-form $\omega$. The
pullback of $\omega$ with respect to the local section in [14] is
the 1-form ($U(1)$ gauge field) $A$ in [13]. The curve $\Gamma$
corresponding to $|\psi(t)\rangle$ is the horizontal lift of the curve $C$ in
$\mathcal{P}$. The $U(1)$ phase

$$e^{i\phi[C]} = e^{-i \oint_C A} \qquad [16]$$

is the holonomy of the connection. We call this
connection the "Pancharatnam connection," and its
holonomy for a closed path in the space of rays is
the geometric phase acquired by the wave function.
Now the question of fundamental importance is:
how to realize closed paths in $\mathcal{P}$ physically? This
question is addressed in the following sections.


Quantum Jumps 


We have seen that physical states of a quantum
system are represented by the space of rays $\mathcal{P}$ and
normalized states used as representatives for
such states form the total space $\mathcal{S}$ of a principal
$U(1)$ bundle $\eta_1$ over $\mathcal{P}$. Moreover, in the previous
section we have realized that the physical
notions of transition probability and quantum
interference naturally lead to the introduction of a
Riemannian metric $g$ and an abelian $U(1)$ gauge
field $A$ living on $\mathcal{P}$.

An interesting result based on the connection 
between g and A concerns a nice geometric descrip- 
tion of a special type of quantum evolution consist- 
ing of a sequence of “quantum jumps.” 

Consider two nonorthogonal rays $|A\rangle\langle A|$ and
$|B\rangle\langle B|$ in $\mathcal{P}$. Let us suppose that the system's
normalized wave function initially is $|A\rangle \in \mathcal{S}$, and that we
measure by the "polarizer" $|B\rangle\langle B|$. Then the result of
this filtering measurement is $|B\rangle\langle B|A\rangle$, or after
projecting back to the set of normalized states we
have the "quantum jump"

$$|A\rangle \to |B\rangle \frac{\langle B | A \rangle}{|\langle B | A \rangle|} \qquad [17]$$

Now we have the following theorem:

Theorem The jump [17] can be recovered by
parallel transporting the normalized state $|A\rangle$
according to the Pancharatnam connection along
the shortest geodesic (with respect to the metric [8])
connecting $|A\rangle\langle A|$ and $|B\rangle\langle B|$ in $\mathcal{P}$.


Let us now consider a cyclic series of filtering
measurements with projectors $|A_a\rangle\langle A_a|$, $a = 1, 2, \dots, N+1$,
where $|A_1\rangle\langle A_1| = |A_{N+1}\rangle\langle A_{N+1}|$. Prepare the
system in the state $|A_1\rangle \in \mathcal{S}$, and then subject it to
the sequence of filtering measurements. Then
according to the theorem, the phase

$$e^{i\Phi} = \frac{\langle A_1 | A_N \rangle \langle A_N | A_{N-1} \rangle \cdots \langle A_2 | A_1 \rangle}{|\langle A_1 | A_N \rangle \langle A_N | A_{N-1} \rangle \cdots \langle A_2 | A_1 \rangle|} \qquad [18]$$

picked up by the state is equal to the one obtained
by parallel transporting $|A_1\rangle$ along a geodesic
polygon consisting of the shorter arcs connecting
the projectors $|A_a\rangle\langle A_a|$ and $|A_{a+1}\rangle\langle A_{a+1}|$ with
$a = 1, 2, \dots, N$. It is important to realize that this
filtering measurement process is not a unitary one;
hence, unitarity is not essential for the geometric
phase to appear.

In this section we have managed to obtain closed
paths in the form of geodesic polygons in $\mathcal{P}$ via the
physical process of subjecting the initial state $|A_1\rangle$ to
a sequence of filtering measurements. It is clear that
for any type of evolution, the geodesics of the
Fubini-Study metric play a fundamental role since
any smooth closed curve in $\mathcal{P}$ can be approximated
by geodesic polygons.
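
The approximation of a smooth loop by geodesic polygons can be made concrete for $\mathcal{P} \simeq \mathbb{CP}^1$. In the sketch below, which is only an illustration under stated assumptions, the loop is the circle $w(t) = \tan(\theta/2)\, e^{it}$ in the coordinate patch $U_0$ (a choice made for this example); the phase [18] of a fine cyclic sequence of filtering measurements along this circle is compared with the holonomy $e^{-i\oint_C A}$ of [16], where $A$ is the gauge field of [13] integrated by a Riemann sum.

```python
import numpy as np

def ray_state(theta, t):
    """Normalized state with inhomogeneous coordinate w = tan(theta/2) e^{it}, cf. [14]."""
    return np.array([np.cos(theta / 2), np.sin(theta / 2) * np.exp(1j * t)])

def polygon_phase(states):
    """Phase factor [18] for the cyclic sequence A_1, ..., A_N, closing back on A_1."""
    n = len(states)
    prod = 1.0 + 0j
    for a in range(n):
        prod *= np.vdot(states[(a + 1) % n], states[a])    # <A_{a+1}|A_a>
    return prod / abs(prod)

def holonomy(theta, n):
    """exp(-i * loop integral of A) with A = Im(wbar dw)/(1 + |w|^2), cf. [13] and [16]."""
    t = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
    w = np.tan(theta / 2) * np.exp(1j * t)
    dw = 1j * w * (2 * np.pi / n)                          # dw = (dw/dt) dt
    integral = np.sum(np.imag(np.conj(w) * dw) / (1 + abs(w) ** 2))
    return np.exp(-1j * integral)

theta, n = 1.0, 2000
ts = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
jump_phase = polygon_phase([ray_state(theta, t) for t in ts])
loop_holonomy = holonomy(theta, n)
print(jump_phase, loop_holonomy)
```

For this loop $\oint_C A = 2\pi \sin^2(\theta/2)$, that is, half the solid angle of the spherical cap bounded by the corresponding circle on the Bloch sphere.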

Nonunitary evolution provided by the quantum 
measuring process is only half of the story. In the 
next section, we start describing closed paths in P 
arising also from unitary evolutions generated by 
parameter-dependent Hamiltonians, the original 
context where geometric phases were discovered. 


Unitary Evolutions 
Adiabatic Evolution 


Suppose that the evolution of our quantum system
with $\mathcal{H} \simeq \mathbb{C}^{n+1}$ is generated by a Hermitian Hamiltonian
matrix depending on a set of external
parameters $x^\mu$, $\mu = 1, 2, \ldots, M$. Here we assume
that the $x^\mu$ are local coordinates on some coordinate
patch $\mathcal{V}$ of a smooth M-dimensional manifold $\mathcal{M}$.
We label the eigenvalues of H(x) by the numbers
$r = 0, 1, 2, \ldots, n$, and assume that the rth eigenvalue
$E_r(x)$ is nondegenerate:

\[ H(x)|r,x\rangle = E_r(x)|r,x\rangle, \qquad r = 0, 1, 2, \ldots, n \qquad [19] \]

We assume that H(x), $E_r(x)$, and $|r,x\rangle$ are smooth functions
of x. The rank-1 spectral projectors

\[ P_r(x) \equiv |r,x\rangle\langle r,x|, \qquad r = 0, 1, 2, \ldots, n \qquad [20] \]

define, for each r, a map $f_r : \mathcal{M} \to \mathcal{P}$:

\[ f_r : x \in \mathcal{V} \subset \mathcal{M} \mapsto P_r(x) \in \mathcal{P} \qquad [21] \]


Recall now that we have the bundle $\eta_1$ over $\mathcal{P}$ at
our disposal, and we can pull back $\eta_1$ using the map
$f_r$ to construct a new bundle $\xi_1^r$ over the parameter
space $\mathcal{M}$. Moreover, we can define a connection on
$\xi_1^r$ by pulling back the canonical (Pancharatnam)
connection of $\eta_1$. The resulting bundle $\xi_1^r$ is called
the Berry-Simon bundle over the parameter space
$\mathcal{M}$. Explicitly,

\[ \xi_1^r : U(1) \hookrightarrow \xi_1^r \to \mathcal{M} \qquad [22] \]

The states $|r,x\rangle$ of [19] define a local section of $\xi_1^r$.
Suppressing the index r, the relationship between $\eta_1$
and $\xi_1$ can be summarized by the following
diagram:

\[ \begin{array}{ccc} \xi_1 & \xleftarrow{\;f^*\;} & \eta_1 \\ \downarrow & & \downarrow \\ \mathcal{M} & \xrightarrow{\;f\;} & \mathcal{P} \end{array} \qquad [23] \]

Here $f^*$ denotes the pullback map, and we have $\xi_1 = f^*(\eta_1)$.
(We have denoted the total space $\mathcal{S}$ as $\eta_1$.)

The local section of $\xi_1$ arising as the pullback of
[14] on $\eta_1$ is given by

\[ |r,x\rangle = \frac{1}{\sqrt{1 + \bar{w}_j(x) w^j(x)}} \begin{pmatrix} 1 \\ w^j(x) \end{pmatrix}, \qquad x \in \mathcal{V} \subset \mathcal{M} \qquad [24] \]

with $j = 1, 2, \ldots, n$. The pullback of the Pancharatnam
connection $\omega$ on $\eta_1$ is $f^*(\omega)$. We can further
pull back $f^*(\omega)$ to $\mathcal{V} \subset \mathcal{M}$ with respect to the local
section of [24] to obtain a gauge field living on the
parameter space. This gauge field is called the
"Berry gauge field" and the corresponding connection
is the Berry connection. Thus,

\[ \mathcal{A} = f^*(A) = \mathcal{A}_\mu(x)\,dx^\mu = (A_j \partial_\mu w^j + \bar{A}_j \partial_\mu \bar{w}^j)\,dx^\mu \qquad [25] \]

Here $\partial_\mu \equiv \partial/\partial x^\mu$ and A is given by [13]. When we
have a closed curve C in $\mathcal{M}$, then $f \circ C$ defines a
closed curve $\bar{C}$ in $\mathcal{P}$. We already know that the
holonomy for $\bar{C}$ in $\mathcal{P}$ can be written in the form [16];
hence,

\[ \Phi[\bar{C}] = -\oint_{\bar{C}} A = -\oint_C f^*(A) = -\oint_C \mathcal{A} \qquad [26] \]

This formula states that there is a geometric phase
picked up by the eigenstates of a parameter-dependent
Hermitian Hamiltonian when we change
the parameters along a closed curve. Our formula
shows that the geometric phase can be calculated
using either the canonical connection on $\eta_1$ or the
Berry connection on $\xi_1$.

Let us then change the parameters $x^\mu$ adiabatically.
The closed path in parameter space then
defines Hamiltonians satisfying $H(x(T)) = H(x(0))$
for some $T \in \mathbb{R}^+$. Moreover, there is also the
associated closed curve $P_r(x(T)) = P_r(x(0))$ in $\mathcal{P}$.
The quantum adiabatic theorem states that if we
prepare a state $|\Psi(0)\rangle = |r,x(0)\rangle$ at t = 0, which is an
eigenstate of the instantaneous Hamiltonian
$H(x(0))$, then after changing the parameters
infinitely slowly, the time evolution generated by
the time-dependent Schrödinger equation

\[ i\hbar \frac{d}{dt}|\Psi(t)\rangle = H(t)|\Psi(t)\rangle \qquad [27] \]

takes the form

\[ |\Psi(t)\rangle = e^{i\Lambda_r(t)}|r,x(t)\rangle \qquad [28] \]


after time t, which belongs to the same eigensubspace.
The point is that the theorem holds only for
cases when the kinetic energy associated with the
slow change in the external parameters is much
smaller than the energy separation between $E_r(x)$
and $E_{r'}(x)$, $r' \neq r$, for all $x \in \mathcal{M}$. Under this assumption,
transitions between adjacent levels are prohibited
during evolution. Notice that the adiabatic theorem
clearly breaks down in the vicinity of level crossings,
where the gap is comparable with the magnitude of
the kinetic energy of the external parameters.

However, if one takes it for granted that the
projector $P_r(t) \equiv P_r(x(t))$ for some r satisfies the
Schrödinger-von Neumann equation

\[ i\hbar \frac{d}{dt} P_r(t) = [H(t), P_r(t)] \qquad [29] \]

then, by virtue of [19], we get zero for the right-hand side.
This means that $P_r(t)$ is constant; hence, the curve in
$\mathcal{P}$ degenerates to a point. The upshot of this is that
exact adiabatic cyclic evolutions do not exist. It can
be shown, however, that under certain conditions
one can find an initial state $|\Psi(0)\rangle \neq |r,x(0)\rangle$ that is
"close enough" to $P_r(x(t)) = |r,x(t)\rangle\langle r,x(t)|$. Then
we can say that the projector analog of [28] holds only
approximately:

\[ |\Psi(t)\rangle\langle\Psi(t)| \simeq |r,x(t)\rangle\langle r,x(t)| \qquad [30] \]

This means that the bundle picture for
the generation of closed curves in $\mathcal{P}$ via
adiabatic evolution can be used only as an
approximation.


Berry’s Phase 


A straightforward calculation after substituting
[28] into [27] shows that

\[ \exp(i\Lambda_r(T)) = \exp\left(-\frac{i}{\hbar}\int_0^T E_r(t)\,dt\right) \exp\left(-i\oint_C \mathcal{A}^{(r)}\right) \qquad [31] \]

where C is a closed curve lying entirely in $\mathcal{V} \subset \mathcal{M}$.
The first phase factor is the dynamical one and the
second is the celebrated Berry phase. Notice that the
index r labeling the eigensubspace in question
should now be included in the definition of $\mathcal{A}$
(see eqn [25]).
As an explicit example, let us take the Hamiltonian

\[ H(\mathbf{X}(t)) = -\omega_0\, \mathbf{J} \cdot \mathbf{X}(t), \qquad \omega_0 \equiv \frac{geB}{2mc}, \qquad \mathbf{X} \in \mathbb{R}^3, \quad |\mathbf{X}| = 1 \qquad [32] \]

where e, m, and g are the charge, mass, and Landé
factor of a particle, c is the speed of light, and B is
the (constant) magnitude of an applied magnetic
field. The three components of $\mathbf{J}$ are
$(2J+1) \times (2J+1)$-dimensional spin matrices satisfying
$\mathbf{J} \times \mathbf{J} = i\hbar\mathbf{J}$. The Hamiltonian (eqn [32])
describes a spin-J particle moving in a magnetic
field with slowly varying direction. It is obvious that
the parameter space is a 2-sphere. Introducing polar
coordinates $0 \le \theta < \pi$, $0 \le \chi < 2\pi$ for the patch $\mathcal{V}$
of $S^2$ excluding the south pole, we have
$x^1 = \theta$, $x^2 = \chi$.

As an illustration, let us consider the spin-1/2 case.
Then H can be expressed in terms of the 2 x 2 Pauli
matrices. The eigenvalues are $E_0 = -\omega_0\hbar/2$ and
$E_1 = \omega_0\hbar/2$ (r = 0, 1). For the ground state, the
mapping $f_0$ of [21] from $\mathcal{V} \subset \mathcal{M} \simeq S^2$ to $\mathcal{P} \simeq \mathbb{CP}^1$ is
given by

\[ w(\theta, \chi) = \tan\left(\frac{\theta}{2}\right) e^{i\chi} \qquad [33] \]

which is stereographic projection of $S^2$ from the
south pole onto the complex plane corresponding to
the coordinate patch $U_0 \subset \mathbb{CP}^1$. Using [13] and [25],
one can calculate the pullback gauge field and its
curvature $\mathcal{F}^{(0)} = d\mathcal{A}^{(0)}$, where

\[ \mathcal{A}^{(0)} = \frac{1}{2}(1 - \cos\theta)\,d\chi, \qquad \mathcal{F}^{(0)} = \frac{1}{2}\sin\theta\,d\theta \wedge d\chi \qquad [34] \]

Notice that $\mathcal{F}^{(0)}$ is the field strength of a magnetic
monopole of strength 1/2 living on $\mathcal{M}$. Using Stokes'
theorem, from [26] one can calculate Berry's phase

\[ \Phi_0[C] = -\oint_C \mathcal{A}^{(0)} = -\int_S \mathcal{F}^{(0)} = -\frac{1}{2}\Omega[C] \qquad [35] \]

where S is the surface bounded by the loop C and $\Omega[C]$ is
the solid angle subtended by the curve C at $\mathbf{X} = 0$.
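
The result [35] can be cross-checked against the discrete phase formula [18]. The sketch below is an illustration under stated assumptions: units with $\hbar = 1$, the spin-1/2 ground state written as the normalized vector $(1, w)$ with w from [33], and a loop of fixed latitude traversed once with increasing $\chi$. The function names and the step count are hypothetical choices.

```python
import numpy as np

def ground_state(theta, chi):
    """Normalized (1, w) with w = tan(theta/2) exp(i*chi), cf. [24] and [33]."""
    return np.array([np.cos(theta / 2), np.sin(theta / 2) * np.exp(1j * chi)])

def hamiltonian(theta, chi, omega0=1.0):
    """H = -omega0 J.X for spin 1/2 with hbar = 1, i.e. J = sigma/2."""
    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
    sz = np.array([[1, 0], [0, -1]], dtype=complex)
    X = (np.sin(theta) * np.cos(chi), np.sin(theta) * np.sin(chi), np.cos(theta))
    return -0.5 * omega0 * (X[0] * sx + X[1] * sy + X[2] * sz)

def discrete_berry_phase(theta, n_steps=2000):
    """Sum of the phases of <A_{a+1}|A_a> around the circle of latitude theta."""
    chis = np.linspace(0.0, 2.0 * np.pi, n_steps + 1)
    states = [ground_state(theta, c) for c in chis]
    return sum(np.angle(np.vdot(states[a + 1], states[a])) for a in range(n_steps))

theta = np.pi / 3
solid_angle = 2.0 * np.pi * (1.0 - np.cos(theta))   # polar cap bounded by the loop
print(discrete_berry_phase(theta), -0.5 * solid_angle)
```

Summing the phases of the individual overlaps, rather than taking the phase of their product, keeps the result on the same branch as $-\Omega[C]/2$ for every latitude.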

The above result can be generalized to arbitrary spin
J. Then we have the eigenvalues $E_r = -\omega_0\hbar(J - r)$,
where $0 \le r \le 2J$. The final result in this case is

\[ \Phi_r[C] = -(J - r)\,\Omega[C], \qquad 0 \le r \le 2J \qquad [36] \]


The Aharonov-Anandan Phase 


We have seen that the quantum adiabatic theorem
can only be used approximately for generating
closed curves in $\mathcal{P}$. This section describes
how such curves can be generated exactly.

Let us consider the Schrödinger equation with a
time-dependent Hamiltonian (eqn [27]). We
call its solution $|\Psi(t)\rangle$ cyclic if the state of the system
returns, after a period T, to its original state. This
means that the projector $|\Psi(t)\rangle\langle\Psi(t)|$ traverses a
closed path $\bar{C}$ in $\mathcal{P}$. In order to realize this situation,
we have to find solutions of [27] for which
$|\Psi(T)\rangle = e^{i\Lambda_\Psi}|\Psi(0)\rangle$ for some $\Lambda_\Psi$.

Taking for granted the existence of such a solution, let
us first explore its consequences. First, we remove the
dynamical phase from the cyclic solution $|\Psi(t)\rangle$:

\[ |\psi(t)\rangle = \exp\left(\frac{i}{\hbar}\int_0^t \langle\Psi(t')|H(t')|\Psi(t')\rangle\,dt'\right)|\Psi(t)\rangle \qquad [37] \]

Then $|\psi(t)\rangle$ satisfies [12], that is, it defines a unique
horizontal lift of the closed curve $\bar{C}$ in $\mathcal{P}$. Following
the same steps as in the section describing the Pancharatnam
condition, we see that the phase

\[ \Phi_{AA}[\bar{C}] = -\oint_{\bar{C}} A = \Lambda_\Psi + \frac{1}{\hbar}\int_0^T \langle\Psi(t)|H(t)|\Psi(t)\rangle\,dt \qquad [38] \]

is purely geometric in origin. It is called the
Aharonov-Anandan (AA) phase.

Let us now turn back to the question of finding
cyclic states satisfying $|\Psi(T)\rangle = e^{i\Lambda_\Psi}|\Psi(0)\rangle$. One
possible solution is as follows. Suppose that H
depends on time through some not necessarily
slowly changing parameters x. Let us find a partner
Hamiltonian h for our H by defining a smooth
mapping $\sigma : \mathcal{M} \to \mathcal{M}$, such that

\[ h(x) \equiv H(\sigma(x)), \qquad x \in \mathcal{V} \subset \mathcal{M} \qquad [39] \]

For the special class we study here, the cyclic vectors
are eigenvectors of h(x). Hence, the projectors $p_r$
and $P_r$ of h and H are related as $p_r(x) = P_r(\sigma(x))$; this
means that we have a map $g_r : \mathcal{M} \to \mathcal{P}$,

\[ g_r \equiv f_r \circ \sigma : x \in \mathcal{V} \subset \mathcal{M} \mapsto p_r(x) \in \mathcal{P} \qquad [40] \]

which associates with every x an eigenstate of h(x).
Moreover, $g_r$ associates with a closed curve C in $\mathcal{M}$
a closed curve $\bar{C}$ in $\mathcal{P}$. Notice that generically
$[h(x), H(x)] \neq 0$; hence, cyclic states are not eigenstates
of the instantaneous Hamiltonian.

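It should be clear by now that we can repeat the
construction discussed in the adiabatic case with
$g_r$ replacing $f_r$. In particular, we can construct a new
bundle $\zeta_1$ over the parameter space via the usual
pullback procedure. More precisely, we have the
corresponding diagram

\[ \begin{array}{ccc} \zeta_1 & \xleftarrow{\;g^*\;} & \eta_1 \\ \downarrow & & \downarrow \\ \mathcal{M} & \xrightarrow{\;g\;} & \mathcal{P} \end{array} \qquad [41] \]

The AA connection can be obtained by pulling back
the Pancharatnam connection:

\[ a \equiv g^*(A) = \sigma^* \circ f^*(A) = \sigma^*(\mathcal{A}) \qquad [42] \]

where the last equality relates the AA connection
to the Berry connection. Now the AA phase is

\[ \Phi_{AA}[\bar{C}] = -\oint_{\bar{C}} A = -\oint_C g^*(A) = -\oint_C a \qquad [43] \]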


As an example, let us take the Hamiltonian [32]
with the curve C on $\mathcal{M} = S^2$:

\[ \mathbf{X}(t) = (\sin\theta\cos(\chi + \omega t),\ \sin\theta\sin(\chi + \omega t),\ \cos\theta) \qquad [44] \]

Here $\theta$ and $\chi$ are the polar coordinates of a fixed point
in $S^2$ where the motion starts. The curve C is a circle of
fixed latitude and is traversed with an arbitrary speed.
This model can be solved exactly, and it can be shown
that the mapping $\sigma_s : S^2 \to S^2$ is given by

\[ \sigma_s : u \mapsto \frac{u - s}{\sqrt{s^2 - 2us + 1}}, \qquad u = \cos\theta, \quad s = \frac{\omega}{\omega_0} \qquad [45] \]

One can prove that for $0 \le s < 1$, $\sigma_s$ is a diffeomorphism.
In the $s \to 0$ (adiabatic) limit, the
mapping $g_{r,s} = f_r \circ \sigma_s$ is continuously deformed to
$f_r$. Moreover, h(x) as defined above commutes with
the time evolution operator; hence, cyclic states are
indeed eigenstates of h(x).

Using [42], [43], and [45], the explicit form of $\sigma_s$,
we get for the AA phase

\[ \Phi^{AA}_r[C] = -2\pi(J - r)\left(1 - \frac{u - s}{\sqrt{s^2 - 2us + 1}}\right) \qquad [46] \]

In the adiabatic limit, the result goes to $-2\pi(J - r)(1 - u)$,
which is just $-(J - r)$ times the solid angle of
the path of fixed latitude, as it has to be.


Generalization 


In the sequence of examples, we have shown that
geometric phases are related to the geometric structures
on the bundle $\eta_1$. The Berry and AA phases are
special cases arising from Pancharatnam's phase via a
pullback procedure with respect to suitable maps
defined by the physical situation in question. Hence,
the Pancharatnam connection in this sense is universal.
The root of this universality rests in a deep theorem of
mathematics concerning the existence of universal
bundles and their universal connections. In order to
elaborate the insight provided by this theorem into the
geometry of quantum evolution, let us first make a
further generalization.

In our study of time-dependent Hamiltonians we
have assumed that the eigenvalues of [19] were
nondegenerate. Let us now relax this assumption. Fix
an integer $N \ge 1$, the degeneracy of the eigensubspace
corresponding to the eigenvalue $E_r$. One can then form
a U(N) principal bundle $\xi_N$ over $\mathcal{M}$, furnished with a
connection that is a natural generalization of the Berry
connection. The pullback of this connection to a patch
of $\mathcal{M}$ is a U(N)-valued gauge field, and its holonomy
along a loop in $\mathcal{M}$ gives rise to a U(N) matrix
generalization of the U(1) Berry phase.

The natural description of this connection and its AA
analog is as follows. Take the complex Grassmannian
$Gr(n+1, N)$ of N-planes in $\mathbb{C}^{n+1}$. Obviously,
$Gr(n+1, 1) = \mathcal{P}$. Each point of $Gr(n+1, N)$ corresponds to
an N-plane through the origin represented by a rank-N
projector. This projector can be written in terms of N
orthonormal basis vectors in an infinite number of
ways. This ambiguity of choosing orthonormal frames
is captured by the U(N) gauge symmetry, the analog of
the U(1) (phase) ambiguity in defining a normalized
state as the representative of the rank-1 projector. This
bundle of frames is the Stiefel bundle $V(n+1, N)$,
alternatively denoted by $\eta_N$. $V(n+1, N)$ is a principal
U(N) bundle over $Gr(n+1, N)$ equipped with a
canonical connection $\omega_N$ which is the U(N) analog of
Pancharatnam's connection.
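
The frame ambiguity just described is easy to see numerically. The following is a minimal sketch, assuming an orthonormal N-frame is stored as the columns of an $(n+1) \times N$ complex matrix; the helper names and the choice n + 1 = 5, N = 2 are illustrative only.

```python
import numpy as np

def random_frame(dim, N, rng):
    """N orthonormal vectors in C^dim, stored as the columns of a dim x N matrix."""
    Z = rng.normal(size=(dim, N)) + 1j * rng.normal(size=(dim, N))
    Q, _ = np.linalg.qr(Z)           # reduced QR: Q has orthonormal columns
    return Q

def random_unitary(N, rng):
    """An N x N unitary matrix obtained from the QR factorization of a random matrix."""
    Z = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
    Q, _ = np.linalg.qr(Z)
    return Q

def projector(frame):
    """Rank-N projector onto the span of the frame's columns."""
    return frame @ frame.conj().T

rng = np.random.default_rng(1)
V = random_frame(5, 2, rng)          # a point of the Stiefel bundle (assumed n + 1 = 5, N = 2)
U = random_unitary(2, rng)           # a U(N) gauge transformation
P1, P2 = projector(V), projector(V @ U)
print(np.allclose(P1, P2), np.allclose(V, V @ U))
```

The two frames `V` and `V @ U` differ, yet they define the same projector, that is, the same point of the Grassmannian; for N = 1 the matrix U reduces to a phase factor.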

Now, according to the powerful theorem of Narasimhan
and Ramanan, if we have a U(N) bundle $\xi_N$
over the M-dimensional parameter space $\mathcal{M}$, then
there exists an integer $n_0(N, M)$ such that for $n \ge n_0$
there exists a map $f : \mathcal{M} \to Gr(n+1, N)$ such that
$\xi_N = f^*(V(n+1, N))$. Moreover, given any two such
maps f and g, the corresponding pullback bundles are
isomorphic if and only if f is homotopic to g.

For the examples of the sections "Berry's phase"
and "The Aharonov-Anandan phase," we have
N = 1, n = 1, and M = 2. Since the maps $f_r$ and $g_{r,s}$,
defined by the rank-1 spectral projectors of H(x)
and h(x) for $0 \le s < 1$, are homotopic, the corresponding
pullback bundles $\xi_1$ and $\zeta_1$ are isomorphic.
Moreover, the Berry and AA connections are the
pullbacks of the universal connection on
$V(n+1, 1) = \eta_1$, which is just Pancharatnam's connection.

For the infinite-dimensional case, one can define
$Gr(\infty, N)$ by taking the union over n with respect to the natural
inclusion maps of $Gr(n, N)$ into $Gr(n+1, N)$.
We denote this universal classifying bundle $V(\infty, N)$
as $\eta$. Then we see that, given an N-dimensional
eigensubspace bundle over $\mathcal{M}$ and a map
$f_r : x \in \mathcal{M} \mapsto P_r(x) \in Gr(\infty, N)$ defined by the physical
situation, the geometry of evolving eigensubspaces
can be understood in terms of the holonomy of the
pullback of the universal connection on $\eta$.


Conclusions 


In this article, we have elucidated the mathematical origin of
geometric phases. We have seen that the key observation
is the fact that the space of rays $\mathcal{P}$ represents
unambiguously the physical states of a quantum
system. The particular representatives of a class in $\mathcal{P}$
belonging to the usual Hilbert space $\mathcal{H}$ form (local)
sections of a U(1) bundle $\eta_1$. Based on the physical
notions of transition probability and interference, $\eta_1$
can be furnished with extra structures: the metric and
the connection, the latter giving rise to a natural
definition of parallel transport. We have seen that the
geodesics of $\mathcal{P}$ with respect to the metric play a
fundamental role in approximating evolutions of any
kind, giving rise to a curve in $\mathcal{P}$.

The geometric structures of $\eta_1$ induce similar
structures for pullback bundles. These bundles encapsulate
the geometric details of time evolutions generated
by Hamiltonians that depend on a set of
parameters x belonging to a manifold $\mathcal{M}$. It was
shown that the famous examples of Berry and AA
phases arise as important special cases in this
formalism. A generalization to evolving N-dimensional
subspaces based on the theory of universal
connections can also be given. This shows that the
basic structure responsible for the occurrence of
anholonomy effects in evolving quantum systems is
the universal bundle $\eta$, which is the bundle of subspaces
of arbitrary dimension N in a Hilbert space.

The important issue of applying the idea of
anholonomy to physical problems has not been
dealt with in this article. There are spectacular
applications such as holonomic quantum computation,
the gauge kinematics of deformable bodies,
the quantum Hall effect, and fractional spin and statistics.
The interested reader should consult the vast
literature on the subject or, as a first glance, the
book of Shapere and Wilczek (1989).


See also: Fractional Quantum Hall Effect; Geometric
Measure Theory; Holomorphic Dynamics; Moduli
Spaces: An Introduction.


Introduction


The equations of geophysical fluid dynamics are the
equations governing the motion of the atmosphere and
the ocean, and are derived from the conservation
equations from physics, namely conservation of mass,
momentum, energy, and some other components such
as salt for the ocean, humidity (or chemical pollutants)
for the atmosphere.

The first assumption used in any circulation 
model is the well-accepted Boussinesq approxima- 
tion, that is, the density differences are neglected in 
the system except in the buoyancy term and in the 
equation of state. The resulting system is the so- 
called Boussinesq equations (Pedlosky 1987). Due to 
the extremely high accuracy of this approximation, 
these equations are considered as the basic equations 


in geophysical dynamics. From the computational 
point of view, however, the Boussinesq equations 
are still not accessible. 

Owing to the difference of sizes of the vertical and 
horizontal dimensions, both in the atmosphere and in 
the ocean (10-20 km versus several thousands of 
kilometers), the second approximation is based on the 
smallness of the vertical length scales with respect to 
the horizontal length scales, that is, oceans (and the 
atmosphere) compose very thin layers. The scale 
analysis ensures that the dominant forces in the 
vertical-momentum equation come from the pressure 
gradient and the gravity. This leads to the so-called 
hydrostatic approximation, which amounts to repla- 
cing the vertical component of the momentum equa- 
tion by the hydrostatic balance equation, and hence 
leading to the well-accepted primitive equations (PEs) 
(Washington and Parkinson 1986). As far as we 
know, the primitive equations were first considered 
by LF Richardson (1922); when it appeared that 
they were still too complicated they were left out 
and, instead, attention was focused on even simpler 
models, the geostrophic and quasigeostrophic mod- 
els, considered in the late 1940s by J von Neumann 
and his collaborators, in particular J G Charney. 
With the increase of computing power, interest 
eventually returned to the PEs, which are now the 
core of many global circulation models (GCMs) or 
ocean global circulation models (OGCMs), avail- 
able at the National Center for Atmospheric 
Research (NCAR) and elsewhere. GCMs and 
OGCMs are very complex models which contain 
many components, but still, the PEs are the central 
component for the dynamics of the air or the water. 
Further approximations based on the fast rotation
of the Earth, implying the smallness of the Rossby
number, lead to the quasigeostrophic and geostrophic
equations (Pedlosky 1987).

The mathematical study of the PEs was initiated by 
Lions, Temam, and Wang in the early 1990s. They 
produced a mathematical formulation of the PEs 
which resembles that of the Navier-Stokes due to 
Leray, and obtained the existence, for all time, of weak 
solutions (see Lions et al. 1992a, b, 1993, 1995). 
Further works conducted during the 1990s have
improved and supplemented these early results, bringing
the mathematical theory of the PEs to the level of that of the
three-dimensional incompressible Navier-Stokes
equations (Constantin and Foias 1998, Temam 2001).
In summary, the following results are now available
and will be presented in this article:


1. existence of weak solutions for all time; 

2. existence of strong solutions in space dimension 
three, local in time; 

3. existence and uniqueness of a strong solution in
space dimension two, for all time; and

4. uniqueness of weak solutions in space dimension
two.


The PEs of the Ocean 


The ocean is made up of a slightly compressible
fluid subject to a Coriolis force. The full set of
equations of the large-scale ocean are the following:
the conservation of momentum equation, the continuity
equation (conservation of mass), the thermodynamics
equation, the equation of state, and the
equation of diffusion for the salinity S:

\[ \rho \frac{dV_3}{dt} + 2\rho\,\Omega \times V_3 + \nabla_3 p + \rho\,\mathbf{g} = D \qquad [1] \]
\[ \frac{d\rho}{dt} + \rho\,\mathrm{div}_3 V_3 = 0 \qquad [2] \]
\[ \frac{dT}{dt} = Q_T \qquad [3] \]
\[ \frac{dS}{dt} = Q_S \qquad [4] \]
\[ \rho = f(T, S, p) \qquad [5] \]

Here $V_3$ is the three-dimensional velocity vector,
$V_3 = (u, v, w)$; $\rho$, p, T are, respectively, the density,
pressure, and temperature, and S is the concentration
of salinity; $\mathbf{g} = (0, 0, g)$ is the gravity vector, D
the molecular dissipation, and $Q_T$ and $Q_S$ are the heat
and salinity diffusions, respectively.


Remark 1 The equation of state for the oceans is
derived on a phenomenological basis. Only empirical
forms of the function f(T, S, p) are known (see
Washington and Parkinson (1986)). It is natural,
however, to expect that $\rho$ decreases if T increases and
that $\rho$ increases if S increases. The simplest law is

\[ \rho = \rho_0(1 - \beta_T(T - T_r) + \beta_S(S - S_r)) \qquad [6] \]

corresponding to a linearization around reference
values $\rho_0$, $T_r$, $S_r$ of, respectively, the density, temperature,
and salinity; $\beta_T$ and $\beta_S$ are positive
expansion coefficients.


The Mach number for the flow in the ocean is not
large and, therefore, as a starting point, we can
make the so-called Boussinesq approximation in
which the density is assumed constant, $\rho = \rho_0$,
except in the buoyancy term and in the equation of
state. This amounts to replacing [1], [2] by

\[ \rho_0 \frac{dV_3}{dt} + 2\rho_0\,\Omega \times V_3 + \nabla_3 p + \rho\,\mathbf{g} = D \qquad [7] \]

\[ \mathrm{div}_3 V_3 = 0 \qquad [8] \]


Furthermore, since for the large-scale ocean the horizontal
scale is much larger than the vertical one, a scale
analysis (Pedlosky 1987) shows that $\partial p/\partial z$ and $\rho g$ are
the dominant terms in the vertical-momentum equation,
leading to the hydrostatic approximation

\[ \frac{\partial p}{\partial z} = -\rho g \qquad [9] \]


For mid-latitude regional studies, it is usual to
consider the beta-plane approximation of the equations.
Thus, we assume that the ocean fills a domain $\mathcal{M}_\varepsilon$ of $\mathbb{R}^3$.
The top of the ocean is a domain $\Gamma_i$ included in the
surface of the earth $S_a$ (sphere of radius a centered at
0). The bottom $\Gamma_b$ of the ocean is defined by
$z = -\varepsilon h(\theta, \varphi)$ (with $z = x_3 = r - a$), where $\varepsilon > 0$ is a positive
parameter. It is introduced to take into consideration the
smallness of the vertical scales compared to the
horizontal scales. h is a function of class $C^2$ at least on
$\bar{\Gamma}_i$; it is assumed also that h is bounded from below, that
is, $0 < \underline{h} \le h(\theta, \varphi) \le \bar{h}$, $(\theta, \varphi) \in \Gamma_i$. The lateral surface
$\Gamma_l$ consists of the part of the cylinder $\{(\theta, \varphi) \in \partial\Gamma_i,\ -\varepsilon h(\theta, \varphi) \le z \le 0\}$.
The PEs of the ocean are given by


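\[ \frac{\partial v}{\partial t} + \nabla_v v + w\frac{\partial v}{\partial z} + \frac{1}{\rho_0}\nabla p + 2\Omega\sin\theta\, k \times v - \mu_v \Delta v - \nu_v \frac{\partial^2 v}{\partial z^2} = F_v \qquad [10] \]
\[ \frac{\partial p}{\partial z} = -\rho g, \qquad \mathrm{div}\, v + \frac{\partial w}{\partial z} = 0 \qquad [11] \]
\[ \frac{\partial T}{\partial t} + \nabla_v T + w\frac{\partial T}{\partial z} - \mu_T \Delta T - \nu_T \frac{\partial^2 T}{\partial z^2} = F_T \qquad [12] \]
\[ \frac{\partial S}{\partial t} + \nabla_v S + w\frac{\partial S}{\partial z} - \mu_S \Delta S - \nu_S \frac{\partial^2 S}{\partial z^2} = F_S \qquad [13] \]
\[ \mathrm{div} \int_{-h}^{0} v\,dz = 0 \qquad [14] \]
\[ p = p_s + P, \qquad P = P(T, S) = g\int_z^0 \rho\,dz' \qquad [15] \]
\[ \rho = \rho_0(1 - \beta_T(T - T_r) + \beta_S(S - S_r)) \qquad [16] \]
\[ \int_{\mathcal{M}_\varepsilon} S\,d\mathcal{M}_\varepsilon = 0 \qquad [17] \]

where v is the horizontal velocity of the water, w is the
vertical velocity, and $T_r$, $S_r$ are averaged (or reference)
values of T and S. The diffusion coefficients $\mu_v$, $\mu_T$, $\mu_S$
and $\nu_v$, $\nu_T$, $\nu_S$ are different in the horizontal and
vertical directions, accounting for some eddy diffusions
in the sense of Smagorinsky (1962). Note that
$F_v$, $F_T$, and $F_S$ correspond to volumic sources of
horizontal momentum, heat, and salt, respectively.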
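
To make the roles of the linear equation of state [16] and the hydrostatic pressure [15] concrete, here is a minimal numerical sketch. It assumes a single water column sampled on a vertical grid; the coefficient values, the temperature profile, and the function names are hypothetical placeholders chosen only for illustration, not values taken from the article.

```python
import numpy as np

def density(T, S, rho0=1025.0, T_r=10.0, S_r=35.0, beta_T=2.0e-4, beta_S=7.6e-4):
    """Linear equation of state [16]; all default coefficients are illustrative assumptions."""
    return rho0 * (1.0 - beta_T * (T - T_r) + beta_S * (S - S_r))

def hydrostatic_pressure(z, rho, g=9.81, p_s=0.0):
    """p = p_s + P with P(z) = g * integral from z to 0 of rho, cf. [15].

    z must be increasing and end at the surface z = 0; the integral is
    approximated with the trapezoidal rule on each grid cell.
    """
    z = np.asarray(z, dtype=float)
    rho = np.asarray(rho, dtype=float)
    cell = 0.5 * (rho[1:] + rho[:-1]) * np.diff(z)   # integral of rho over each cell
    P = np.zeros_like(z)
    P[:-1] = g * np.cumsum(cell[::-1])[::-1]          # sum of all cells above each level
    return p_s + P

z = np.linspace(-1000.0, 0.0, 101)                    # depth grid in meters (assumed)
T = 10.0 + 10.0 * np.exp(z / 200.0)                   # hypothetical warm surface layer
S = np.full_like(z, 35.0)
p = hydrostatic_pressure(z, density(T, S))
print(p[0], p[-1])
```

Because the pressure is obtained by integrating the density downward from the surface, only the surface pressure $p_s$ remains as an unknown function of the horizontal variables, which is what the decomposition [15] expresses.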


Boundary conditions 


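There are several sets of natural boundary conditions
that one can associate to the PEs; for instance,
the following:

On the top of the ocean $\Gamma_i$ (z = 0)

\[ \nu_v \frac{\partial v}{\partial z} + \alpha_v(v - v^a) = \tau_v, \qquad w = 0, \qquad \nu_T \frac{\partial T}{\partial z} + \alpha_T(T - T^a) = 0, \qquad \frac{\partial S}{\partial z} = 0 \qquad [18] \]

At the bottom of the ocean $\Gamma_b$ ($z = -h(\theta, \varphi)$)

\[ v = 0, \qquad w = 0, \qquad \frac{\partial T}{\partial n_T} = 0, \qquad \frac{\partial S}{\partial n_S} = 0 \qquad [19] \]

On the lateral boundary $\Gamma_l = \{-h(\theta, \varphi) < z < 0,\ (\theta, \varphi) \in \partial\Gamma_i\}$

\[ v = 0, \qquad w = 0, \qquad \frac{\partial T}{\partial n_T} = 0, \qquad \frac{\partial S}{\partial n_S} = 0 \qquad [20] \]

Here $n = (n_H, n_z)$ is the unit outward normal on
$\partial\mathcal{M}_\varepsilon$ decomposed into its horizontal and vertical
components; the conormal derivatives $\partial/\partial n_T$ and
$\partial/\partial n_S$ are those associated with the linear (temperature
and salinity) operators,

\[ \frac{\partial}{\partial n_T} = \mu_T\, n_H \cdot \nabla + \nu_T\, n_z \frac{\partial}{\partial z}, \qquad \frac{\partial}{\partial n_S} = \mu_S\, n_H \cdot \nabla + \nu_S\, n_z \frac{\partial}{\partial z} \qquad [21] \]

Equations [10]-[17] with boundary conditions
[18]-[20] are supplemented with the initial conditions

\[ v|_{t=0} = v_0, \qquad T|_{t=0} = T_0, \qquad S|_{t=0} = S_0 \qquad [22] \]

where $v_0$, $T_0$, $S_0$ are given initial data.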

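Following the work of Lions et al. (1992a, b,
1993, 1995) (see also Temam and Ziane (2004)),
we introduce the following function spaces
$V = V_1 \times V_2 \times V_3$, $H = H_1 \times H_2 \times H_3$, where

\[ V_1 = \left\{ v \in H^1(\mathcal{M}_\varepsilon)^2,\ \mathrm{div}\int_{-h}^{0} v\,dz = 0,\ v = 0 \text{ on } \Gamma_b \cup \Gamma_l \right\} \]
\[ V_2 = H^1(\mathcal{M}_\varepsilon) \]
\[ V_3 = \dot{H}^1(\mathcal{M}_\varepsilon) = \left\{ S \in H^1(\mathcal{M}_\varepsilon),\ \int_{\mathcal{M}_\varepsilon} S\,d\mathcal{M}_\varepsilon = 0 \right\} \]
\[ H_1 = \left\{ v \in L^2(\mathcal{M}_\varepsilon)^2,\ \mathrm{div}\int_{-h}^{0} v\,dz = 0,\ n_H \cdot \int_{-h}^{0} v\,dz = 0 \text{ on } \partial\Gamma_i \text{ (i.e., on } \Gamma_l) \right\} \]
\[ H_2 = L^2(\mathcal{M}_\varepsilon) \]
\[ H_3 = \dot{L}^2(\mathcal{M}_\varepsilon) = \left\{ S \in L^2(\mathcal{M}_\varepsilon),\ \int_{\mathcal{M}_\varepsilon} S\,d\mathcal{M}_\varepsilon = 0 \right\} \]

The global existence of weak solutions is established
in Lions et al. (1992b), using the Galerkin
method and assuming the $H^2$-regularity of the GFD-Stokes
problem, which was established in Ziane
(1995). A more general global existence result, based
on the method of finite differences in time and
independent of the $H^2$-regularity, is established in
Temam and Ziane (2004), which we state here.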


Theorem 2 Given $t_1 > 0$, $U_0$ in H, and $F = (F_v, F_T, F_S)$
in $L^2(0, t_1; H)$; $g = (g_v, g_T)$ is given in
$L^2(0, t_1; (L^2(\Gamma_i))^3)$. Then there exists

\[ U \in L^\infty(0, t_1; H) \cap L^2(0, t_1; V) \qquad [23] \]

which is a weak solution of [10]-[17] and [18]-[20],
[22]; furthermore, U is weakly continuous from
$[0, t_1]$ into H.


Strong Solutions 


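The local existence and uniqueness of strong
solutions of the primitive equations of the ocean
rely on the $H^2$-regularity of the stationary linear
primitive equations associated to [10]-[17]:

\[ \frac{1}{\rho_0}\nabla p + 2\Omega\sin\theta\, k \times v - \mu_v \Delta v - \nu_v \frac{\partial^2 v}{\partial z^2} = F_v, \qquad \mathrm{div}\int_{-h}^{0} v\,dz = 0 \qquad [24] \]
\[ -\mu_T \Delta T - \nu_T \frac{\partial^2 T}{\partial z^2} = F_T, \qquad -\mu_S \Delta S - \nu_S \frac{\partial^2 S}{\partial z^2} = F_S \qquad [25] \]
\[ p = p_s + P, \qquad P = P(T, S) = g\int_z^0 \rho\,dz' \qquad [26] \]

with boundary conditions [18]-[20]. Here $F_v$, $F_T$, $F_S$
are independent of time. We have the following
$H^2$-regularity of solutions (Ziane 1995, Hu et al.
2002, Temam and Ziane 2004).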


Theorem 3 Assume that h is in C*(T;),b » b» 0, 
Fy, Fr; Fs € (L^(M. y? and Sv OPEP + sie SI = Ca Ta 
€ (HÌ(T;))*. Let (v, T, S; p) € (HHME)) x L?(T;) be 
a weak solution of [24]-| 26]. Then 


(v,p) € (HA(M.)) xH! (Me) 
(T, S) e (H?(M.))’ 


Moreover, the following inequalities hold: 


i) 


2 2 2 2 
[Tlam < C||Fr| + [gri + e|Varliae) 
2 
ISl) S < C[ESP 


[27] 


2 2 
lee.) + &ip[rm (T;) 


< C||Ful? + |gvlr 


1p.) +E LAN 


where C is a positive constant independent of $\varepsilon$.

We now turn our attention to the nonlinear time-dependent
PEs. The local-in-time existence and
uniqueness of strong solutions is obtained in 
Temam and Ziane (2004); see also Hu et al. 
(2003) and Guillén-Gonzalez et al. (2001). The 
proof is more involved than that of the three- 
dimensional Navier-Stokes equations. It consists of 
several steps. In the first step, one proves the global 
existence of strong solutions to the linearized time- 
dependent problem. In the second step, one uses the 
solution of the linearized equation in order to reduce 
the PEs to a nonlinear evolution equation with zero 
initial data and homogeneous boundary conditions. 
Finally, in the last step, one uses nonisotropic 
Sobolev inequalities together with Theorem 3. The 
local existence result is given by the following: 


Theorem 4 Let £ >Q be given. We assume that T; 
is of class C? and that h:T;-+R. is of class C. We 
are given Uo in V, F=(Fy, Fr, Fs) in L^(0, ti; H) with 
OF/Ot in 17(0,t1;L?(M.)"), and g=(gv,gr) in 
L?(0, tı; Hi(Ti) with Ag/At in L?(0,t);H}(T;)°). 
Then there exists t, > 0,t, —t,(|Uo|), and there 
exists a unique solution U = U(t) — (v(t), T(t), S(t)) 
of the PEs [10]-[17], [18]-[20], and [22] such that 


U € C([0, t]; V) n L?(0,2,, H7(M-)*) [28] 


The PEs of the Atmosphere 


In this section we briefly describe the PEs of the
atmosphere, for which all the mathematical results
obtained for the PEs of the ocean are valid. We start
from conservation equations similar to [1]-[5]; in
fact [1] and [2] are the same; the equation of energy
conservation (temperature) is slightly different from
[3] because of the compressibility of air; the state
equation is that of a perfect gas instead of [5]; finally,
instead of the concentration of salt in the water, we
consider the amount of water in air, q. Hence, we have

\[ \rho \frac{dV_3}{dt} + 2\rho\,\Omega \times V_3 + \nabla_3 p + \rho\,\mathbf{g} = D \qquad [29] \]
\[ \frac{d\rho}{dt} + \rho\,\mathrm{div}_3 V_3 = 0 \qquad [30] \]
\[ c_p \frac{dT}{dt} - \frac{RT}{p}\frac{dp}{dt} = Q_T \qquad [31] \]
\[ p = R\rho T \]


Here $c_p > 0$ is the specific heat of air at constant
pressure, and R is the specific gas constant for the
air. Proceeding as in the PEs of the ocean, we
decompose $V_3$ into its horizontal and vertical
components, $V_3 = (v, w)$; then we use the hydrostatic
approximation, replacing the equation of
conservation of vertical momentum by the hydrostatic
equation [9]. We find


OU a xp p- ast iat 
Ot 1 0 V po 
+20 sin @ x v — py Av — v, OF [32] 
Hv “a E 
op — 
g 5 [33] 
OT OT 
OL 一 一 十 Vy of hee pr Or 
OT RT dp 
Tog p dt = Qr 4] 
ôq = "c 
p = RpT [36] 


The right-hand side of [34] represents the solar
heating.


Change of Vertical Coordinate 


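Since $\rho$ does not vanish, the hydrostatic equation
[33] implies that p is a strictly decreasing function of
z, and we are thus allowed to use p as the vertical
coordinate; hence in spherical geometry the independent
variables are now $\varphi$, $\theta$, p, and t. By an abuse
of notation, we still denote by v, $\rho$, T, q these
functions expressed in the $\varphi$, $\theta$, p, t variables. We
denote by $\omega$ the vertical component of the wind in
the new variables, and one can show that the PEs of
the atmosphere become

\[ \frac{\partial v}{\partial t} + \nabla_v v + \omega\frac{\partial v}{\partial p} + 2\Omega\sin\theta\, k \times v + \nabla\Phi - L_v v = F_v \qquad [37] \]
\[ \frac{\partial \Phi}{\partial p} + \frac{R}{p}T = 0 \qquad [38] \]
\[ \mathrm{div}\, v + \frac{\partial \omega}{\partial p} = 0 \qquad [39] \]
\[ \frac{\partial T}{\partial t} + \nabla_v T + \omega\frac{\partial T}{\partial p} - \frac{R\bar{T}}{c_p p}\,\omega - L_T T = F_T \qquad [40] \]
\[ \frac{\partial q}{\partial t} + \nabla_v q + \omega\frac{\partial q}{\partial p} - L_q q = F_q \qquad [41] \]
\[ p = R\rho T \qquad [42] \]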


We have denoted by $\Phi = gz$ the geopotential (z is
now a function of $\varphi$, $\theta$, p, t); $L_v$, $L_T$, $L_q$ are the Laplace
operators, with suitable eddy viscosity coefficients,
expressed in the $\varphi$, $\theta$, p variables. Hence, for example,

\[ L_v v = \mu_v \Delta v + \nu_v \frac{\partial}{\partial p}\left[\left(\frac{gp}{R\bar{T}}\right)^2 \frac{\partial v}{\partial p}\right] \qquad [43] \]

with similar expressions for $L_T$ and $L_q$. Note that
$F_T$ corresponds to the heating of the Sun, whereas $F_v$
and $F_q$ (which vanish in reality) are added here for
mathematical generality. The change of variable gives,
for $\partial^2 v/\partial z^2$, a term different from the coefficient of $\nu_v$.
The expression above is a simplified form of this coefficient;
the simplification is legitimate because $\nu_v$ is a very
small coefficient (in particular, T has been replaced by
$\bar{T}$, a (known) average value of the temperature).
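
The hydrostatic relation [38] in the pressure coordinate can be recovered in two lines. The following is a short worked derivation, offered as a sketch: it assumes only the hydrostatic balance $\partial p/\partial z = -\rho g$, the perfect-gas law $p = R\rho T$, and the definition $\Phi = gz$; the isothermal example at the end, with T replaced by the constant $\bar{T}$, is an added illustration.

```latex
\begin{align*}
\frac{\partial z}{\partial p}
  &= \left(\frac{\partial p}{\partial z}\right)^{-1}
   = -\frac{1}{\rho g}
   = -\frac{RT}{g\,p},
  && \text{using } \rho = \frac{p}{RT}, \\
\frac{\partial \Phi}{\partial p}
  &= g\,\frac{\partial z}{\partial p}
   = -\frac{R}{p}\,T
  && \Longrightarrow\quad \frac{\partial \Phi}{\partial p} + \frac{R}{p}\,T = 0 .
\end{align*}
% Illustration: for a constant temperature T = \bar{T}, integrating from p_1 to p gives
\[
\Phi(p) - \Phi(p_1) = R\bar{T}\,\ln\frac{p_1}{p}, \qquad p_0 < p < p_1 ,
\]
% so the geopotential increases as the pressure decreases.
```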


Pseudogeometrical Domain 


For physical and mathematical reasons, we do not allow
the pressure to go to zero, and assume that $p > p_0$, with
$p_0 > 0$ "small." Physically, in the very high atmosphere
(p very small), the air is ionized and the equations above
are no longer valid. The pressure is then restricted to
an interval $p_0 < p < p_1$, where $p_1$ is a value of the
pressure smaller on average than the pressure on Earth,
so that the isobar $p = p_1$ is slightly above the Earth and
the isobar $p = p_0$ is an isobar high in the sky. We study
the motion of the air between these two isobars.

For the whole atmosphere, the boundary of this
domain

\[ \mathcal{M} = \{(\varphi, \theta, p),\ p_0 < p < p_1\} \]

consists first of an upper part $\Gamma_u$, $p = p_0$; the lower
part $p = p_1$ is divided into two parts: $\Gamma_i$, the part of
$p = p_1$ at the interface with the ocean, and $\Gamma_e$, the
part of $p = p_1$ above the earth.


Boundary Conditions 


Typically, the boundary conditions are as follows: 


On the top of the atmosphere $\Gamma_u$ ($p = p_0$)

\[ \frac{\partial v}{\partial p} = 0, \qquad \omega = 0, \qquad \frac{\partial T}{\partial p} = 0, \qquad \frac{\partial q}{\partial p} = 0 \qquad [44] \]


Definitions and an Example


A gerbe can be viewed as a next step in a ladder of 
geometric and topological objects on a manifold 
which starts from ordinary complex-valued func- 
tions and in the second step of sections of complex 
line bundles. 

It is useful to recall the construction of complex
line bundles and their connections. Let M be a
smooth manifold and $\{U_\alpha\}$ an open cover of M
which trivializes a line bundle L over M. Topologically,
up to equivalence, the line bundle is completely
determined by its Chern class, which is a
cohomology class $[c] \in H^2(M, \mathbb{Z})$. On each open
set $U_\alpha$ we may write $2\pi i c = dA_\alpha$, where $A_\alpha$ is a
1-form. On the overlaps $U_{\alpha\beta} = U_\alpha \cap U_\beta$ we can write

\[ A_\alpha - A_\beta = f_{\alpha\beta}^{-1}\, df_{\alpha\beta} \qquad [1] \]

at least when $U_{\alpha\beta}$ is contractible, where $f_{\alpha\beta}$ is a
circle-valued complex function on the overlap. The
data $\{c, A_\alpha, f_{\alpha\beta}\}$ define what is known as a (representative
of a) Deligne cohomology class on the
open cover $\{U_\alpha\}$. The 1-forms $A_\alpha$ are the local
potentials of the curvature form $2\pi i c$ and the $f_{\alpha\beta}$'s
are the transition functions of the line bundle L.
Each of these three different data defines separately
the equivalence class of the line bundle, but together
they define the line bundle with a connection.

The essential thing here is that there is a bijection 
between the second integral cohomology of M and 
the set of equivalence classes of complex line bundles 
over M. It is natural to ask whether there is a 
geometric realization of integral third (or higher) 
cohomology. In fact, gerbes provide such a realiza- 
tion. Here, we shall restrict to a smooth differential 
geometric approach which by no means is the most 
general possible, but it is sufficient for most applica- 
tions to quantum field theory. However, there are 
examples of gerbes over orbifolds that do not need to 
come from finite group action on a manifold, which 
are not covered by the following definition. 

For the examples in this article, it is sufficient to
adopt the following definition. A gerbe over a
manifold $M$ (without geometry) is simply a
principal bundle $\pi : P \to M$ with fiber equal to
$PU(H)$, the projective unitary group of a Hilbert
space $H$. The Hilbert space may be either finite or
infinite dimensional.

The quantum field theory applications discussed
in this article are related to the chiral anomaly for
fermions in external fields. The link comes from
the fact that the chiral symmetry breaking leads in 
the generic case to projective representations of the 
symmetry groups. For this reason, when modding 
out by the gauge or diffeomorphism symmetries, 
one is led to study bundles of projective Hilbert 
spaces. The anomaly is reflected as a nontrivial 
characteristic class of the projective bundle, 
known in mathematics literature as the Dixmier- 
Douady class. 

In a suitable open cover, the bundle $P$ has a family
of local trivializations with transition functions
$g_{\alpha\beta} : U_{\alpha\beta} \to PU(H)$, with the usual cocycle property

$$g_{\alpha\beta}\, g_{\beta\gamma}\, g_{\gamma\alpha} = 1 \qquad [2]$$

on triple overlaps. Assuming that the overlaps are
contractible, we can choose lifts $\hat g_{\alpha\beta} : U_{\alpha\beta} \to U(H)$
to the unitary group of the Hilbert space. However,

$$\hat g_{\alpha\beta}\, \hat g_{\beta\gamma}\, \hat g_{\gamma\alpha} = f_{\alpha\beta\gamma} \cdot 1 \qquad [3]$$

where the $f$'s are circle-valued functions on triple
overlaps. They automatically satisfy the cocycle
property

$$f_{\alpha\beta\gamma}\, f_{\alpha\beta\delta}^{-1}\, f_{\alpha\gamma\delta}\, f_{\beta\gamma\delta}^{-1} = 1 \qquad [4]$$

on quadruple overlaps. There is an important difference
between the finite- and infinite-dimensional
cases. In the finite-dimensional case, the circle
bundle $U(H) \to U(H)/S^1 = PU(H)$ reduces to a
bundle with fiber $\mathbb{Z}/N\mathbb{Z} = \mathbb{Z}_N$, where $N = \dim H$.
This follows from $U(N)/S^1 = SU(N)/\mathbb{Z}_N$ and the fact
that $SU(N)$ is a subgroup of $U(N)$. For this reason
one can choose the lifts $\hat g_{\alpha\beta}$ such that the functions
$f_{\alpha\beta\gamma}$ take values in the finite subgroup $\mathbb{Z}_N \subset S^1$.

The functions $f_{\alpha\beta\gamma}$ define an element $a = \{a_{\alpha\beta\gamma\delta}\}$ in
the Čech cohomology $H^3(M, \mathbb{Z})$ by a choice of
logarithms,

$$2\pi i\, a_{\alpha\beta\gamma\delta} = \log f_{\alpha\beta\gamma} - \log f_{\alpha\beta\delta} + \log f_{\alpha\gamma\delta} - \log f_{\beta\gamma\delta} \qquad [5]$$

In the finite-dimensional case, the Čech cocycle is
necessarily torsion, $Na = 0$, but not so if $H$ is infinite
dimensional. In the finite-dimensional case (by passing
to a good cover and using the Čech-de Rham
equivalence over real or complex numbers), the class in
the third de Rham cohomology constructed from the
transition functions is necessarily zero. Thus, in
general one has to work with Čech cohomology to
preserve torsion information. One can prove:


Theorem The construction above is a one-to-one
map between the set of equivalence classes of $PU(H)$
bundles over $M$ and elements of $H^3(M, \mathbb{Z})$.


The characteristic class in $H^3(M, \mathbb{Z})$ of a $PU(H)$
bundle is called the Dixmier-Douady class.
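
The cocycle property [4] and the integrality behind [5] can be checked numerically in a small finite-dimensional toy model. The sketch below is only an illustration under stated assumptions: it works at a single point of a quadruple overlap, takes a trivial bundle with hypothetical local frames `h[a]`, and multiplies the lifts by arbitrary phases `c[a, b]`. None of these names come from the article, and the class obtained this way is trivial; the point is only that any choice of lifts yields circle-valued defects obeying [4].

```python
import cmath
import itertools

import numpy as np

rng = np.random.default_rng(0)
N = 3  # dim H in this finite-dimensional toy model


def random_unitary(n):
    """Unitary matrix from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(z)
    return q


# Hypothetical data at one point of a quadruple overlap of U_0, ..., U_3:
# local frames h[a] and arbitrary phases c[a, b] with c[b, a] = 1 / c[a, b].
h = {a: random_unitary(N) for a in range(4)}
c = {}
for a, b in itertools.combinations(range(4), 2):
    c[a, b] = cmath.exp(1j * rng.uniform(0.0, 2.0 * np.pi))
    c[b, a] = 1.0 / c[a, b]


def g_hat(a, b):
    """A chosen lift to U(N) of the PU(N)-valued transition function g_ab."""
    return c[a, b] * (h[a] @ h[b].conj().T)


def f(a, b, g):
    """Circle-valued defect of the lifts on a triple overlap, as in [3]."""
    prod = g_hat(a, b) @ g_hat(b, g) @ g_hat(g, a)
    scalar = prod[0, 0]
    assert np.allclose(prod, scalar * np.eye(N))  # prod is a multiple of 1
    return scalar


# Cocycle property [4] on the quadruple overlap.
cocycle = f(0, 1, 2) / f(0, 1, 3) * f(0, 2, 3) / f(1, 2, 3)

# Cech cochain [5], using the principal branch of the logarithm.
a_0123 = (cmath.log(f(0, 1, 2)) - cmath.log(f(0, 1, 3))
          + cmath.log(f(0, 2, 3)) - cmath.log(f(1, 2, 3))) / (2j * cmath.pi)

print(abs(cocycle - 1.0))  # numerically zero
print(a_0123)              # numerically an integer
```

Because the alternating product in [4] equals 1, the alternating sum of logarithms in [5] is an integer multiple of $2\pi i$ whatever branch is chosen, which is what the last line displays.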


First example 


Let $M$ be an oriented Riemannian manifold and $FM$
its bundle of oriented orthonormal frames. The
structure group of $FM$ is the rotation group $SO(n)$
with $n = \dim M$. The spin bundle (when it exists) is a
double covering $\mathrm{Spin}(M)$ of $FM$, with structure
group $\mathrm{Spin}(n)$, a double cover of $SO(n)$. Even when
the spin bundle does not exist there is always the
bundle $Cl(M)$ of Clifford algebras over $M$. The fiber
at $x \in M$ is the Clifford algebra defined by the
metric $g_x$, that is, it is the complex $2^n$-dimensional
algebra generated by the tangent vectors $v \in T_x(M)$
with the defining relations

$$\gamma(u)\gamma(v) + \gamma(v)\gamma(u) = 2 g_x(u, v)$$

The Clifford algebra has a faithful representation in
$N = 2^{[n/2]}$ dimensions ($[x]$ is the integral part of $x$)
such that

$$\gamma(a \cdot u) = S(a)\gamma(u)S(a)^{-1}$$

where $S$ is a unitary representation of $\mathrm{Spin}(n)$ in
$\mathbb{C}^N$. Since $\mathrm{Spin}(n)$ is a double cover of $SO(n)$, the
representation $S$ may be viewed as a projective
representation of $SO(n)$. Thus again, if the overlaps
$U_{\alpha\beta}$ are contractible, we may choose a lift of the
frame bundle transition functions $g_{\alpha\beta}$ to unitaries
$\hat g_{\alpha\beta}$ in $H = \mathbb{C}^N$. In this case, the functions $f_{\alpha\beta\gamma}$ reduce
to $\mathbb{Z}_2$-valued functions, and the obstruction to the
lifting problem, which is the same as the obstruction
to the existence of a spin structure, is an element of
$H^2(M, \mathbb{Z}_2)$, known as the second Stiefel-Whitney
class $w_2$. The image of $w_2$ with respect to the
Bockstein map (in this case, given by the formula
[5]) gives a 2-torsion element in $H^3(M, \mathbb{Z})$, the
Dixmier-Douady class.
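
The defining relations of the Clifford algebra can be realized concretely in $2^{[n/2]}$ dimensions. The following sketch assumes a Euclidean metric written in an orthonormal frame, so that $g_x(u, v)$ is the ordinary dot product, and builds the generators from Kronecker products of Pauli matrices (a Jordan-Wigner type construction). The helper names `gamma_matrices` and `gamma` are illustrative and not taken from the article.

```python
import numpy as np


def gamma_matrices(n):
    """Generators g_1..g_n with g_i g_j + g_j g_i = 2 delta_ij on C^(2^[n/2])."""
    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
    sz = np.array([[1, 0], [0, -1]], dtype=complex)
    one = np.eye(2, dtype=complex)
    m = n // 2

    def kron_all(factors):
        out = np.eye(1, dtype=complex)
        for fct in factors:
            out = np.kron(out, fct)
        return out

    gammas = []
    for k in range(m):
        prefix = [sz] * k              # string of sigma_z factors
        suffix = [one] * (m - k - 1)   # identities on the remaining factors
        gammas.append(kron_all(prefix + [sx] + suffix))
        gammas.append(kron_all(prefix + [sy] + suffix))
    if n % 2 == 1:
        # In odd dimension the last generator is sigma_z on every factor.
        gammas.append(kron_all([sz] * m))
    return gammas


def gamma(u, gammas):
    """gamma(u) = sum_i u_i gamma_i for components u_i in an orthonormal frame."""
    return sum(ui * gi for ui, gi in zip(u, gammas))


n = 5
gammas = gamma_matrices(n)
rng = np.random.default_rng(0)
u, v = rng.normal(size=n), rng.normal(size=n)
lhs = gamma(u, gammas) @ gamma(v, gammas) + gamma(v, gammas) @ gamma(u, gammas)
rhs = 2.0 * np.dot(u, v) * np.eye(2 ** (n // 2))
print(np.allclose(lhs, rhs))
```

For $n = 5$ the matrices act on $\mathbb{C}^4$, in agreement with $N = 2^{[n/2]}$.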

Another way to think of a gerbe is the following
(we shall see that this arises in a natural way in
quantum field theory). There is a canonical complex
line bundle $L$ over $PU(H)$, the associated line bundle
to the circle bundle $S^1 \to U(H) \to PU(H)$. Pulling
back $L$ by the local transition functions
$g_{\alpha\beta} : U_{\alpha\beta} \to PU(H)$, we obtain a family of line bundles
$L_{\alpha\beta}$ over the open sets $U_{\alpha\beta}$. By the cocycle property
[2] we have natural isomorphisms

$$L_{\alpha\beta} \otimes L_{\beta\gamma} = L_{\alpha\gamma} \qquad [6]$$

We can take this as a definition of a gerbe over $M$:
a collection of line bundles over intersections of
open sets in an open cover of $M$, satisfying the
cocycle condition [6]. By [6] we have a trivialization

$$L_{\alpha\beta} \otimes L_{\beta\gamma} \otimes L_{\gamma\alpha} = f_{\alpha\beta\gamma} \cdot \mathbf{1} \qquad [7]$$

where the $f$'s are circle-valued functions on the
triple overlaps. By the theorem above, we conclude
that indeed the data in [6] define (an equivalence
class of) a principal $PU(H)$ bundle.

If $L_{\alpha\beta}$ and $L'_{\alpha\beta}$ are two systems of local line
bundles over the same cover, then the gerbes are
equivalent if there is a system of line bundles $L_\alpha$
over open sets $U_\alpha$ such that

$$L'_{\alpha\beta} = L_{\alpha\beta} \otimes L_\alpha^* \otimes L_\beta \qquad [8]$$

on each $U_{\alpha\beta}$.

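A gerbe may come equipped with geometry,
encoded in a Deligne cohomology class with respect
to a given open covering of $M$. The Deligne class is
given by functions $f_{\alpha\beta\gamma}$, 1-forms $A_{\alpha\beta}$, 2-forms $F_\alpha$,
and a global 3-form (the Dixmier-Douady class of
the gerbe) $\Omega$, subject to the conditions

$$dF_\alpha = 2\pi i\,\Omega, \qquad F_\alpha - F_\beta = dA_{\alpha\beta}, \qquad A_{\alpha\beta} - A_{\alpha\gamma} + A_{\beta\gamma} = f_{\alpha\beta\gamma}^{-1}\, df_{\alpha\beta\gamma} \qquad [9]$$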


Gerbes from Canonical Quantization 


Let $D_x$ be a family of self-adjoint Fredholm operators
in a complex Hilbert space $H$ parametrized by $x \in M$.
This situation arises in quantum field theory, for
example, when $M$ is some space of external fields,
coupled to a Dirac operator $D$ on a compact manifold.
The space $M$ might consist of gauge potentials
(modulo gauge transformations) or $M$ might be the
moduli space of Riemann metrics. In these examples,
the essential spectrum of $D_x$ is both positive and
negative and the family $D_x$ defines an element of
$K^1(M)$. In fact, one of the definitions of $K^1(M)$ is that
its elements are homotopy classes of maps from $M$ to
the space $\mathcal{F}_*$ of self-adjoint Fredholm operators with
both positive and negative essential spectrum. In
physics applications, one deals most often with
unbounded Hamiltonians, and the operator norm
topology must be replaced by something else; popular
choices are the Riesz topology defined by the map
$F \mapsto F/(|F| + 1)$ to bounded operators or the gap
topology defined by the graph metric.
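
As a small illustration of the map $F \mapsto F/(|F| + 1)$, the sketch below applies it to a finite truncation of the operator $-i\,d/dx$ on the circle, whose eigenvalues are the integers. The truncation `cutoff` and the helper `riesz_map` are assumptions made for this example only; the computation uses the spectral decomposition of a Hermitian matrix.

```python
import numpy as np


def riesz_map(f):
    """Apply F -> F / (|F| + 1) to a Hermitian matrix via its eigendecomposition."""
    evals, vecs = np.linalg.eigh(f)
    scaled = evals / (np.abs(evals) + 1.0)
    # vecs * scaled multiplies column j of vecs by scaled[j].
    return (vecs * scaled) @ vecs.conj().T


cutoff = 50
# Truncated -i d/dx on the circle: eigenvalues -cutoff, ..., cutoff.
d = np.diag(np.arange(-cutoff, cutoff + 1).astype(float))
b = riesz_map(d)

print(np.linalg.norm(b, 2))  # operator norm, equal to cutoff / (cutoff + 1)
```

However large the cutoff is taken, the image has norm below 1 and the signs of the eigenvalues are preserved, so the splitting into positive and negative spectrum survives the passage to bounded operators.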

The space $\mathcal{F}_*$ is homotopy equivalent to the
group $G = U_1(H)$ of unitary operators $g$ in $H$ such
that $g - 1$ is a trace-class operator. This space is a
classifying space for principal $U_{res}$ bundles, where
$U_{res}$ is the group of unitary operators $g$ in a polarized
complex Hilbert space $\mathcal{H} = \mathcal{H}_+ \oplus \mathcal{H}_-$ such that the
off-diagonal blocks of $g$ are Hilbert-Schmidt operators.
This is related to Bott periodicity. There is a
natural principal bundle $P$ over $G = U_1(H)$ with fiber
equal to the group $\Omega G$ of based loops in $G$. The total
space $P$ consists of smooth paths $f(t)$ in $G$ starting
from the neutral element such that $f^{-1}df$ is smooth
and periodic. The projection $P \to G$ is the evaluation
at the end point $f(1)$. The fiber is clearly $\Omega G$. By Bott
periodicity, the homotopy groups of $\Omega G$ are shifted
from those of $G$ by one dimension, that is,

$$\pi_n \Omega G = \pi_{n+1} G$$

The latter are zero in even dimensions and equal to $\mathbb{Z}$
in odd dimensions. On the other hand, it is known that
the even homotopy groups of $U_{res}(\mathcal{H})$ are equal to $\mathbb{Z}$
and the odd ones vanish. In fact, with a little more
effort, one can show that the embedding of $\Omega G$
into $U_{res}(\mathcal{H})$ is a homotopy equivalence, when
$\mathcal{H} = L^2(S^1, H)$, the polarization being the splitting into
non-negative and negative Fourier modes and the action of
$\Omega G$ being the pointwise multiplication on $H$-valued
functions on the circle $S^1$.

Since $P$ is contractible, it is indeed the classifying
bundle for $U_{res}$ bundles. Thus, we conclude that
"$K^1(M)$ = the set of homotopy classes of maps
$M \to G$ = the set of equivalence classes of $U_{res}$
bundles over $M$." The relevance of this fact in
quantum field theory follows from the properties of
representations of the algebra of canonical
anticommutation relations (CAR). For any complex
Hilbert space $H$, this algebra is the algebra generated
by elements $a(v)$ and $a^*(v)$, with $v \in H$, subject
to the relations

$$a^*(u)a(v) + a(v)a^*(u) = 2\langle v, u\rangle$$

where the Hilbert space inner product on the right-hand
side is antilinear in the first argument, and all
other anticommutators vanish. In addition, $a^*(u)$ is
linear and $a(v)$ antilinear in its argument.
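
For a finite-dimensional one-particle space the CAR algebra and a Dirac-type vacuum can be written down explicitly. The sketch below is an illustration under stated assumptions: $H = \mathbb{C}^4$ with a hypothetical splitting into two "positive" and two "negative" modes, and a Jordan-Wigner realization of the mode operators. The helper names are not from the article. The factor $\sqrt{2}$ matches the normalization $2\langle v, u\rangle$ used in the text.

```python
import numpy as np


def fermion_modes(n):
    """Jordan-Wigner annihilation operators c_0..c_{n-1} on a 2^n-dim Fock space."""
    sz = np.diag([1.0, -1.0]).astype(complex)
    lower = np.array([[0, 1], [0, 0]], dtype=complex)  # occupied -> empty
    one = np.eye(2, dtype=complex)
    ops = []
    for k in range(n):
        op = np.eye(1, dtype=complex)
        for fct in [sz] * k + [lower] + [one] * (n - k - 1):
            op = np.kron(op, fct)
        ops.append(op)
    return ops


n_plus, n_minus = 2, 2          # assumed dimensions of H_+ and H_-
n = n_plus + n_minus
c_ops = fermion_modes(n)


def a_star(u):
    """Creation operator, linear in u."""
    return np.sqrt(2.0) * sum(uk * ck.conj().T for uk, ck in zip(u, c_ops))


def a(v):
    """Annihilation operator, antilinear in v."""
    return np.sqrt(2.0) * sum(np.conj(vk) * ck for vk, ck in zip(v, c_ops))


# Dirac-type vacuum: H_+ modes empty, H_- modes filled.
empty, filled = np.array([1.0, 0.0]), np.array([0.0, 1.0])
vac = np.array([1.0])
for k in range(n):
    vac = np.kron(vac, empty if k < n_plus else filled)

rng = np.random.default_rng(0)
u = rng.normal(size=n) + 1j * rng.normal(size=n)
v = rng.normal(size=n) + 1j * rng.normal(size=n)
v_plus = np.concatenate([v[:n_plus], np.zeros(n_minus)])   # vector in H_+
u_minus = np.concatenate([np.zeros(n_plus), u[n_plus:]])   # vector in H_-

anticomm = a_star(u) @ a(v) + a(v) @ a_star(u)
# np.vdot conjugates its first argument, i.e. <v, u> antilinear in v.
print(np.allclose(anticomm, 2.0 * np.vdot(v, u) * np.eye(2 ** n)))
print(np.allclose(a(v_plus) @ vac, 0.0), np.allclose(a_star(u_minus) @ vac, 0.0))
```

The last line checks that the chosen vector is annihilated by $a(v)$ for $v$ in the positive subspace and by $a^*(u)$ for $u$ in the negative subspace, which is the finite-dimensional analog of a vacuum attached to a polarization.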

An irreducible Dirac representation of the CAR
algebra is given by a polarization $H = H_+ \oplus H_-$.
The representation is characterized by the existence
of a vacuum vector $\psi$ in the fermionic Fock space $\mathcal{F}$
such that

$$a^*(u)\psi = 0 = a(v)\psi \quad \text{for } u \in H_-,\ v \in H_+ \qquad [10]$$

A theorem of D Shale and W F Stinespring says that
two Dirac representations defined by a pair of
polarizations $H_+$, $H'_+$ are equivalent if and only if
there is $g \in U_{res}(H_+ \oplus H_-)$ such that $H'_+ = g \cdot H_+$. In
addition, in order that a unitary transformation $g$ is
implementable in the Fock space, that is, there is a
unitary operator $\hat g$ in $\mathcal{F}$ such that

$$\hat g\, a^*(v)\, \hat g^{-1} = a^*(gv), \quad \forall v \in H \qquad [11]$$


and similarly for the $a(v)$'s, one must have $g \in U_{res}$
with respect to the polarization defining the vacuum
vector. This condition is both necessary and sufficient.

The polarization of the one-particle Hilbert space
normally comes from a spectral projection onto the
positive-energy subspace of a Hamilton operator. In
background field problems one studies families
of Hamilton operators $D_x$ and then one would like
to construct a family of fermionic Fock spaces
parametrized by $x \in M$. If none of the Hamilton
operators has zero modes, this is unproblematic.
However, the presence of zero modes makes it
impossible to define the positive-energy subspace
$H_+(x)$ as a continuous function of $x$. One way out of
this is to weaken the condition for the polarization:
each $x \in M$ defines a Grassmann manifold $Gr_{res}(x)$
consisting of all subspaces $W \subset H$ such that the
projections onto $W$ and $H_+(x)$ differ by Hilbert-Schmidt
operators. The definition of $Gr_{res}(x)$ is
stable with respect to finite-rank perturbations of
$D_x/|D_x|$. For example, when $D_x$ is a Dirac operator
on a compact manifold then $(D_x - \lambda)/|D_x - \lambda|$
defines the same Grassmannian for all real numbers
$\lambda$ because in each finite interval there are only a
finite number of eigenvalues (with multiplicities) of
$D_x$. From this it follows that the Grassmannians form
a locally trivial fiber bundle $Gr$ over families of
Dirac operators.

If the bundle $Gr$ has a global section $x \mapsto W_x$, then
we can define a bundle of Fock space representations
for the CAR algebra over the parameter space $M$.
However, there are important situations when no
global sections exist. It is easier to explain the
potential obstruction in terms of a principal $U_{res}$
bundle $P$ such that $Gr$ is an associated bundle to $P$.

The fiber of $P$ at $x \in M$ is the set of all unitaries $g$
in $H$ such that $g \cdot H_+ \in Gr_x$, where $H = H_+ \oplus H_-$ is a
fixed reference polarization. Then we have

$$Gr = P \times_{U_{res}} Gr_{res}$$

where the right action of $U_{res} = U_{res}(H_+ \oplus H_-)$ in
the fibers of $P$ is the right multiplication on unitary
operators and the left action on $Gr_{res}$ comes from
the observation that $Gr_{res} = U_{res}/(U_+ \times U_-)$, where
$U_\pm$ are the diagonal block matrices in $U_{res}$. By a
result of N Kuiper, the subgroup $U_+ \times U_-$ is
contractible and so $Gr$ has a global section if and
only if $P$ is trivial.

Thus, when $P$ is trivial we can define the family of
Dirac representations of the CAR algebra parametrized
by $M$ such that in each of the Fock spaces we
have a Dirac vacuum which, in a precise sense, is close
to the vacuum defined by the energy polarization.
However, the triviality of $P$ is not a necessary
condition. Actually, what is needed is that $P$ has a
prolongation to a bundle $\hat P$ with fiber $\hat U_{res}$. The group
$\hat U_{res}$ is a central extension of $U_{res}$ by the group $S^1$.

The Lie algebra $\hat{\mathfrak{u}}_{res}$ is, as a vector space, the direct
sum $\mathfrak{u}_{res} \oplus i\mathbb{R}$, with commutators

$$[X + \lambda, Y + \mu] = [X, Y] + c(X, Y) \qquad [12]$$

where $c$ is the Lie algebra cocycle

$$c(X, Y) = \tfrac{1}{4}\,\mathrm{tr}\,\epsilon[\epsilon, X][\epsilon, Y] \qquad [13]$$

Here $\epsilon$ is the grading operator with eigenvalues $\pm 1$
on $H_\pm$. The trace exists since the off-diagonal blocks
of $X$, $Y$ are Hilbert-Schmidt.
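
The algebraic properties of the cocycle $c(X, Y) = \tfrac{1}{4}\,\mathrm{tr}\,\epsilon[\epsilon, X][\epsilon, Y]$ can be tested on finite matrices. The sketch below assumes a toy polarization $\mathbb{C}^3 \oplus \mathbb{C}^2$ and random anti-Hermitian matrices; the helper names are illustrative. In finite dimensions the cocycle is a coboundary, $c(X, Y) = -\tfrac{1}{2}\,\mathrm{tr}\,\epsilon[X, Y]$, so the extension is trivial there; in the infinite-dimensional setting $\mathrm{tr}\,\epsilon[X, Y]$ need not exist for elements of the Lie algebra of $U_{res}$, which is why the cocycle can be nontrivial.

```python
import numpy as np

rng = np.random.default_rng(1)
p, q = 3, 2  # assumed dimensions of the +1 and -1 eigenspaces of the grading
eps = np.diag([1.0] * p + [-1.0] * q).astype(complex)


def comm(a, b):
    """Matrix commutator [a, b]."""
    return a @ b - b @ a


def random_antihermitian(n):
    """Random element of the Lie algebra u(n)."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return z - z.conj().T


def cocycle(x, y):
    """c(X, Y) = (1/4) tr eps [eps, X] [eps, Y]."""
    return 0.25 * np.trace(eps @ comm(eps, x) @ comm(eps, y))


x, y, z = (random_antihermitian(p + q) for _ in range(3))

# Antisymmetry.
print(abs(cocycle(x, y) + cocycle(y, x)))

# 2-cocycle identity for a central extension.
jacobi_sum = (cocycle(comm(x, y), z) + cocycle(comm(y, z), x)
              + cocycle(comm(z, x), y))
print(abs(jacobi_sum))

# Finite-dimensional coboundary formula.
print(abs(cocycle(x, y) + 0.5 * np.trace(eps @ comm(x, y))))
```

All three printed numbers are zero up to rounding. For anti-Hermitian arguments the value of the cocycle is purely imaginary, consistent with the central term lying in $i\mathbb{R}$.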

The group $\hat U_{res}$ is a circle bundle over $U_{res}$. The
Chern class of the associated complex line bundle is
the generator of $H^2(U_{res}, \mathbb{Z})$ and is given explicitly at
the identity element as the antisymmetric bilinear
form $c/2\pi i$ and at other points on the group
manifold through left-translation of $c/2\pi i$. If $P$ is
trivial, then it has an obvious prolongation to the
trivial bundle $M \times \hat U_{res}$. In any case, if the prolongation
exists we can define the bundle of Fock spaces
carrying CAR representations as the associated
bundle

$$\mathcal{F} = \hat P \times_{\hat U_{res}} \mathcal{F}_0$$

where $\mathcal{F}_0$ is the fixed Fock space defined by the
same polarization $H = H_+ \oplus H_-$ used to define $U_{res}$.
By the Shale-Stinespring theorem, any $g \in U_{res}$ has
an implementation $\hat g$ in $\mathcal{F}_0$, but $\hat g$ is only defined up
to phase, thus the central $S^1$ extension.

The action of the CAR algebra in the fibers is
given as follows. For $x \in M$ choose any $\hat g \in \hat P_x$.
Define

$$a^*(v) \cdot (\hat g, \psi) = (\hat g, a^*(g^{-1}v)\psi)$$

where $\psi \in \mathcal{F}_0$ and $v \in H$; similarly for the operators
$a(v)$. It is easy to check that this definition passes
to the equivalence classes in $\mathcal{F}$. Note that the
representations in different fibers are in general
inequivalent because the transformation $g$ is not
implementable in the Fock space $\mathcal{F}_0$.

The potential obstruction to the existence of the
prolongation of $P$ is again a 3-cohomology class on
the base. Choose a good cover of $M$. On the
intersections $U_{\alpha\beta}$ of the open cover the transition
functions $g_{\alpha\beta}$ of $P$ can be prolonged to functions
$\hat g_{\alpha\beta} : U_{\alpha\beta} \to \hat U_{res}$. We have

$$\hat g_{\alpha\beta}\, \hat g_{\beta\gamma}\, \hat g_{\gamma\alpha} = f_{\alpha\beta\gamma} \cdot 1 \qquad [14]$$

for functions $f_{\alpha\beta\gamma} : U_{\alpha\beta\gamma} \to S^1$, which by construction
satisfy the cocycle property [4]. Since the cocycle is
defined on a good cover, it defines an integral Čech
cohomology class $\omega \in H^3(M, \mathbb{Z})$.

Let us return to the universal $U_{res}$ bundle $P$ over
$G = U_1(H)$. In this case the prolongation obstruction
can be computed relatively easily. It turns out that
the 3-cohomology class is represented by the de
Rham class which is the generator of $H^3(G, \mathbb{Z})$.
Explicitly,

$$\omega = \frac{1}{24\pi^2}\,\mathrm{tr}\,(g^{-1}dg)^3 \qquad [15]$$

Any principal $U_{res}$ bundle over $M$ comes from a
pullback of $P$ with respect to a map $f : M \to G$, so
the Dixmier-Douady class in the general case is the
pullback $f^*\omega$.

The line bundle construction of the gerbe over
the parameter space $M$ for Dirac operators is given
by the observation that the spectral subspaces
$E_{\lambda\lambda'}(x)$ of $D_x$, corresponding to the open interval
$]\lambda, \lambda'[$ in the real line, form finite-rank vector
bundles over open sets $U_{\lambda\lambda'} = U_\lambda \cap U_{\lambda'}$. Here $U_\lambda$ is
the set of points $x \in M$ such that $\lambda$ does not belong
to the spectrum of $D_x$. Then we can define, as top
exterior power,

$$L_{\lambda\lambda'} = \wedge^{top}(E_{\lambda\lambda'})$$

as the complex line bundle over $U_{\lambda\lambda'}$. It follows
immediately from the definition that the cocycle
property [6] is satisfied.


Example 1 (Fermions on an interval). Let $K$ be a
compact group and $\rho$ its unitary representation in a
finite-dimensional vector space $V$. Let $H$ be the
Hilbert space of square-integrable $V$-valued functions
on the interval $[0, 2\pi]$ of the real axis. For
each $g \in K$ let $\mathrm{Dom}_g \subset H$ be the dense subspace of
smooth functions $\psi$ with the boundary condition
$\psi(2\pi) = \rho(g)\psi(0)$. Denote by $D_g$ the operator
$-i\,d/dx$ on this domain. The spectrum of $D_g$ is a
function of the eigenvalues $\lambda_k$ of $\rho(g)$, consisting of
real numbers $n + \log(\lambda_k)/2\pi i$ with $n \in \mathbb{Z}$. For this
reason the splitting of the one-particle space $H$ into
positive and negative modes of the operator $D_g$ is
in general not continuous as a function of the
parameter $g$. This leads to the problems described
above. However, the principal $U_{res}$ bundle can be
explicitly constructed. It is the pullback of the
universal bundle $P$ with respect to the map $f : K \to G$
defined by the embedding $\rho(K) \subset G$ as $N \times N$ block
matrices, $N = \dim V$. Thus, the Dixmier-Douady
class in this example is

$$\omega = \frac{1}{24\pi^2}\,\mathrm{tr}\,\big(\rho(g)^{-1}d\rho(g)\big)^3 \qquad [16]$$
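
The spectrum formula of Example 1 can be verified directly: for an eigenvector $v_k$ of $\rho(g)$ with eigenvalue $\lambda_k$, the function $\psi(x) = e^{i\mu x} v_k$ satisfies $-i\psi' = \mu\psi$, and the boundary condition $\psi(2\pi) = \rho(g)\psi(0)$ holds exactly when $e^{2\pi i\mu} = \lambda_k$. The sketch below checks this for a hypothetical diagonal $\rho(g)$ in $U(2)$; the function name and the sample angles are assumptions made for illustration.

```python
import numpy as np


def dirac_eigenvalues(rho_g, modes):
    """Numbers n + log(lambda_k)/(2 pi i) for the eigenvalues lambda_k of rho(g)."""
    lam = np.linalg.eigvals(rho_g)
    # Principal branch of the logarithm; the shift is real for unitary rho(g).
    shift = (np.log(lam) / (2j * np.pi)).real
    return np.array([n + shift for n in modes]), lam


theta = np.array([0.3, -1.1])          # sample eigenphases (assumed values)
rho_g = np.diag(np.exp(1j * theta))
mu, lam = dirac_eigenvalues(rho_g, range(-2, 3))

# Row n of mu holds the eigenvalues n + theta_k / (2 pi); check exp(2 pi i mu) = lambda_k.
print(np.allclose(np.exp(2j * np.pi * mu), lam[np.newaxis, :]))
```

As an eigenphase of $\rho(g)$ moves once around the circle, each branch $n + \theta_k/2\pi$ moves from $n$ to $n + 1$, so an eigenvalue of $D_g$ crosses zero; this is the discontinuity of the positive/negative splitting mentioned in the example.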


Example 2 (Fermions on a circle). Let $H = L^2(S^1, V)$
and $D_A = -i(d/dx + A)$ where $A$ is a smooth vector
potential on the circle taking values in the Lie
algebra $\mathfrak{k}$ of $K$. In this case, the domain is fixed,
consisting of smooth $V$-valued functions on the
circle. The $\mathfrak{k}$-valued function $A$ is represented as a
multiplication operator through the representation $\rho$
of $K$. The parameter space $\mathcal{A}$ of smooth vector
potentials is flat; thus, there cannot be any obstruction
to the prolongation problem. However, in
quantum field theory, one wants to pass to the
moduli space $\mathcal{A}/\mathcal{G}$ of gauge potentials. Here $\mathcal{G}$ is the
group of smooth based gauge transformations, that
is, $\mathcal{G} = \Omega K$. Now the moduli space is the group of
holonomies around the circle, $\mathcal{A}/\mathcal{G} = K$. Thus, we are
in a similar situation as in Example 1. In fact, these
examples are really two different realizations of the
same family of self-adjoint Fredholm operators.
The operator $D_A$ with $k = \mathrm{holonomy}(A)$ has exactly
the same spectrum as $D_k$ in Example 1. For this
reason, the Dixmier-Douady class on $K$ is the same as
before.


The case of Dirac operators on the circle is simple
because all the energy polarizations for different
vector potentials are elements in a single Hilbert-Schmidt
Grassmannian $Gr(H_+ \oplus H_-)$, where we can
take as the reference polarization the splitting into
positive and negative Fourier modes. Using this
polarization, the bundle of fermionic Fock spaces
over $\mathcal{A}$ can be trivialized as $\mathcal{F} = \mathcal{A} \times \mathcal{F}_0$. However,
the action of the gauge group $\mathcal{G}$ on $\mathcal{F}$ acquires a
central extension $\hat{\mathcal{G}} \subset \widehat{LK}$, where $LK$ is the free loop
group of $K$. The Lie algebra cocycle determining the
central extension is

$$c(X, Y) = \frac{1}{2\pi i}\int_{S^1} \mathrm{tr}_\rho\, X\, dY \qquad [17]$$

where $\mathrm{tr}_\rho$ is the trace in the representation $\rho$ of $K$.
Because of the central extension, the quotient $\mathcal{F}/\mathcal{G}$
defines only a projective vector bundle over $\mathcal{A}/\mathcal{G}$,
the Dixmier-Douady class being given by [16].

In Example 1 (and Example 2) above, the
complex line bundles can be constructed quite
explicitly. Let us study the case $K = SU(n)$. Define
$U_\lambda \subset K$ as the set of matrices $g$ such that $\lambda$ is not an
eigenvalue of $g$. Select $n$ different points $\lambda_j$ on the
unit circle such that their product is not equal to 1.
We assume that the points are ordered counterclockwise
on the circle. Then the sets $U_j = U_{\lambda_j}$ form
an open cover of $SU(n)$ (a matrix in $SU(n)$ has $n$
eigenvalues whose product is 1, so it cannot have all
the $\lambda_j$ as eigenvalues). On each $U_j$ we can choose a
continuous branch of the logarithmic function
$\log : U_j \to su(n)$. The spectrum of the Dirac operator
$D_g$ with the holonomy $g$ consists of the infinite set of
numbers $\mathbb{Z} + \mathrm{Spec}(-i\log(g))$. In particular, the
numbers $\mathbb{Z} - i\log\lambda_j$ do not belong to the spectrum
of $D_g$. Choosing $\mu_j = -i\log\lambda_j$ as an increasing
sequence in the interval $[0, 2\pi]$, we can as well
define $U_j = \{x \in M \mid \mu_j \notin \mathrm{Spec}(D_x)\}$. In any case, the
top exterior power of the spectral subspace $E_{\mu_j\mu_k}(x)$
is given by zero Fourier modes consisting of the
spectral subspace of the holonomy $g$ in the segment
$[\lambda_j, \lambda_k]$ of the unit circle.


Index Theory and Gerbes 


Gauge and gravitational anomalies in quantum field
theory can be computed by Atiyah-Singer index
theory. The basic setup is as follows. On a compact
even-dimensional spin manifold $S$ (without boundary)
the Dirac operators coupled to vector potentials
and metrics form a family of Fredholm operators.
The parameter space is the set $\mathcal{A}$ of smooth vector
potentials (gauge connections) in a vector bundle
over $S$ and the set $\mathcal{M}$ of smooth Riemann metrics on $S$.
The family of Dirac operators is covariant with
respect to gauge transformations and diffeomorphisms
of $S$; thus, we may view the Dirac operators as
parametrized by the moduli space $\mathcal{A}/\mathcal{G}$ of gauge
connections and the moduli space $\mathcal{M}/\mathrm{Diff}_0(S)$ of
Riemann metrics. Again, in order that the moduli
spaces are smooth manifolds, one has to restrict to
the based gauge transformations, that is, those
which are equal to the neutral element at a fixed
base point in each connected component of $S$.
Similarly, the Jacobian of a diffeomorphism is
required to be equal to the identity matrix at the
base points. Passing to the quotient modulo gauge
transformations and diffeomorphisms, we obtain a
vector bundle over the space

$$S \times \mathcal{A}/\mathcal{G} \times \mathcal{M}/\mathrm{Diff}_0(S) \qquad [18]$$

Actually, we could as well consider a generalization
in which the base space is a fibering over the moduli
space with model fiber equal to $S$, but for simplicity
we stick to [18].

According to the Atiyah-Singer index formula for
families, the K-theory class of the family of Dirac
operators acting on the smooth sections of the tensor
product of the spin bundle and the vector bundle $V$
over [18] is given through the differential forms

$$\hat A(R) \wedge \mathrm{ch}(V)$$

where $\hat A(R)$ is the A-roof genus, a function of the
Riemann curvature tensor $R$ associated with the
Riemann metric,

$$\hat A(R) = \det{}^{1/2}\left(\frac{R/4\pi i}{\sinh(R/4\pi i)}\right)$$

and $\mathrm{ch}(V)$ is the Chern character

$$\mathrm{ch}(V) = \mathrm{tr}\, e^{F/2\pi i}$$

where $F$ is the curvature tensor of a gauge connection.
Here both $R$ and $F$ are forms on the infinite-dimensional
base space [18]. After integrating over
the fiber $S$,

$$\mathrm{Ind} = \int_S \hat A(R) \wedge \mathrm{ch}(V) \qquad [19]$$

we obtain a family of differential forms $\phi_{2k}$, one in
each even dimension, on the moduli space.

The (cohomology classes of) forms $\phi_{2k}$ contain
important topological information for the quantized
Yang-Mills theory and for quantum gravity. The
form $\phi_2$ describes potential chiral anomalies. The
chiral anomaly is a manifestation of gauge or
reparametrization symmetry breaking. If the class
$[\phi_2]$ is nonzero, the quantum effective action cannot
be viewed as a function on the moduli space.
Instead, it becomes a section of a complex line
bundle DET over the moduli space.

Since the Dirac operators are Fredholm (on
compact manifolds), at a given point in the moduli
space we can define the complex line

$$\mathrm{DET}_x = \wedge^{top}(\ker D_x^+) \otimes \wedge^{top}(\mathrm{coker}\, D_x^+) \qquad [20]$$

for the chiral Dirac operators $D_x^+$. In the even-dimensional
case, the spin bundle is $\mathbb{Z}_2$ graded such
that the grading operator $\Gamma$ anticommutes with $D_x$.
Then, $D_x^+ = P_- D_x P_+$, where $P_\pm = \frac{1}{2}(1 \pm \Gamma)$ are
the chiral projections. $\wedge^{top}$ means the operation on
finite-dimensional vector spaces $W$ taking the
exterior power of $W$ to $\dim W$.

When the dimensions of the kernel and cokernel
of $D_x^+$ are constant, eqn [20] defines a smooth
complex line bundle over the moduli space. In the
case of varying dimensions, a little extra work is
needed to define the smooth structure.

The form $\phi_2$ is the Chern class of DET. So if DET
is nontrivial, gauge covariant quantization of the
family of Dirac operators is not possible.

One can also give a geometric and topological
meaning to the chiral symmetry breaking in Hamiltonian
quantization, and this leads us back to gerbes
on the moduli space. Here we have to use an odd
version of the index formula [19]. Assuming that the
physical spacetime is even dimensional, at a fixed
time the space is an odd-dimensional manifold $S$.
We still assume that $S$ is compact. In this case, the
integration in [19] is over odd-dimensional fibers
and, therefore, the formula produces a sequence of
odd forms on the moduli space.

The first of the odd forms, $\phi_1$, gives the spectral
flow of a one-parameter family of operators $D_{x(t)}$.
Its integral along the path $x(t)$ in the moduli space,
after a correction by the difference of the eta
invariant at the end points of the path, gives twice
the net flow: the number of positive eigenvalues
crossing over to the negative side of the spectrum
minus the number crossing in the opposite direction.
The second term $\phi_3$ is the Dixmier-Douady class of the
projective bundle of Fock spaces over the moduli
space. In Examples 1 and 2, the index theory
calculation gives exactly the form [16] on $K$.


Example Consider Dirac operators on the three-dimensional
sphere $S^3$ coupled to vector potentials.
Any vector bundle on $S^3$ is trivial, so let $V = S^3 \times \mathbb{C}^N$.
Take $SU(N)$ as the gauge group and let $\mathcal{A}$ be the space
of 1-forms on $S^3$ taking values in the Lie algebra $su(N)$
of $SU(N)$. Fix a point $x_s$ on $S^3$, the "south pole," and
let $\mathcal{G}$ be the group of gauge transformations based at
$x_s$. That is, $\mathcal{G}$ consists of smooth functions
$g : S^3 \to SU(N)$ with $g(x_s) = 1$. In this case $\mathcal{A}/\mathcal{G}$ can
be identified as $\mathrm{Map}(S^2, SU(N))$ times a contractible
space. This is because any point $x$ on the equator of $S^3$
determines a unique semicircle from the south pole to
the north pole through $x$. The parallel transport along
this path with respect to a vector potential $A \in \mathcal{A}$
defines an element $g(x) \in SU(N)$, using the fixed
trivialization of $V$. Set $g_A(x) = g(x)g(x_0)^{-1}$, where
$x_0$ is a fixed point on the equator. The element $g_A(x)$
then depends only on the gauge equivalence class $[A] \in
\mathcal{A}/\mathcal{G}$. It is not difficult to show that the map $A \mapsto g_A$ is
a homotopy equivalence from the moduli space of
gauge potentials to the group $G_2 = \mathrm{Map}_{x_0}(S^2, SU(N))$,
based at $x_0$. When $N > 2$, the cohomology
$H^5(SU(N), \mathbb{Z}) = \mathbb{Z}$ transgresses to the cohomology
$H^3(G_2, \mathbb{Z}) = \mathbb{Z}$. In particular, the generator


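$$\omega_5 = \left(\frac{i}{2\pi}\right)^3 \frac{2}{5!}\,\mathrm{tr}\,(g^{-1}dg)^5$$

of $H^5(SU(N), \mathbb{Z})$ gives the generator of $H^3(G_2, \mathbb{Z})$ by
contraction and integration,

$$\Omega = \int_{S^2} \omega_5$$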


Gauge Group Extensions 


The new feature for gerbes associated with Dirac
operators in higher than one dimension is that the
gauge group, acting on the bundle of Fock spaces
parametrized by vector potentials, is represented
through an abelian extension. On the Lie algebra
level this means that the Lie algebra extension is not
given by a scalar cocycle $c$ as in the one-dimensional
case but by a cocycle taking values in an abelian Lie
algebra. In the case of Dirac operators coupled to
vector potentials, the abelian Lie algebra consists of
a certain class of complex functions on $\mathcal{A}$. The
extension is then defined by the commutators

$$[(X, \alpha), (Y, \beta)] = ([X, Y],\ \mathcal{L}_X\beta - \mathcal{L}_Y\alpha + c(X, Y)) \qquad [21]$$

where $\alpha$, $\beta$ are functions on $\mathcal{A}$ and $\mathcal{L}_X$ denotes the
Lie derivative in the direction of the infinitesimal
gauge transformation $X$. The 2-cocycle property
of $c$ is expressed as

$$c([X, Y], Z) + \mathcal{L}_X c(Y, Z) + \text{cyclic permutations of } X, Y, Z = 0$$


In the case of Dirac operators on a 3-manifold $S$ the
form $c$ is the Mickelsson-Faddeev cocycle

$$c(X, Y) = \frac{i}{12\pi^2}\int_S \mathrm{tr}\, A \wedge (dX \wedge dY - dY \wedge dX) \qquad [22]$$

The corresponding gauge group extension is an
extension of $\mathrm{Map}(S, G)$ by the normal subgroup
$\mathrm{Map}(\mathcal{A}, S^1)$. As a topological space, the extension is
the product

$$\mathrm{Map}(\mathcal{A}, S^1) \times_{S^1} P$$

where $P$ is a principal $S^1$ bundle over $\mathrm{Map}(S, G)$.
The Chern class $c_1$ of the bundle $P$ is again
computed by transgression from $\omega_5$; this time

$$c_1 = \int_S \omega_5$$

In fact, we can think of the cocycle $c$ as a 2-form on
the space of flat vector potentials $A = g^{-1}dg$ with $g \in
\mathrm{Map}(S^3, G)$. Then one can show that the cohomology
classes $[c]$ and $[c_1]$ are equal.

As we have seen, the central extension of a loop
group is the key to understanding the quantum field
theory gerbe. Here is a brief description of it starting
from the 3-form [16] on a compact Lie group $G$.
First define a central extension $\mathrm{Map}(D, G) \times S^1$ of
the group of smooth maps from the unit disk $D$ to
$G$, with pointwise multiplication. The group multiplication
is given as

$$(g, \lambda) \cdot (g', \lambda') = (gg',\ \lambda\lambda' \cdot e^{2\pi i\gamma(g, g')})$$

where

$$\gamma(g, g') = \frac{1}{8\pi^2}\int_D \mathrm{tr}\, g^{-1}dg \wedge dg'\, g'^{-1} \qquad [23]$$

where the trace is computed in a fixed unitary
representation $\rho$ of $G$. This group contains as a
normal subgroup the group $N$ consisting of pairs
$(g, e^{2\pi i C(g)})$ with

$$C(g) = \frac{1}{24\pi^2}\int_B \mathrm{tr}\,(g^{-1}dg)^3 \qquad [24]$$

Here $g(x) = 1$ on the boundary circle $S^1 = \partial D$, and
thus $g$ can be viewed as a function $S^2 \to G$. The
three-dimensional unit ball $B$ has $S^2$ as a boundary
and $g$ is extended in an arbitrary way from the
boundary to the ball $B$. The extension is possible
since $\pi_2(G) = 0$ for any finite-dimensional Lie
group. The value of $C(g)$ depends on the extension
only modulo an integer and therefore $e^{2\pi i C(g)}$ is
well defined.


The central extension is then defined as

$$\widehat{LG} = (\mathrm{Map}(D, G) \times S^1)/N$$

One can show easily that the Lie algebra of $\widehat{LG}$
is indeed given through the cocycle [17].
When $G = SU(n)$ in the defining representation,
this central extension is the basic extension:
the cohomology class is the generator of
$H^2(LG, \mathbb{Z})$. In general, to obtain the basic extension
one has to correct [23] and [24] by a
normalization factor.

This construction generalizes to the higher loop
groups $\mathrm{Map}(S, G)$ for compact odd-dimensional
manifolds $S$. For example, in the case of a
3-manifold, one starts from an extension of
$\mathrm{Map}(D, G)$, where $D$ is a 4-manifold with boundary
$S$. The extension is defined by a 2-cocycle $\gamma$,
but now for given $g, g'$ the cocycle $\gamma$ is a real-valued
function of a point $g_0 \in \mathrm{Map}(S, G)$, which is
a certain differential polynomial in the Maurer-Cartan
1-forms $g_0^{-1}dg_0$, $g^{-1}dg$, $g'^{-1}dg'$. The normal
subgroup $N$ is defined in a similar way; now $C(g)$ is
the integral of the 5-form $\omega_5$ over a 5-manifold
$B$ with boundary $\partial B$ identified as $D/\!\sim$, the
equivalence shrinking the boundary of $D$ to one
point. This gives the extension only over the
connected component of the identity in $\mathrm{Map}(S, G)$, but
it can be generalized to the whole group. For
example, when $S = S^3$ and $G$ is simple, the connected
components are labeled by elements of the
third homotopy group $\pi_3 G = \mathbb{Z}$.

In some cases, the de Rham cohomology class of
the extension vanishes but the extension still
contains interesting torsion information. In quantum
field theory this comes from the Hamiltonian
formulation of global anomalies. A typical example
of this phenomenon is the Witten SU(2) anomaly in
four spacetime dimensions. In the Hamiltonian
formulation, we take $S^3$ as the physical space and the
gauge group $G = SU(2)$. In this case, the second
cohomology of $\mathrm{Map}(S^3, G)$ becomes pure torsion,
related to the fact that the 5-form $\omega_5$ on $SU(2)$
vanishes for dimensional reasons. Here the homotopy
group $\pi_4(G) = \mathbb{Z}_2$ leads to a nontrivial fundamental
group $\mathbb{Z}_2$ in each connected component of
$\mathrm{Map}(S^3, G)$. Using this fact, one can show that
there is a nontrivial $\mathbb{Z}_2$ extension of the group
$\mathrm{Map}(S^3, G)$.


See also: Anomalies; Bosons and Fermions in External 
Fields; Characteristic Classes; Dirac Operator and Dirac 
Field; Index Theorems; K-Theory. 
Introduction


In the Ginzburg-Landau theory of superconductivity, a complex order parameter $\Psi$ characterizes a macroscopic/mesoscopic superconducting state in a bulk superconductor. The square of the magnitude $|\Psi|^2$ expresses the density of superconducting electrons and $\Psi$ is regarded as a macroscopic wave function. With a magnetic vector potential $A$ and the order parameter $\Psi$, the Helmholtz free energy density in a superconducting material near the critical temperature is given by

$$F = F_n + \alpha|\Psi|^2 + \frac{\beta}{2}|\Psi|^4 + \frac{1}{2m_s}\left|\left(-i\hbar\nabla - \frac{e_s}{c}A\right)\Psi\right|^2 + \frac{|H|^2}{8\pi}$$

where $F_n$ denotes the energy density of the normal state, $c$ is the speed of light, $H = \mathrm{curl}\,A$, and $m_s$ and $e_s$ are the mass and charge of a superconducting electron, respectively. The parameters $\alpha$ and $\beta$ depend on temperature and are determined by the material. Moreover, below the critical temperature $T_c$, $\alpha = \alpha(T)$ and $\beta = \beta(T)$ take negative and positive values, respectively. In the presence of an applied magnetic field $H_{ap}$, we have to consider the Gibbs free energy density, $G = F - H \cdot H_{ap}/4\pi$.
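Introduce the following physical parameters:

$$\Psi_0 = \sqrt{-\alpha/\beta}, \quad H_c = \sqrt{4\pi\alpha^2/\beta}, \quad \lambda = \sqrt{-\beta m_s c^2/(4\pi\alpha e_s^2)}, \quad \xi = \sqrt{-\hbar^2/(2m_s\alpha)}, \quad \kappa = \lambda/\xi \qquad [1]$$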


The value $\Psi_0^2$ gives the equilibrium density and $H_c$ is the thermodynamic critical field, which is obtained by equating $G = F_n - |H_{ap}|^2/8\pi$ (for the normal state $\Psi = 0$, $H = H_{ap}$) with $G = F_n - \alpha^2/2\beta$ (for the perfect superconductivity $|\Psi| = \Psi_0$, $H = 0$). The parameters $\lambda$ and $\xi$ stand for penetration depth and coherence length, respectively. The ratio $\kappa$ of these characteristic lengths is called the Ginzburg-Landau parameter, which determines the type of superconducting material: type I for $\kappa < 1/\sqrt{2}$ and type II for $\kappa > 1/\sqrt{2}$.
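The definitions in [1] can be evaluated directly. The following is a minimal sketch, assuming Gaussian units and illustrative input values; the function name `gl_parameters` and the dictionary keys are hypothetical and not part of the original text. The borderline value $\kappa = 1/\sqrt{2}$ is assigned to type II here purely as a convention of the sketch.

```python
import math

def gl_parameters(alpha, beta, m_s, e_s, hbar, c):
    """Evaluate the quantities of eq. [1]; requires alpha < 0 < beta (T below T_c)."""
    if not (alpha < 0 < beta):
        raise ValueError("below T_c we need alpha < 0 and beta > 0")
    psi0 = math.sqrt(-alpha / beta)                      # equilibrium amplitude
    h_c = math.sqrt(4 * math.pi * alpha**2 / beta)       # thermodynamic critical field
    lam = math.sqrt(-beta * m_s * c**2 / (4 * math.pi * alpha * e_s**2))  # penetration depth
    xi = math.sqrt(-hbar**2 / (2 * m_s * alpha))         # coherence length
    kappa = lam / xi                                     # Ginzburg-Landau parameter
    kind = "type I" if kappa < 1 / math.sqrt(2) else "type II"
    return {"psi0": psi0, "H_c": h_c, "lambda": lam, "xi": xi,
            "kappa": kappa, "type": kind}

if __name__ == "__main__":
    # Illustrative numbers only (all constants set to 1).
    print(gl_parameters(-1.0, 1.0, 1.0, 1.0, 1.0, 1.0))
```

A useful consistency check is the identity $\sqrt{2} H_c \lambda \xi = \hbar c/|e_s|$, which follows from [1] and holds for any admissible input.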
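We use the nondimensional variables $x'$, $\Psi'$, $A'$, $H_{ap}'$, and $\mathcal{G}$:

$$x = \lambda x', \quad \Psi = \Psi_0\Psi', \quad A = \sqrt{2}H_c\xi A' \ (H' = \mathrm{curl}'A'), \quad H_{ap} = \sqrt{2}H_c H_{ap}'/\kappa \qquad [2]$$

$$F = F_n + \left(\mathcal{G}/\kappa^2 - 1/2 + 2H' \cdot H_{ap}'/\kappa^2 - |H_{ap}'|^2/\kappa^2\right)H_c^2/4\pi$$

Dropping the primes after the change of variables and integrating $\mathcal{G}$ over a domain $\Omega \subset \mathbb{R}^n$ ($n = 2, 3$), which is occupied by a superconducting sample, yields a functional of $\Psi$ and $A$, called the Ginzburg-Landau energy in a nondimensional form,

$$E(\Psi, A) = \int_\Omega \left\{ |(\nabla - iA)\Psi|^2 + \frac{\kappa^2}{2}(1 - |\Psi|^2)^2 + |\mathrm{curl}\,A - H_{ap}|^2 \right\} dx \qquad [3]$$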


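The Ginzburg-Landau equations are the Euler-Lagrange equations of this energy, which are given by

$$(\nabla - iA)^2\Psi = \kappa^2(|\Psi|^2 - 1)\Psi \quad \text{in } \Omega \qquad [4]$$

$$\mathrm{curl}^2 A = J + \mathrm{curl}\,H_{ap} \quad \text{in } \Omega \qquad [5]$$

where

$$J := \frac{1}{2i}(\Psi^*\nabla\Psi - \Psi\nabla\Psi^*) - |\Psi|^2 A \qquad [6]$$

$\Psi^*$ stands for the complex conjugate of $\Psi$. In a two-dimensional domain $\Omega$, the differential operator "curl" acts on $A = (A_1, A_2) : \mathbb{R}^2 \to \mathbb{R}^2$ such that

$$\mathrm{curl}\,A = \partial_{x_1}A_2 - \partial_{x_2}A_1, \quad \mathrm{curl}\,H = (\partial_{x_2}H, -\partial_{x_1}H), \quad H := \mathrm{curl}\,A$$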


and Hap is replaced by a scalar-valued function. 
Note that J represents a supercurrent in the 
material. Every critical point of the energy is 
obtained by solving the Ginzburg-Landau equations 
with appropriate boundary conditions and, thus, a 
physical state in the superconducting sample is 
realized by a solution of the equations. A minimizer 
of [3] is a solution of [4]-[5] that minimizes the 
energy [3] in an appropriate function space, whereas 
a local minimizer is a solution minimizing the energy 
locally in the space. A solution is called a stable 
solution if it is a local minimizer of the energy. 
A physically stable phenomenon could be realized 
by a minimizer or at least a local minimizer. 
The Ginzburg-Landau energy and the equations are gauge invariant under the transformation

$$(\Psi, A) \mapsto (\Psi e^{i\chi}, A + \nabla\chi) \qquad [7]$$

for a smooth scalar function $\chi(x)$. Therefore, we can identify two solutions that correspond to each other through the transformation [7]. The following London (Coulomb) gauge is often chosen:

$$\mathrm{div}\,A = 0 \quad \text{in } \Omega \qquad [8]$$

(with a boundary condition if necessary).

Let $(\Psi, A)$ be a smooth solution of [4]-[5]. In a region where $|\Psi(x)| > 0$, the expression $\Psi = w(x)\exp(i\theta(x))$ ($w = |\Psi(x)|$) leads to

$$\nabla^2 w = |\nabla\theta - A|^2 w + \kappa^2(w^2 - 1)w \qquad [9]$$
$$\mathrm{div}(w^2(\nabla\theta - A)) = 0 \qquad [10]$$
$$\mathrm{curl}^2 A = J = w^2(\nabla\theta - A) \qquad [11]$$

where the gauge [8] is fixed and $\mathrm{curl}\,H_{ap} = 0$ is assumed. Let $S$ be a surface in $\Omega$ bounded by a closed curve $\partial S$. Suppose $w(x) > 0$ on $\partial S$. Then from [11],

$$\Phi := \oint_{\partial S}\left(J/w^2 + A\right)\cdot ds = \oint_{\partial S}\frac{J}{w^2}\cdot ds + \int_S \mathrm{curl}\,A \cdot dS = \oint_{\partial S}\nabla\theta\cdot ds = 2\pi d \qquad [12]$$

where $d$ is an integer; in fact, $d = \deg(\Psi, \partial S)$ is the winding number of $\Psi(\partial S)$ in the complex plane. Thus, the identity [12] relates the magnetic field to a topological degree of the order parameter. The quantity $\Phi$, multiplied by an appropriate constant, is called the fluxoid. A connected component of vanishing points of $\Psi$ generally has codimension 2 in the domain, and it is called a vortex.
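The integer $d$ in [12] can be computed numerically by accumulating the phase change of the order parameter along a closed curve. The sketch below is only an illustration: it assumes the curve is a circle, that the order parameter is supplied as a Python callable returning a complex number, and that it does not vanish on the circle; the name `winding_number` and its parameters are hypothetical.

```python
import cmath
import math

def winding_number(psi, center=(0.0, 0.0), radius=1.0, samples=2000):
    """Degree of psi on a circle: total phase change divided by 2*pi.

    psi(x, y) must return a nonzero complex number on the circle.
    """
    cx, cy = center
    total = 0.0
    prev = psi(cx + radius, cy)
    for k in range(1, samples + 1):
        t = 2 * math.pi * k / samples
        cur = psi(cx + radius * math.cos(t), cy + radius * math.sin(t))
        # Phase increment between neighbouring samples, taken in (-pi, pi].
        total += cmath.phase(cur / prev)
        prev = cur
    return round(total / (2 * math.pi))

if __name__ == "__main__":
    double_vortex = lambda x, y: complex(x, y) ** 2
    print(winding_number(double_vortex))
```

The sampling must be fine enough that the phase changes by less than $\pi$ between neighbouring points; otherwise the accumulated phase is not reliable.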

From the expression [9], the asymptotic behavior $w \to 1$ as $\kappa \to \infty$ is expected under a suitable condition. Then, taking the curl of [11] with $w = 1$, $H = \mathrm{curl}\,A$ enjoys the property $\mathrm{curl}^2 H + H = 0$, which is known as the London equation. However, this is valid only where $|\Psi| > 0$; otherwise, a singularity appears around zeros of $\Psi$.

There are several characteristic phenomena 
observed in a bulk superconductor. Typical phenom- 
ena are: perfect conductivity (persistent current), 
perfect diamagnetism (Meissner effect), nucleation 
of superconductivity, and vortices (quantization of a 
penetrating magnetic field). These phenomena can be 
expressed by solutions of the Ginzburg—Landau 
equations in various settings. 
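A standard model of the Ginzburg-Landau energy is considered in the whole space $\mathbb{R}^2$. Let $A = (A_1, A_2)$ and assume $H_{ap} = 0$ in [3]. Consider then the energy functional

$$E(\Psi, A) = \int_{\mathbb{R}^2}\left\{ |D_A\Psi|^2 + |\mathrm{curl}\,A|^2 + \frac{\kappa^2}{2}(1 - |\Psi|^2)^2 \right\} dx \qquad [13]$$

where $D_A := \nabla - iA$. Then the Ginzburg-Landau equations are

$$D_A^2\Psi = \kappa^2(|\Psi|^2 - 1)\Psi \quad \text{in } \mathbb{R}^2 \qquad [14]$$
$$\mathrm{curl}^2 A = \mathrm{Im}(\Psi^* D_A\Psi) \quad \text{in } \mathbb{R}^2 \qquad [15]$$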


In the gauge theory, this model can be regarded as a two-dimensional abelian (U(1)) Higgs model. In that context, $\Psi$ is a scalar (Higgs) field, $A$ is a connection on the U(1) bundle $\mathbb{R}^2 \times U(1)$, and $D_A$ is the covariant derivative.

Equations [14]-[15] are useful in observing quantization of the magnetic field, although they form an idealized model for superconductivity. By the natural condition that the right-hand side of [13] is finite, we may assume that $|D_A\Psi|, |\mathrm{curl}\,A| \to 0$ and $|\Psi| \to 1$ as $|x| \to \infty$. From [12], the flux quantization follows:

$$\int_{\mathbb{R}^2}\mathrm{curl}\,A\,dx = 2\pi d \qquad [16]$$

If $\Psi$ has a finite number of zeros $\{a_j\}_{j=1}^N$, [16] implies

$$\int_{\mathbb{R}^2}\mathrm{curl}\,A\,dx = 2\pi\sum_{j=1}^N \deg(\Psi, \partial B(a_j, \rho))$$

for a small positive number $\rho$, where $B(a_j, \rho)$ stands for the disk with center $a_j$ and radius $\rho$. A zero of $\Psi$ represents a vortex, at which the magnetic field is quantized, and a supercurrent circulates around the field.

To characterize the configuration analytically, we look for a solution $(\Psi, A)$ expressed in polar coordinates in the form

$$\Psi = f(r)\exp(id\theta), \quad A(r) = a(r)(-\sin\theta, \cos\theta)$$

Substituting these into [14]-[15], one obtains

$$f'' + \frac{f'}{r} - \left(\frac{d}{r} - a\right)^2 f = \kappa^2(f^2 - 1)f$$
$$-\left(a' + \frac{a}{r}\right)' = f^2\left(\frac{d}{r} - a\right)$$

($' = d/dr$) with the boundary conditions

$$f(0) = 0, \quad f(\infty) = 1, \quad a(\infty) = 0$$

This system of equations has a solution for $\kappa > 0$. In addition to these types of solutions, when $\kappa = 1/\sqrt{2}$, a special transformation reduces the system [14]-[15] to a scalar nonlinear equation with a singular term. Then, it is proved that for an arbitrary $d \in \mathbb{Z}$, under the constraint of [16] there exists a minimizer of [13] with zeros at prescribed points $\{a_j\}$ (Jaffe and Taubes 1980).


Solutions for Persistent Current 


A current flowing in a superconducting ring with no decay even in the absence of an applied magnetic field is called a persistent current. Assume that a superconducting sample $\Omega$ in $\mathbb{R}^3$ is surrounded by vacuum and adopt the energy functional

$$E(\Psi, A) = \int_\Omega\left\{ |D_A\Psi|^2 + \frac{\kappa^2}{2}(1 - |\Psi|^2)^2 \right\} dx + \int_{\mathbb{R}^3}|\mathrm{curl}\,A|^2\,dx \qquad [17]$$


Although the functional [17] is minimized by a trivial solution $(\Psi, A) = (\exp(ic), 0)$ ($c \in \mathbb{R}$), which is the case for perfect diamagnetism, this is not the solution describing a persistent current since $J = 0$ everywhere. We have to look for a nontrivial solution that locally minimizes the energy, that is, a local minimizer of [17]. To characterize a solution representing the persistent current, we define a mapping from $\Omega$ to $S^1 \subset \mathbb{C}$ by $x \in \Omega \mapsto \Psi(x)/|\Psi(x)|$ for a solution $(\Psi, A)$ of the Ginzburg-Landau equations corresponding to [17]. Consider a domain having infinitely many homotopy classes in the space of continuous functions $C^0(\bar\Omega, S^1)$ (e.g., a solid torus). If $(\Psi, A)$ is a local minimizer and $\Psi/|\Psi|$ is not homotopic to a constant map in $C^0(\bar\Omega, S^1)$, then it is a solution describing a persistent current. The existence of such a solution has been established mathematically for large $\kappa$ (Jimbo and Morita 1996, Rubinstein and Sternberg 1996).


Configuration of Solutions under an 
Applied Magnetic Field 


In the presence of an applied magnetic field, a sample exhibits the transition from the superconducting state to the normal state and vice versa, according to the magnitude of the field. This transition can be considered mathematically as a bifurcation of solutions to the Ginzburg-Landau equations with a parameter measuring the magnitude of the applied magnetic field. In fact, let $H_{ap}$ be an applied magnetic field perpendicular to the horizontal plane and assume that it is constant along the vertical axis, that is, $H_{ap} = (0, 0, H_a)$. Then a rich bifurcation structure is suggested by numerical and analytical studies in the parameter space of $(H_a, \kappa)$. Mathematical developments in variational methods and nonlinear analysis reveal the configuration of the solutions and provide rigorous estimates for the critical fields, predicted by physicists, in a parameter regime for a two-dimensional model.

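Throughout this section, we consider the Ginzburg-Landau model in an infinite cylinder $\Omega = D \times \mathbb{R}$ ($D \subset \mathbb{R}^2$) with a constant applied magnetic field $H_{ap} = H_a e_3 = (0, 0, H_a)$, $H_a > 0$. Assuming uniformity along the vertical axis, we may write $A = (A_1, A_2)$ and $H = \mathrm{curl}\,A = \partial_{x_1}A_2 - \partial_{x_2}A_1$ as in the previous section. Then the Ginzburg-Landau energy on $D$ is

$$E(\Psi, A) = \int_D\left\{ |D_A\Psi|^2 + \frac{\kappa^2}{2}(1 - |\Psi|^2)^2 + |\mathrm{curl}\,A - H_a|^2 \right\} dx \qquad [18]$$

With the London gauge

$$\mathrm{div}\,A = 0 \ \text{in } D, \quad A \cdot n = 0 \ \text{on } \partial D$$

the Ginzburg-Landau equations in the present setting are written as

$$D_A^2\Psi = \kappa^2(|\Psi|^2 - 1)\Psi \quad \text{in } D \qquad [19]$$
$$-\nabla^2 A = \mathrm{Im}(\Psi^* D_A\Psi) \quad \text{in } D \qquad [20]$$
$$n \cdot \nabla\Psi = 0 \quad \text{on } \partial D \qquad [21]$$
$$\mathrm{curl}\,A = H_a \quad \text{on } \partial D \qquad [22]$$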


where n denotes the outer unit normal. 


Meissner Solutions 


As seen in the case of no applied magnetic field, the trivial solution $(\Psi, A) = (\exp(ic), 0)$ is a minimizer of [18] when $H_a = 0$. This solution expresses no magnetic field in the sample. In a superconducting sample, the diamagnetism holds even in the presence of an applied magnetic field if the field is weak. Namely, the sample is shielded so that penetration of the field is only allowed near the surface of the sample. This phenomenon is called the Meissner effect. A solution expressing the Meissner effect is called a Meissner solution. Mathematically, it is understood that as $H_a$ increases, such a Meissner solution continues from the trivial solution. Then the solution preserves the configuration $0 < |\Psi(x)| < 1$. A study of the asymptotic behavior of the Meissner solution as $\kappa$ tends to $\infty$ shows that the Meissner solution is a minimizer up to $H_a = O(\log\kappa)$ for sufficiently large $\kappa$ (Serfaty 1999).


Nucleation of Superconductivity 


In an experiment, the Meissner state breaks down under a stronger applied magnetic field. Then the sample turns into the normal state (in a type I superconductor) or it allows a mixed state of superconductivity and normal state (in a type II superconductor). In the former case, the critical magnitude of the field is denoted by $H_c$, which corresponds to the one of [1], while it is denoted by $H_{c1}$ in the latter case. Moreover, the mixed state eventually breaks down to the normal state by further increasing the applied field up to another critical field $H_{c2}$. To characterize these two types mathematically, we consider a transition from the normal state to the superconducting state by reducing the magnitude of the field.

Let $A_{ap}$ satisfy $\mathrm{curl}\,A_{ap} = H_a$ ($x \in D$) and $A_{ap} \cdot n = 0$ ($x \in \partial D$). Then eqns [19]-[22] have a trivial solution $(\Psi, A) = (0, A_{ap})$, which stands for the normal state. Consider the second variation of the energy functional [18] at this trivial solution

$$\frac{1}{2}\frac{d^2}{ds^2}E(s\psi, A_{ap} + sB)\Big|_{s=0} = \int_D\left\{ |(\nabla - iA_{ap})\psi|^2 - \kappa^2|\psi|^2 + |\mathrm{curl}\,B|^2 \right\} dx$$

If the minimum of this second variation for nonzero $(\psi, B)$ is positive (or negative), then the trivial solution is stable (or unstable). The minimum gives the least eigenvalue of the linearized problem of [19]-[20] around the trivial solution. Seeking such a least eigenvalue $\mu$ is reduced to studying an eigenvalue problem of the Schrödinger operator $L[\psi] := -(\nabla - iA_{ap})^2\psi$.

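If the domain $D$ is the whole space $\mathbb{R}^2$, it is proved that $\mu = H_a$. Since the second variation becomes negative when $\mu < \kappa^2$, the normal state loses stability at $H_a = \kappa^2$ in the nondimensional variables. Back to the original variables of [2], we can define a critical field $H_{c2} = \sqrt{2}H_c\kappa$; $\kappa = 1/\sqrt{2}$ separates a class of superconductors into type I by $\kappa < 1/\sqrt{2}$ ($H_{c2} < H_c$) and type II by $\kappa > 1/\sqrt{2}$ ($H_{c2} > H_c$).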
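The statement $\mu = H_a$ on the whole plane can be checked on an explicit function. The sketch below assumes the symmetric gauge $A_{ap} = (H_a/2)(-x_2, x_1)$ and the Gaussian $\psi = \exp(-H_a|x|^2/4)$, which are standard choices but are not given in the text above; it applies $L$ by central finite differences and compares $L\psi/\psi$ with $H_a$. The helper name `apply_L` is hypothetical.

```python
import math

def apply_L(psi, A, x, y, h=1e-3):
    """Apply L[psi] = -(grad - iA)^2 psi at (x, y) by central differences.

    Assumes div A = 0, so that L psi = -Lap(psi) + 2i A.grad(psi) + |A|^2 psi.
    """
    lap = (psi(x + h, y) + psi(x - h, y) + psi(x, y + h) + psi(x, y - h)
           - 4 * psi(x, y)) / h**2
    dx = (psi(x + h, y) - psi(x - h, y)) / (2 * h)
    dy = (psi(x, y + h) - psi(x, y - h)) / (2 * h)
    a1, a2 = A(x, y)
    return -lap + 2j * (a1 * dx + a2 * dy) + (a1**2 + a2**2) * psi(x, y)

H_A = 2.0  # illustrative field strength (assumption)

def gaussian(x, y):
    # Candidate ground state in the symmetric gauge.
    return math.exp(-H_A * (x * x + y * y) / 4)

def symmetric_gauge(x, y):
    # curl A = H_A and div A = 0.
    return (-H_A * y / 2, H_A * x / 2)

if __name__ == "__main__":
    x0, y0 = 0.3, -0.4
    print(apply_L(gaussian, symmetric_gauge, x0, y0) / gaussian(x0, y0))
```

The printed ratio should be close to `H_A` at any sample point, up to the discretization error of the finite differences.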

In the bounded domain $D$, however, the critical field at which superconductivity nucleates in the interior of a sample is larger than $H_{c2}$ (it is denoted by $H_{c3}$), since the eigenvalue problem of $L$ is considered in the domain with the Neumann boundary condition. A study of the least eigenvalue $\mu$ shows that the critical field has the asymptotics

$$H_{c3}/\sqrt{2}H_c = \kappa/\beta + O(1), \quad \kappa \to \infty$$

where $0 < \beta < 1$. If the applied field is very close to $H_{c3}$ and $\kappa$ is sufficiently large, the amplitude of the eigenfunction associated with the least eigenvalue of $L$ (with the Neumann boundary condition) is very small except in a $1/\kappa$ neighborhood of the boundary. This implies that the nucleation of superconductivity takes place at the boundary. This phenomenon is called surface nucleation (Del Pino et al. 2000, Lu and Pan 1999).


Solutions of Vortices 


In a type II superconductor, it is well known that there exists a mixed state of superconductivity and normal state in a parameter regime $H_{c1} < H_a < H_{c2}$. In the mixed state, the magnetic field penetrating the sample is quantized such that it forms a finite number of lines or curves in the sample. This configuration (called a vortex) is characterized by zero sets of the order parameter of the Ginzburg-Landau equations. In a two-dimensional domain, isolated vanishing points of the order parameter are called vortices. Thus, it is quite an interesting problem how such a vortex configuration can be described mathematically by a minimizer of the energy functional. In the section "Ginzburg-Landau equations in $\mathbb{R}^2$," a specific configuration for vortex solutions is stated under very special conditions: $\kappa = 1/\sqrt{2}$, the whole space, and no applied magnetic field. However, this result has not been generalized to the present setting.

A standard approach to a solution with the vortex configuration is a bifurcation analysis near the critical field $H_{c2}$ (or $H_{c3}$), expanding a solution and the difference $H_a - H_{c2}$ in a small parameter. Then the leading term is given by an eigenfunction of the least eigenvalue of the Schrödinger operator coming from the linearization. Under doubly periodic conditions in the whole space $\mathbb{R}^2$, the spatial pattern of vortices, called Abrikosov's vortex lattice, is studied by a local bifurcation theory.

However, this kind of bifurcation analysis only works near the critical field and the trivial solution $(\Psi, A) = (0, A_{ap})$, which implies that only a small-amplitude solution can be found. To realize a sharp configuration of vortices, we need to consider a parameter regime far from the bifurcation point. As a matter of fact, mathematical and numerical studies for sufficiently large $\kappa$ exhibit clear configurations of vortex solutions. In this case, in a neighborhood of each vortex, with radius $O(1/\kappa)$, a sharp layer arises, and there exists a solution with multiple vortices in an appropriate parameter region for $H_a$. In addition, as $H_a$ increases (up to $H_{c2}$), the number of vortices also increases. This implies that the minimizer of the energy functional [18] admits a larger number of zeros for a higher magnitude of applied magnetic field. This looks puzzling, since a solution with a smaller number of vortices seems to have less energy. Thus, there is some balance mechanism between contributions of the vortices and the applied magnetic field to the total energy.

Mathematically, it is possible to estimate $E(\Psi, A)$ for the vortex solution to [19]-[22] as follows: consider a family of square tiles $K_j$ with side-length $\rho$ which are periodically arranged over the whole space. Assume each square in the domain $D$ has a single vortex. For an appropriate test function, the energy over $K_j$ is estimated as $O(\log(\kappa\rho))$. Since the number of vortices in the domain is $O(|D|/\rho^2)$ ($|D|$: the measure of $D$), we obtain an upper bound $O((|D|/\rho^2)\log(\kappa\rho))$. This bound is less than $E(0, A_{ap}) = |D|\kappa^2/2$ for $H_a/\kappa^2 = o(1)$ and $\rho = 1/\sqrt{H_a}$. Although in a general case it is difficult to estimate the energy of the minimizer from below, the leading order can be precisely determined in some range of the interval $(H_{c1}, H_{c2})$ if $\kappa$ is sufficiently large (Sandier and Serfaty 2000).
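The comparison between the vortex-lattice upper bound and the normal-state energy can be illustrated numerically. The sketch below drops all order-one constants hidden in the $O(\cdot)$ notation and divides out the common factor $|D|$, so it only indicates the scaling; the function name `vortex_bound_ratio` is hypothetical.

```python
import math

def vortex_bound_ratio(h_a, kappa):
    """Ratio of (1/rho^2) log(kappa*rho), with rho = 1/sqrt(h_a), to kappa^2/2.

    Both quantities are energies per unit area of D; constants in O(.) are dropped.
    """
    rho = 1 / math.sqrt(h_a)                    # tile side-length, one vortex per tile
    lattice = math.log(kappa * rho) / rho**2    # vortex-lattice bound per unit area
    normal = kappa**2 / 2                       # normal-state energy per unit area
    return lattice / normal

if __name__ == "__main__":
    kappa = 100.0
    for fraction in (1e-1, 1e-2, 1e-3, 1e-4):
        print(fraction, vortex_bound_ratio(fraction * kappa**2, kappa))
```

With $t = H_a/\kappa^2$ the ratio equals $t\log(1/t)$, which tends to zero as $t \to 0$, in line with the condition $H_a/\kappa^2 = o(1)$.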


A Simplified Model 


Since the Ginzburg-Landau equations [4]-[5] are coupled equations for $\Psi$ and $A$, we often encounter mathematical difficulty in realizing a solution with the configuration shown by a numerical experiment. To look at a specific configuration, we may use a simpler model equation. A typical simplification is to neglect the magnetic field, which leads to the equation for the order parameter $\Psi$:

$$\nabla^2\Psi + \kappa^2(1 - |\Psi|^2)\Psi = 0 \quad \text{in } \Omega \qquad [23]$$

This equation is also called the Ginzburg-Landau equation and it is the Euler-Lagrange equation of the energy

$$G(\Psi) = \int_\Omega\left\{ |\nabla\Psi|^2 + \frac{\kappa^2}{2}(1 - |\Psi|^2)^2 \right\} dx \qquad [24]$$

in an appropriate function space. Under no constraint, a constant solution with $|\Psi| = 1$ is a minimizer. If a domain is topologically nontrivial, eqn [23] also allows local minimizers of [24] for large $\kappa$ as seen in the section "Solutions for Persistent Current."

On the other hand, [23] in a simply connected domain $D \subset \mathbb{R}^2$ with a boundary condition $\Psi = g(x)$ ($x \in \partial D$) is used for a study of a vortex solution for large $\kappa$. Let $\kappa = 1/\varepsilon$. Under the constraint $\deg(g, \partial D) = d$, a minimizer $\Psi_\varepsilon$ must have at least $|d|$ zeros. The leading order of the energy around each vortex is estimated as $2\pi\log(1/\varepsilon)$. The result of Bethuel et al. (1994) describes the energy for a minimizer

$$G(\Psi_\varepsilon) = 2\pi|d|\log(1/\varepsilon) + \gamma + W(a_1^\varepsilon, \ldots, a_{|d|}^\varepsilon) + o(1)$$

where $\{a_j^\varepsilon\}$ are zeros of $\Psi_\varepsilon$ and $\gamma$ is a universal constant. The function $W$ is explicitly given as

$$W(a_1, \ldots, a_{|d|}) = -2\pi\sum_{1 \le j, k \le |d|,\ j \ne k}\log|a_j - a_k| + R$$

where $R$ is derived from a Green function satisfying some boundary condition depending on $g$. Moreover, as $\varepsilon \to 0$, the zeros converge to a minimizer of $W$, which implies that the asymptotic position of every zero (vortex) is determined by the explicit function $W$. The first term of $W$ shows that vortices with the same sign of the degree repel one another and the optimal arrangement of vortices never allows the superposition of multiple vortices. Although the boundary condition is rather artificial, their mathematical formulation promoted the development of variational methods applied to the Ginzburg-Landau equation.
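The repulsive character of the pair term in $W$ can be seen by evaluating it for a few point configurations. The sketch below computes only the logarithmic interaction sum with a $-2\pi$ prefactor over ordered pairs, as read from the formula above, and omits the boundary term $R$, whose form depends on $g$ and is not specified here; vortex positions are represented as complex numbers and the function name `vortex_interaction` is hypothetical.

```python
import math
from itertools import permutations

def vortex_interaction(points):
    """-2*pi * sum over ordered pairs j != k of log|a_j - a_k|.

    points: distinct vortex positions as complex numbers.
    The boundary-dependent term R of W is deliberately left out.
    """
    return -2 * math.pi * sum(math.log(abs(a - b))
                              for a, b in permutations(points, 2))

if __name__ == "__main__":
    for separation in (0.25, 0.5, 1.0, 2.0):
        print(separation, vortex_interaction([0j, complex(separation, 0.0)]))
```

The printed values decrease as the separation grows, so the pair term alone favors well-separated vortices; in a bounded domain it is the omitted term $R$ that keeps them inside.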


Time-Dependent Ginzburg-Landau 
Equations 


The Ginzburg-Landau equations in the preceding sections are static models. We now consider time evolution models called the time-dependent Ginzburg-Landau equations. The evolution equations are used in various numerical simulations exhibiting dynamical properties of solutions. They also provide mathematical problems on global-in-time behavior of solutions, stability of stationary solutions, dynamical laws of vortices, etc. The Ginzburg-Landau energy is denoted by $E(u)$, $u = (\Psi, A)$. The simplest model for the time-dependent problem is the gradient flow for $E(u)$

$$\partial_t u = -\frac{\delta E}{\delta u}$$

where $\delta E/\delta u$ is the first variation of the energy. A more standard evolution equation in a nondimensional form is given by

$$(\partial_t + i\phi)\Psi = D_A^2\Psi + \kappa^2(1 - |\Psi|^2)\Psi \qquad [25]$$
$$\eta(\partial_t A + \nabla\phi) + \mathrm{curl}^2 A = \mathrm{Im}(\Psi^* D_A\Psi) + \mathrm{curl}\,H_{ap} \qquad [26]$$

where $\phi(x, t)$ is the electric (scalar) potential and $\eta$ is a positive parameter representing a physical quantity. In fact, this equation was derived by Gor'kov and Eliashberg from the Bardeen, Cooper, and Schrieffer (BCS) theory.

The system of equations [25]-[26] is invariant under the following time-dependent gauge transformation:

$$(\Psi, \phi, A) \mapsto (\Psi\exp(i\chi), \phi - \partial_t\chi, A + \nabla\chi)$$

The equations in the bounded domain $\Omega \subset \mathbb{R}^n$ are considered subject to boundary and initial conditions

$$D_A\Psi \cdot n = 0 \ \text{on } \partial\Omega \times (0, T), \quad \mathrm{curl}\,A = H_a \ \text{on } \partial\Omega \times (0, T), \quad \Psi(x, 0) = \Psi_0(x) \ \text{in } \Omega, \quad A(x, 0) = A_0(x) \ \text{in } \Omega \qquad [27]$$


Then, besides the Coulomb gauge [8], we can choose the Lorentz gauge as follows:

$$\mathrm{div}\,A + \phi = 0 \ \text{in } \Omega, \quad A \cdot n = 0 \ \text{on } \partial\Omega, \quad \int_\Omega\phi\,dx = 0$$

For a smooth solution $u(x, t)$ to [25]-[26] with [27],

$$\frac{d}{dt}E(u(\cdot, t)) = -2\int_\Omega\left\{ |(\partial_t + i\phi)\Psi|^2 + \eta|\partial_t A + \nabla\phi|^2 \right\} dx \le 0$$

holds if $H_{ap}$ is time independent. This is also true in the case of the whole space $\mathbb{R}^n$ with a condition for the asymptotic behavior as $|x| \to \infty$.

Suppose that a domain $\Omega \subset \mathbb{R}^3$ is occupied by a superconducting sample and it is surrounded by a medium (or vacuum). Then the electromagnetic behavior in the outside domain, caused by the induced magnetic field of a supercurrent in $\Omega$ and an applied magnetic field, should be expressed by the Maxwell equations. With the electric field $E = -(\mu\partial_t A + \nabla\phi)$, we obtain

$$-\nu\partial_t E - \sigma E + \mathrm{curl}^2 A = \mathrm{curl}\,H_{ap} \quad \text{in } \mathbb{R}^3 \setminus \Omega$$

where $\mu$, $\nu$, and $\sigma$ are physical parameters (e.g., $\sigma = 0$ in the vacuum). To match the inside and the outside of $\Omega$, appropriate boundary conditions are required.

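From the point of view of gauge theory, as in the section "Ginzburg-Landau equations in $\mathbb{R}^2$," the following time-dependent equations in the whole space are also considered:

$$(\partial_t + i\phi)^2\Psi - D_A^2\Psi = \kappa^2(1 - |\Psi|^2)\Psi$$
$$-\partial_t E + \mathrm{curl}^2 A = \mathrm{Im}(\Psi^* D_A\Psi)$$
$$-\nabla \cdot E = \mathrm{Im}(\Psi^*(\partial_t + i\phi)\Psi)$$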

Other Topics 


In realistic problems, a superconducting sample 
contains impurities. This inhomogeneity is usually 
expressed by putting a variable coefficient into the 
Ginzburg-Landau energy and the equations. Such a 
model with a variable coefficient is useful in studies 
for pinning of vortices, the Josephson effect through an inhomogeneous medium, etc. A model in a thin film
with variable thickness is also described by the 
Ginzburg-Landau equations with a variable coeffi- 
cient. Since the Ginzburg-Landau equations (or a 
modified model) can be considered in various settings, 
more applications to realistic problems would be 
treated by the development of nonlinear analysis. 


See also: Abelian Higgs Vortices; Bifurcation Theory; Evolution Equations: Linear and Nonlinear; High $T_c$ Superconductor Theory; Image Processing: Mathematics; Integrable Systems: Overview; Interacting Stochastic Particle Systems; Ljusternik-Schnirelman Theory; Nonlinear Schrödinger Equations; Quantum Phase Transitions; Variational Techniques for Ginzburg-Landau Energies.
Introduction


Many macroscopic systems, if left to evolve in isolation or in contact with a bath, are able to relax, after a finite time, to history-independent equilibrium states characterized by time-independent values of the state variables and time-translation-invariant correlations. In glassy systems, the relaxation time becomes so large that equilibrium behavior is never observed. On short timescales, the microscopic degrees of freedom appear to be frozen in far-from-equilibrium disordered states. On longer timescales, slow, history-dependent, off-equilibrium relaxation phenomena become detectable.

The list of physical systems falling into disordered glassy states at low temperature is long. To mention just a few examples, one can cite the canonical case of simple and complex liquid systems undergoing a glass transition, polymeric glasses, dipolar glasses, spin glasses, charge density wave systems, and vortex systems in type II superconductors.

Experimental and theoretical research has pointed 
out the existence of dynamical scaling laws char- 
acterizing the off-equilibrium evolution of glassy 
systems. These laws, in turn, reflect the statistical 
properties of the regions of configuration space 
explored during relaxation. 

The goal of a theory of glassy systems is the 
comprehension of the mechanisms that lead to the 
growth of relaxation time and the nature of 
the scaling laws in off-equilibrium relaxation. 
A well-developed description of glassy phenomena 
is provided by mean-field theory based on spin glass 
models, which gives a coherent framework that is 
able to describe the dynamics of glassy systems and 
provides a statistical interpretation of glassy relaxa- 
tion. Despite important limitations of the mean-field 
description for finite-dimensional systems, it allows 
precise discussions of general concepts such as 
effective temperatures and configurational entropies 
that have been successfully applied to the descrip- 
tion of glassy systems. 

In the following, examples of two different ways of freezing will be discussed: spin glasses, where disorder is built into the random nature of the coupling between the dynamical variables, and structural glasses, where the disordered nature of the frozen state has a self-induced character.


A Glimpse of Freezing Phenomenology 
Spin Glasses 


The archetypical example of systems undergoing the 
complex dynamical phenomena described in this 
article is the case of spin glasses (Fischer and Hertz 
1991, Young 1997). Spin glass materials are magnetic systems where the magnetic atoms occupy random positions in lattices formed by nonmagnetic matrices, fixed at the moment of the preparation of the material. The exchange interaction between the spins of the magnetic impurities in these materials is an oscillating function, taking positive and negative values according to the distance between the atoms.

Spin glass models (see Spin Glasses, Mean Field Spin Glasses and Neural Networks, and Short-Range Spin Glasses: The Metastate Approach) are defined by giving the form of the exchange Hamiltonian, describing the interaction between the spins $S_i$ of the magnetic atoms. In the presence of an external magnetic field $h$, the exchange Hamiltonian can be written as

$$H = -\sum_{(i,j) \in \Lambda} J_{ij}\,S_i \cdot S_j - h \cdot \sum_{i \in \Lambda} S_i \qquad [1]$$

The spin variables can have a classical or quantum nature. This article will be limited to the physics of classical systems. The most common choice in models is to use Ising variables $S_i = \pm 1$. The couplings $J_{ij}$, which in real materials depend on the distance, are most commonly chosen to be independent random variables with a distribution with support on both positive and negative values. Most commonly, one considers either a symmetric bimodal distribution on $\{-1, 1\}$ or a symmetric Gaussian. The sums are restricted to lattices $\Lambda$ of various types. The most common choices are $\Lambda = \mathbb{Z}^d$ for the Edwards-Anderson model, the complete graph $\Lambda = \{(i, j) \mid i < j,\ i, j = 1, \ldots, N\}$ for the Sherrington-Kirkpatrick (SK) model, and the Erdős-Rényi random graph for the Viana-Bray (VB) model.

The presence of interactions of both signs induces 
frustration in the system: the impossibility of 
minimizing all the terms of the Hamiltonian at the 
same time. One then has a complex energy land- 
scape, where relaxation to equilibrium is hampered 
by barriers of energetic and entropic nature. 
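Frustration is easy to exhibit on the smallest possible example. The sketch below assumes Ising spins, zero field, and a three-spin system with hand-picked couplings, none of which comes from the text; it enumerates all configurations of $H = -\sum J_{ij}S_iS_j$ and counts how many bonds remain unsatisfied in a ground state. The function names are hypothetical.

```python
from itertools import product

def ground_states(couplings, n):
    """Brute-force minimum of H = -sum J_ij S_i S_j over Ising spins (h = 0).

    couplings: dict mapping index pairs (i, j) to the coupling J_ij.
    Returns the minimal energy and the list of minimizing configurations.
    """
    best, states = None, []
    for spins in product((-1, 1), repeat=n):
        e = -sum(J * spins[i] * spins[j] for (i, j), J in couplings.items())
        if best is None or e < best:
            best, states = e, [spins]
        elif e == best:
            states.append(spins)
    return best, states

def unsatisfied_bonds(couplings, spins):
    """Number of bonds whose term J_ij S_i S_j is negative (not minimized)."""
    return sum(1 for (i, j), J in couplings.items() if J * spins[i] * spins[j] < 0)

if __name__ == "__main__":
    frustrated = {(0, 1): 1, (1, 2): 1, (0, 2): -1}   # product of couplings < 0
    e_min, minima = ground_states(frustrated, 3)
    print(e_min, len(minima), unsatisfied_bonds(frustrated, minima[0]))
```

For the triangle with one negative coupling no configuration satisfies all three bonds, and the ground state is more degenerate than in the ferromagnetic triangle, a toy version of the complex energy landscape described above.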

Spin glass materials, which have a paramagnetic behavior at high temperature, show glassy behavior at low temperature, where magnetic degrees of freedom appear to be frozen for long times in apparently random directions. There is quite a general consensus, based on the analysis of the experimental data and the numerical simulations, that in three dimensions and in the absence of a magnetic field, the two regimes are separated by a thermodynamic phase transition at a temperature $T_c$ where the magnetic response $\chi$ exhibits a cusp (see Figure 1). By linear response, $\chi$ is related to the equilibrium spin correlation function

$$\chi = \frac{\beta}{N}\sum_{i,j}\left(\langle S_i S_j\rangle - \langle S_i\rangle\langle S_j\rangle\right)$$

having denoted by $\langle\cdot\rangle$ the Boltzmann-Gibbs average and by $\beta = 1/T$ the inverse temperature. A cusp in $\chi$ indicates a second-order transition where the so-called Edwards-Anderson parameter $q = (1/N)\sum_i\langle S_i\rangle^2$ becomes different from zero, indicating freezing of the spins in random directions. In the presence of a magnetic field, although the low-temperature phenomenology is similar to the one at zero field, the thermodynamic nature of the freezing transition is more controversial. Theoretically, mean-field theory, based on the SK model, predicts a phase transition with a cusp in the susceptibility both in the absence and in the presence of a magnetic field. Unfortunately, no firm theoretical result is available on the existence and the nature of phase transitions in finite-dimensional spin glass models, which remains a completely open problem.
Figure 1 Magnetic susceptibility as a function of temperature
in spin glass materials. Reproduced from Fischer KH and 
Hertz JA (1991) Spin Glasses. Cambridge, UK: Cambridge 
University Press. 


Structural Glasses 


Analogous freezing of dynamical variables is 
observed in a variety of systems. Some of them 
share with the spin glasses the presence of quenched 
disorder; in many others, this feature is absent. This 
is the case of structural glasses (Debenedetti 1996). 

Many liquids under fast enough cooling, instead of crystallizing, as dictated by equilibrium thermodynamics, form glasses. Simple liquids can be modeled as classical systems of particles with pairwise interactions. In the simplest example of a monoatomic liquid, the potential energy of a configuration is then written as

$$V(r_1, \ldots, r_N) = \sum_{i<j}\phi(r_i - r_j) \qquad [2]$$

In the case of atomic mixtures, the potential $\phi$ acquires a dependence on the species of the interacting atoms.

Liquids can be characterized as good or bad glass
formers according to the ease with which they form
glasses. In good glass formers, to avoid crystallization
it is in general sufficient to cross the region around
the liquid-crystal transition point fast enough, so that
the system can settle into a supercooled liquid
metastable equilibrium. On lowering the temperature, the
supercooled liquid becomes denser and more viscous, while
the relaxation time $\tau$ of the system, related to the
viscosity $\eta$ through the Maxwell relation
$\tau = \eta/G$ ($G$ is the instantaneous shear modulus
of the liquid), undergoes a rapid growth. One defines a
conventional glass transition temperature $T_g$ as the
point where $\eta$ takes the solid-like value
$\eta = 10^{13}$ Poise, corresponding to a relaxation
time $\tau \sim 100$ s. Below that point, the system
falls out of equilibrium; under usual experimental
conditions, it does not have the time to adjust to
external perturbations and behaves mechanically like a
solid. The glass transition temperature is thus
characterized as the point where the liquid goes out of
equilibrium: the relaxation time becomes larger than the
external timescale, and the positions of the atoms
appear frozen on that scale.
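
As a rough consistency check of the two conventional values just quoted, the Maxwell relation can be inverted for the shear modulus. The conversion 1 Poise = 0.1 Pa s is standard; the resulting modulus is only the order of magnitude implied by those two numbers, not a value given in the text.

```latex
G = \frac{\eta}{\tau}
  \sim \frac{10^{13}\ \mathrm{Poise}}{10^{2}\ \mathrm{s}}
  = \frac{10^{12}\ \mathrm{Pa\,s}}{10^{2}\ \mathrm{s}}
  = 10^{10}\ \mathrm{Pa}
```

A modulus of about ten gigapascals is a solid-like scale, in line with the statement that the system behaves mechanically like a solid at that point.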

A great effort has been devoted to understanding the
temperature dependence of the relaxation time and the
nature of the dynamical processes in supercooled
liquids. In deeply supercooled liquids, the empirical
behavior of the relaxation time ranges from the
Arrhenius form $\tau \sim \exp(A/T)$ for “strong
glasses” to the Vogel-Fulcher form
$\tau(T) \sim \exp(D/(T - T_0))$ for “fragile glasses.”
The Vogel-Fulcher law predicts a finite-temperature
divergence of the relaxation time at the temperature
$T_0$. Unfortunately, in typical cases $T_0$ is
estimated to be 10-15% lower than $T_g$, so that it is
not possible to test the law close enough to $T_0$ to
support the divergent behavior.
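
The contrast between the two empirical forms can be made concrete with a minimal sketch. The function names, the prefactor `tau0`, and all numerical values are illustrative assumptions and are not fitted to any material.

```python
import math


def arrhenius_tau(T, A, tau0=1.0):
    """Arrhenius form tau0 * exp(A / T), the "strong glass" behavior."""
    return tau0 * math.exp(A / T)


def vogel_fulcher_tau(T, D, T0, tau0=1.0):
    """Vogel-Fulcher form tau0 * exp(D / (T - T0)), defined for T > T0."""
    if T <= T0:
        raise ValueError("the Vogel-Fulcher form is only defined for T > T0")
    return tau0 * math.exp(D / (T - T0))


if __name__ == "__main__":
    A = D = 5.0   # hypothetical barrier parameters
    T0 = 0.85     # hypothetical divergence temperature
    for T in (2.0, 1.5, 1.2, 1.0, 0.95):
        print(T, arrhenius_tau(T, A), vogel_fulcher_tau(T, D, T0))
```

With equal parameters the two forms coincide for `T0 = 0`, while for `T0 > 0` the Vogel-Fulcher time grows much faster as `T` approaches `T0` from above.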

As a consequence of freezing, one observes important
qualitative changes in the behavior of thermodynamic
quantities, similar to those encountered in equilibrium
phase transitions. In a narrow interval around $T_g$,
specific heat and compressibility undergo jumps from
liquid-like values to much lower solid-like values.


Aging and Slow Dynamics 


While the crudest picture of the glass transition
describes freezing as complete structural arrest, the
study of dynamical quantities reveals the existence of
persistent, history-dependent, slow relaxation processes
in the frozen phase (Norblad and Svendlidh 1997). This
holds both where the glass transition is a genuine
off-equilibrium phenomenon, as in structural glasses,
and where it has a thermodynamic character, as in spin
glasses. This is the phenomenon of aging, which is a
constitutive feature of the glassy state. Its
theoretical analysis plays a central role in
understanding the way glassy systems explore
configuration space. A first characterization of
relaxation is given by the behavior of “one-time
quantities” like internal energy, density, etc., which
slowly evolve in the course of time towards values
corresponding to states of lower free energy. More
interesting is the behavior of “two-time quantities,”
time-dependent correlation functions and responses,
which reveal the deep off-equilibrium nature of glassy
relaxation. In experimental, numerical, and theoretical
studies, a special position is occupied by the linear
response function. Using the language of magnetic
systems, appropriate to spin glasses, one considers the
response of the magnetization to an applied magnetic
field. To deal with other systems, one considers
different pairs of conjugate variables, with simple
changes of language. Linear perturbations make it
possible to probe the dynamics of the system without
affecting its evolution. Denoting by $M(t)$ the
magnetization at time $t$ and by $h(t')$ the magnetic
field at time $t'$, the instantaneous linear response
function is defined as


$$R(t,t') = \frac{\delta \langle M(t) \rangle}{\delta h(t')} \qquad [3]$$


Measurements of the time integral of $R(t,t')$ are
commonly performed to reveal the presence of aging in
glassy systems. Aging is usually studied by observing
the dynamics that follows a rapid quench from high
temperature, at an instant that marks the origin of
time. One can reconstruct the response function by
measuring the zero-field-cooled (ZFC) magnetization as
the response to a magnetic field acting from a waiting
time $t_w$ to the measuring time $t$,


$$\chi_{ZFC}(t,t_w) = \int_{t_w}^{t} dt'\, R(t,t') \qquad [4]$$


or its complement, the thermoremanent magnetization
(TRM), corresponding to the response to a magnetic
field acting from the time of the quench up to $t_w$,


$$\chi_{TRM}(t,t_w) = \int_0^{t_w} dt'\, R(t,t') \qquad [5]$$
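
The sense in which the two protocols are complementary can be written out in one line. Since the ZFC field acts from $t_w$ to $t$ and the TRM field from the quench to $t_w$, the two integrals split the same range at $t_w$. The symbol $\chi_{tot}$ is only a label introduced here for the response to a field applied during the whole interval.

```latex
\chi_{ZFC}(t,t_w) + \chi_{TRM}(t,t_w)
  = \int_{t_w}^{t} dt'\, R(t,t') + \int_{0}^{t_w} dt'\, R(t,t')
  = \int_{0}^{t} dt'\, R(t,t') \equiv \chi_{tot}(t)
```

Measuring either quantity at fixed $t$ therefore determines the other once the total response is known.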


In Figure 2, the behavior of the susceptibility
$\chi_{ZFC}$ is shown as a function of $t - t_w$ in a
typical example of an aging experiment at low
temperature. Out-of-equilibrium behavior is manifest in
the dependence of the curves on the waiting time $t_w$.
The relaxation appears slower and slower for larger
waiting times, and the $t_w$ dependence does not
disappear even for very large times. Two nontrivial
dynamical regimes can be identified: a first regime for
small $t - t_w$, that is, $t - t_w \ll t_w$, where the
relaxation is independent of $t_w$, and a second regime,
roughly valid for $t - t_w \sim t_w$, where
time-translation invariance is manifestly violated. The
analysis of experimental and simulation data shows a
scale-invariant behavior, according to which curves
corresponding to different waiting times can be
superimposed by rescaling the time difference $t - t_w$
with a suitable $t_w$-dependent relaxation time
$\tau(t_w)$. This is a growing function of $t_w$ which
seems to diverge for large $t_w$. Up to the waiting
times where it has been possible to test the relation,
$\tau(t_w)$ behaves as a power law
$\tau(t_w) \sim t_w^{a}$ where, in different materials
and models, $a = 0.8$-$0.9$.

Figure 2 ZFC magnetization in an aging experiment. The
curves, from bottom to top, correspond to increasing waiting 
times. Reproduced from Norblad P and Svendlidh P (1997) 
Experiments in spin glasses. In: Young AP (ed.) Spin Glasses 
and Random Fields. Singapore: World Scientific, with permission 
from World Scientific Publishing Co. Pte Ltd. 

Many efforts have been devoted to understanding the
scaling laws in aging (Bouchaud et al.). Among the
theories and models of aging that have been proposed,
one can cite the phenomenological “trap model,”
developed by Bouchaud and collaborators, which likens
aging to a random walk between “traps” characterized by
a broad distribution of trapping times. Suitable
choices of the trapping-time distribution make it
possible to derive scaling laws similar to the ones
characteristic of aging systems. A different theory, the
“droplet model” for spin glasses, likens aging
phenomena to the competition between slowly growing
domains of equilibrium phases, in analogy with the
dynamics of phase separation in first-order phase
transitions. The approach that has led to the most
detailed and spectacular predictions has been the study
of microscopic mean-field models.
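
The random-walk picture of the trap model mentioned in this paragraph can be illustrated with a toy renewal simulation. This is only a sketch under stated assumptions: the Pareto law for the trapping times, the exponent `x`, and the function names are choices made here, not the specific model of Bouchaud and collaborators. With a tail exponent `x < 1` the mean trapping time is infinite, and the probability of not leaving the current trap between `t_w` and `2 * t_w` tends to a constant, so the relevant timescale grows with the waiting time.

```python
import random


def stays_in_trap(t_w, t, x, rng):
    """Return True if no jump occurs in the window (t_w, t_w + t].

    The walker enters a new trap at time 0 and after every jump.
    Trapping times follow a Pareto law, P(tau > s) = s**(-x) for s >= 1.
    """
    clock = 0.0
    while True:
        clock += rng.paretovariate(x)  # time at which the current trap is left
        if clock > t_w:
            # The trap occupied at time t_w is left at time `clock`.
            return clock > t_w + t


def persistence(t_w, ratio, x, n_samples=4000, seed=0):
    """Estimate the probability of staying put from t_w to (1 + ratio) * t_w."""
    rng = random.Random(seed)
    hits = sum(stays_in_trap(t_w, ratio * t_w, x, rng) for _ in range(n_samples))
    return hits / n_samples


if __name__ == "__main__":
    for t_w in (1e2, 1e3, 1e4):
        print(t_w, persistence(t_w, 1.0, 0.5))
```

For `x = 0.5` the estimates for different waiting times should all be close to one half, the value given by the generalized arcsine law for renewal processes; a trapping-time law with finite mean would instead give a probability that decays as `t_w` grows.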


Mean-Field Models of Disordered Systems


Mean-field theory starts from the analysis of the
relaxation dynamics of disordered systems with weak
long-range forces (Bouchaud et al.). The reference
model of spin glass mean-field theory is the so-called
p-spin model, which considers $N$ spins $S_i$ with
random p-body interactions with each other and is
described by the Hamiltonian


$$H_p[S] = -\sum_{i_1 < \cdots < i_p} J_{i_1 \ldots i_p} S_{i_1} \cdots S_{i_p} \qquad [6]$$


where the quenched coupling constants
$J_{i_1 \ldots i_p}$ are assumed to be i.i.d. Gaussian
variables with zero average and N-dependent variance
$E[J_{i_1 \ldots i_p}^2] = p!/(2N^{p-1})$. The case
$p = 2$ coincides with the SK model defined in the
introduction. The reason for considering the p-spin
generalization is that the order of the transition
changes from second order for $p = 2$ to first order
for $p \geq 3$, and that this last case has been
suggested to provide a mean-field limit for the
structural glass transition. It is also useful to
define Hamiltonians


$$H[S] = \sum_{p \geq 1} a_p H_p[S] \qquad [7]$$


that mix p-spin Hamiltonians for different p. These 
are random Gaussian functions of the spin variables, 
with covariance induced by the coupling distribution 


$$E[H[S]\, H[S']] = N f(q(S,S')) = \frac{N}{2} \sum_{p \geq 1} a_p^2\, q^p(S,S') \qquad [8]$$


where the function


$$q(S,S') = \frac{1}{N} \sum_{i=1}^{N} S_i S'_i$$


is the overlap between configurations. A crucial
hypothesis in the study of relaxation in spin systems
is that any local spin update rule satisfying the
detailed balance condition with respect to the
Boltzmann-Gibbs measure gives rise to the same
long-time properties. In this perspective, in Monte
Carlo simulations, it is convenient to use Ising spins
with Metropolis or Glauber dynamics. Much theoretical
progress has been achieved by considering spherical
models, where the spin variables are real numbers
subject to a global spherical constraint
$\sum_i S_i^2 = N$ and evolve according to the
following Langevin dynamics:


$$\frac{dS_i(t)}{dt} = -\frac{\partial H}{\partial S_i} - \mu(t) S_i(t) + \eta_i(t) \qquad [9]$$


where $\mu(t)$ is a time-dependent multiplier that at
each instant of time ensures that the spherical
constraint is respected, and $\eta_i(t)$ is a thermal
white noise with variance


$$E(\eta_i(t)\eta_j(s)) = 2T \delta_{ij}\, \delta(t - s) \qquad [10]$$


In order to model the quench from high temperature
performed in experiments, the initial conditions are
randomly chosen with uniform probability. To describe
long but finite-time dynamics, it is necessary to
consider the limit of large volume $N \to \infty$ at
finite time, which is the only case where one can have
infinite relaxation times. Application of functional
Martin-Siggia-Rose techniques has allowed the
derivation of closed integro-differential equations for
the spin autocorrelation function


$$C(t,t') = \lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} \langle S_i(t) S_i(t') \rangle$$


and the response to an impulsive external field


$$R(t,t') = \lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} \frac{\delta \langle S_i(t) \rangle}{\delta h_i(t')}$$


where the average is taken over the quenched disordered
couplings, the initial conditions, and the realizations
of the thermal noise. Unfortunately, in the case
$p = 2$ relevant for spin glass phenomenology, the
spherical constraint reduces the model to a linear
system where different eigenmodes of the interaction
matrix $J_{ij}$ evolve independently. This
oversimplification renders the model similar to systems
suited to describing phase separation rather than
freezing phenomena. Many of the glassy features of the
SK model, however, are captured by a mixture of $p = 2$
with $p = 4$ Hamiltonians and
$f(q) = (1/2)(q^2 + a q^4)$.
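
The link between the coupling variance and the function $f(q)$ can be checked by hand in the simplest case. The sketch below assumes $p = 2$, Ising configurations, and independent zero-mean couplings of variance $2!/(2N) = 1/N$; the function names are illustrative. Because the couplings are independent, the disorder average of $H_2[S] H_2[S']$ reduces to a sum over pairs, which can be evaluated exactly and compared with $N q^2/2$.

```python
from itertools import combinations


def overlap(S, Sp):
    """Overlap q(S, S') = (1/N) * sum_i S_i * S'_i."""
    return sum(a * b for a, b in zip(S, Sp)) / len(S)


def covariance_p2(S, Sp):
    """Exact E[H_2(S) H_2(S')] for independent couplings of variance 1/N.

    Only the diagonal terms of the double sum survive the average:
    sum over pairs i < j of Var(J_ij) * S_i * S_j * S'_i * S'_j.
    """
    N = len(S)
    var = 1.0 / N
    return sum(var * S[i] * S[j] * Sp[i] * Sp[j]
               for i, j in combinations(range(N), 2))


if __name__ == "__main__":
    S = [1] * 8
    Sp = [1, 1, 1, 1, 1, 1, -1, -1]
    q = overlap(S, Sp)
    print(q, covariance_p2(S, Sp), len(S) * q * q / 2)
```

For Ising spins the exact result is $N q^2/2 - 1/2$, so the covariance equals $N f(q)$ with $f(q) = q^2/2$ up to a correction of order one that is negligible at large $N$.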
For the general Hamiltonian [7] one gets the 
coupled equations 


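$$\frac{\partial C(t,t')}{\partial t} = -\mu(t)\, C(t,t') + \int_0^{t} dt''\, f''(C(t,t''))\, R(t,t'')\, C(t'',t') + \int_0^{t'} dt''\, f'(C(t,t''))\, R(t',t'')$$

$$\frac{\partial R(t,t')}{\partial t} = -\mu(t)\, R(t,t') + \int_{t'}^{t} dt''\, f''(C(t,t''))\, R(t,t'')\, R(t'',t') \qquad [11]$$

$\mu(t)$ is a multiplier that at each instant of time
ensures the spherical constraint $C(t,t) = 1$, and is
determined by


$$\mu(t) = \int_0^{t} dt''\, \left[ f''(C(t,t''))\, C(t,t'') + f'(C(t,t'')) \right] R(t,t'') + T \qquad [12]$$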


In the next sections, we will discuss how these
equations describe dynamical freezing at low
temperature. The gross features are determined by the
form of the function $f(q)$. Two main behaviors can be
identified:


1. Systems of type I. This behavior is found if
$f'''(q)/(f''(q))^{3/2}$ is a monotonically decreasing
function of $q$. The pure spherical p-spin model for
$p \geq 3$ belongs to this family. One finds a
dynamical transition, not corresponding to a point of
singularity of the free energy, where the
Edwards-Anderson parameter jumps discontinuously to a
nonzero value. Models of this family have been proposed
as appropriate mean-field limits for structural glass
behavior.

2. Systems of type II. This behavior is found if
$f'''(q)/(f''(q))^{3/2}$ is a monotonically increasing
function of $q$. This family mimics the behavior of the
SK model. An example of a function $f$ satisfying the
condition for type II behavior is
$f(q) = (1/2)(q^2 + a q^4)$ for sufficiently small but
positive values of $a$. In this case, the dynamical
transition is found at a point of second-order
singularity of the free energy, and the
Edwards-Anderson parameter is continuous at the
transition. Models of this family provide a mean-field
limit for spin glass type behavior.
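
The two criteria in this list can be tested numerically for a given $f$. The sketch below is an illustration under stated assumptions: the ratio is taken to be $f'''(q)/(f''(q))^{3/2}$, monotonicity is checked only on a finite grid in $(0, 1]$, and the function names and the label "mixed" for the remaining case are choices made here.

```python
def g(f2, f3, q):
    """Evaluate f'''(q) / f''(q)**1.5 from callables for f'' and f'''."""
    return f3(q) / f2(q) ** 1.5


def classify(f2, f3, n_grid=200):
    """Return "I" if g decreases on the grid, "II" if it increases, else "mixed"."""
    qs = [(k + 1) / n_grid for k in range(n_grid)]
    vals = [g(f2, f3, q) for q in qs]
    diffs = [b - a for a, b in zip(vals, vals[1:])]
    if all(d < 0 for d in diffs):
        return "I"
    if all(d > 0 for d in diffs):
        return "II"
    return "mixed"


if __name__ == "__main__":
    # Pure p = 3 model: f = q**3 / 2, so f'' = 3 q and f''' = 3.
    print(classify(lambda q: 3.0 * q, lambda q: 3.0))
    # Mixture f = (q**2 + a q**4) / 2: f'' = 1 + 6 a q**2, f''' = 12 a q.
    a = 0.05
    print(classify(lambda q: 1.0 + 6.0 * a * q * q, lambda q: 12.0 * a * q))
```

For the mixture the ratio is $12aq/(1 + 6aq^2)^{3/2}$, which increases as long as $q^2 < 1/(12a)$; on the whole interval $(0, 1]$ this requires $a < 1/12$, a concrete reading of "sufficiently small" within this sketch.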


Equilibrium Dynamics at High Temperature 


At high temperature, after a finite transient, eqn [11]
describes equilibrium behavior. In these conditions,
time-translation invariance holds,
$C(t,t') = C(t - t')$, $R(t,t') = R(t - t')$, while the
Lagrange multiplier $\mu$ becomes time independent. In
addition, correlation and response are related by the
fluctuation-dissipation theorem (FDT) relation


$$R(t) = -\frac{1}{T} \frac{dC(t)}{dt} \qquad [13]$$

Ergodic behavior is manifest in the fact that the
dynamics decorrelates completely:
$\lim_{t \to \infty} C(t) = 0$. Then from [11] one gets
the equilibrium equation:


$$\frac{dC(t)}{dt} + T\, C(t) + \frac{1}{T} \int_0^{t} ds\, f'(C(t - s))\, \frac{dC(s)}{ds} = 0 \qquad [14]$$


It is worth noticing that this equation, apart from an
irrelevant inertial term, coincides for type I systems
with the schematic mode-coupling theory (MCT) equation
which has been successfully used to describe moderately
supercooled liquids (Goetze 1989). In the context of
liquid theory, mode-coupling equations stem from an
approximate treatment of the dynamical evolution of the
space- and time-dependent density-density correlation
function. The schematic MCT equations consider an
equation for a single mode, neglecting any space
dependence of the correlator.

Both in type I and in type II systems, eqn [14]
displays a dynamical transition at a finite temperature
$T_c$ where the relaxation time diverges as a power law
$\tau \sim |T - T_c|^{-\gamma}$ and the asymptotic
value of the correlation acquires a nonzero value.

This behavior in type I systems represents a failure of
MCT to describe the temperature dependence of the
relaxation time in supercooled liquids, which, as
previously observed, empirically follows the
Vogel-Fulcher law. The MCT temperature is interpreted
as a singularity which is avoided in supercooled
liquids, thanks to relaxation mechanisms specific to
short-range systems. It has been noticed that this
singularity at $T_c$ can be associated with the growth
of spatial heterogeneities and dynamical correlations,
as exemplified in the behavior of the four-point
function


$$\chi_4(t) = \frac{1}{N} \sum_{i,j} \langle S_i(t) S_j(t) S_i(0) S_j(0) \rangle$$


and its associated correlation length (Franz and Parisi
2000, Biroli and Bouchaud 2004).
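
The high-temperature equilibrium equation of this section, taken in the form $dC/dt = -T C(t) - (1/T) \int_0^t ds\, f'(C(t-s))\, dC(s)/ds$ with $C(0) = 1$, can be integrated with a few lines of code. The following is a minimal sketch, assuming that form of the equation, an explicit Euler scheme, and illustrative values of the step and temperature; it is not an efficient or production-grade integrator, since the cost grows quadratically with the number of steps.

```python
def integrate_equilibrium(T, f_prime, t_max=10.0, dt=0.01):
    """Euler scheme for dC/dt = -T*C - (1/T) * int_0^t ds f'(C(t-s)) dC/ds."""
    n_steps = int(round(t_max / dt))
    C = [1.0]  # spherical constraint: C(0) = 1
    for n in range(n_steps):
        # Memory integral: (dC/ds) ds is approximated by the increment
        # C[k+1] - C[k], weighted by the kernel f'(C) at time t - s.
        memory = sum(f_prime(C[n - k]) * (C[k + 1] - C[k]) for k in range(n))
        C.append(C[n] + dt * (-T * C[n] - memory / T))
    return C


if __name__ == "__main__":
    # Pure p = 3 model, f(q) = q**3 / 2, well above the dynamical transition.
    C = integrate_equilibrium(2.0, lambda q: 1.5 * q * q)
    for n in (0, 100, 200, 500, 1000):
        print(n * 0.01, C[n])
```

Well above the transition the correlation decays quickly to zero, as expected for ergodic behavior. Approaching the transition from above, the memory term slows the decay and much longer integration times (and better schemes) would be needed.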


Off-Equilibrium Dynamics Below $T_c$: Aging and Slow Dynamics


Type I systems Below the transition temperature $T_c$,
slow dynamics and aging set in. In 1993, Cugliandolo
and Kurchan found a long-time solution to the equations
of motion [11] for type I systems, describing an
asymptotic off-equilibrium state that follows a quench
from high temperature. Soon after, type II systems were
also analyzed (Bouchaud et al.).

The equations can be analyzed in the limit in which
both times tend to infinity, $t, t' \to \infty$. In
this regime all “one-time quantities,” that is, state
functions like energy, magnetization, etc., reach an
asymptotic time-independent limit. Though the decay to
the asymptotic value cannot be read directly from the
analysis of the equations in that limit, numerical and
theoretical evidence suggests that the final values are
approached as power laws in time.

The study of correlation and response functions 
displays an asymptotic scaling behavior similar to 
the one observed in glassy systems in laboratory and 
numerical experiments. 

Two different interesting regimes are found. First of
all, there is a stationary regime: the limit
$t, t_w \to \infty$ is performed keeping the difference
$t - t_w = s$ finite. In this regime, equilibrium
behavior is observed, with correlation and response
related by the FDT relation
$R_{st}(s) = -\beta\, \partial C_{st}(s)/\partial s$
(with $\beta = 1/T$). The stationary regime is followed
by an aging regime, where correlations decay from the
value $q_{EA} = \lim_{s \to \infty} C_{st}(s)$ down to
zero. One of the most striking features of aging
evolution is that the system, though at a decreasing
speed, constantly moves far away from any visited
region of configuration space. The decay of
correlations is nonstationary and takes place on a
timescale $\tau(t_w)$ diverging for large $t_w$. While
the theory can infer the existence of the timescale
$\tau(t_w)$, its precise form remains undetermined.
This is a consequence of an asymptotic invariance under
monotonic time reparametrizations $t \to g(t)$
appearing for large times. Consistently with
nonstationary behavior, other equilibrium properties
break down in the aging regime. Correlation and
response, which do not satisfy the FDT, are instead
asymptotically related by a generalized form of the
fluctuation-dissipation relation


$$R_{ag}(t,t_w) = \beta X\, \frac{\partial C_{ag}(t,t_w)}{\partial t_w} \qquad [15]$$


This relation, despite predicting the vanishing of the
instantaneous response, implies a finite contribution
of the aging dynamics to the value of the integrated
ZFC and TRM responses. The constant $X$, called the
fluctuation-dissipation ratio (FDR), is a
temperature-dependent factor varying monotonically
between the values 1 and 0 as the temperature is
decreased from $T_c$ down to zero. Violations of the
FDT have to be expected in any off-equilibrium regime;
however, a constant ratio between response and
derivative of the correlation is very nongeneric. It is
of great theoretical importance that the same constant
that governs the FDR between spin autocorrelation and
magnetic response also appears in the relation between
any other conceivable pair of correlation and conjugate
response in the system. Slow dynamics can be
interpreted as motion between finite-life metastable
states with well-defined free energy $f$ and
exponential multiplicity $\exp(N\Sigma(f))$. The FDR
satisfies the generalized thermodynamic relation


$$\frac{\partial \Sigma}{\partial f} = \frac{X}{T} \qquad [16]$$


This relation is in turn intimately related to the
possibility of considering the ratio $T_{eff} = T/X$ as
an effective temperature that governs the heat
exchanges among slow degrees of freedom (Cugliandolo
et al. 1997). Slow degrees of freedom do not exchange
heat with the fast ones, but they are in equilibrium
among themselves at the temperature $T_{eff}$. The
validity of relation [16] has been put at the basis of
a detailed statistical description of the glassy state
(Franz and Virasoro 2000, Biroli and Kurchan 2001,
Nieuwenhuizen 2000) which assumes that metastable
states with equal free energy are encountered with
equal probability during the descent to equilibrium.
Modified thermodynamic relations follow, which condense
all the dependence on the thermal history into the
value of the effective temperature. Given the interest
of a thermodynamic description of the glassy state,
many numerical studies have addressed the
identification and determination of effective
temperatures from the fluctuation-dissipation
relations, and their relation with configurational
entropy. In Figure 3 the result of a numerical study on
a realistic system is presented, verifying relation
[15]. Experimental verifications are just beginning,
and new results are expected in the future.
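
Integrating the response over time and changing variable to the correlation gives the parametric curve that such numerical studies plot: the integrated response equals $(1/T)$ times the integral of the FDR from $C$ to 1. A minimal sketch for the type I scenario follows, assuming an FDR equal to 1 for $C \geq q_{EA}$ and to a constant $X$ below it; the function names and the numerical values are illustrative.

```python
def chi_vs_C(C, T, q_ea, X):
    """Integrated response versus correlation in the type I scenario.

    Slope -1/T for C >= q_ea (stationary regime, FDT holds);
    slope -X/T for C < q_ea (aging regime, constant FDR X).
    """
    if not 0.0 <= C <= 1.0:
        raise ValueError("C must lie in [0, 1]")
    if C >= q_ea:
        return (1.0 - C) / T
    return (1.0 - q_ea) / T + X * (q_ea - C) / T


def effective_temperature(T, X):
    """Effective temperature T / X of the slow degrees of freedom."""
    return T / X


if __name__ == "__main__":
    T, q_ea, X = 0.5, 0.6, 0.4  # hypothetical values
    for C in (1.0, 0.8, 0.6, 0.3, 0.0):
        print(C, chi_vs_C(C, T, q_ea, X))
    print(effective_temperature(T, X))
```

The curve is made of two straight segments joined at $C = q_{EA}$; reading off the slope of the second segment gives $X/T$, that is, the inverse effective temperature.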


Type II systems In these systems the dynamic transition
occurs at the point of thermodynamic singularity, where
the Edwards-Anderson parameter becomes nonzero in a
second-order fashion. The magnetic susceptibility
exhibits a cusp singularity similar to the one found in
spin glass materials. Unlike in type I systems,
one-time quantities tend to their equilibrium values at
long times. The off-equilibrium nature of the
relaxation shows up in the behavior of correlations and
responses, which display aging behavior.

Figure 3 Fluctuation-dissipation plot: $\chi_{ZFC}(t,t_w)$
vs. $C(t,t_w)$ in a model of Lennard-Jones glass for
different values of the waiting time $t_w$. The slope of
the curves is equal to the finite-time FDR divided by
the temperature. One observes the characteristic shape
of type-I systems, with an FDR equal to 1 in the
stationary regime for high correlations and equal to a
constant smaller than 1 in the low-correlation aging
regime. Reproduced from Kob W and Barrat J-L (1999)
Europhysics Letters 46: 637, with permission from EDP
Sciences.

Their behavior generalizes the one found in type I
systems, with a more complex pattern of violation of
time-translation invariance and FDT. Also in this case
one can identify a short-time equilibrium behavior,
where the correlation decreases from 1 to $q_{EA}$, and
a long-time inhomogeneous aging behavior, where
correlations decrease to zero. Unlike in type I
systems, it is impossible to characterize aging through
a unique timescale $\tau(t_w)$. One finds instead a
continuum of hierarchically organized timescales. The
analysis of the equations at the
reparametrization-invariant level reveals the existence
of a continuum of separate timescales $\tau(t_w, q)$,
one for each value of $C(t,t_w) = q < q_{EA}$, such
that $\lim_{t_w \to \infty} \tau(t_w, q)/\tau(t_w, q') = 0$
for $q > q'$, meaning that for large but finite $t_w$
the time to decay to $q'$ is much larger than the time
to decay to $q$. For large times,
$1 \ll t_1 \ll t_2 \ll t_3$, the correlations satisfy
the ultrametric property
$C(t_3,t_1) = \min[C(t_3,t_2), C(t_2,t_1)]$. To each
timescale corresponds in this case a different
effective temperature, and correlation and response are
related by the equation

$$X(q) = \lim_{t,t' \to \infty,\ C(t,t') = q} \frac{T\, R(t,t')}{\partial C(t,t')/\partial t'} \qquad [17]$$

where the function $X(q)$ is an increasing function of
$q$ with the properties of a cumulative probability
distribution. In fact it can be seen (Franz et al.
1999) that this is related to the Parisi overlap
probability function describing the correlations among
ergodic components at equilibrium, in a generalization
of relation [14]. Figure 4 shows the result of a
numerical experiment in a three-dimensional spin glass,
where $X(q)$ is not piecewise constant.

Figure 4 Fluctuation-dissipation plot in a
three-dimensional spin glass at low temperature. As
predicted for type-II systems, the FDR is an increasing
function of the correlation, constant only in the
stationary part of the relaxation. Reproduced from
Marinari E et al. (1998) Violation of the fluctuation
dissipation theorem in finite dimensional spin glasses.
Journal of Physics A: Mathematical and General 31:
2611, with permission from Institute of Physics
Publishing Ltd.

The ideas presented in this article, fruits of the
mean-field theory of disordered systems, are objects of
intense debate in their application to the physics of
short-range systems. Many of the relations derived have
stimulated a lot of numerical, experimental, and
theoretical work. Some of the predictions of the theory
are very well verified in many short-range glassy
systems, at least on the accessible timescales.
Notably, the violations of the FDT and the possibility
of associating the values of the FDR with effective
temperatures are very well verified both in structural
glass models and in finite-range spin glasses. Since
finite aging times imply finite length scales over
which the dynamic variables can exhibit correlated
behavior, this indicates that the mean-field theory is
at least good at describing glassy phenomena on a local
scale. Whether the mean-field theory also gives a good
description in the infinite-time limit, where the
anomalous response would persist forever, is at present
an open theoretical problem. It relates to the
possibility of having mean-field-type equilibrium
ergodicity breaking, which is an open question and the
object of active research.


See also: Interacting Stochastic Particle Systems; 
Short-Range Spin Glasses: The Metastate Approach; 
Spin Glasses. 


Definitions
Graded Vector Spaces 


By a Z-graded vector space (or simply, graded vector
space) we mean a direct sum
$A = \bigoplus_{i \in \mathbb{Z}} A_i$ of vector spaces
over a field $k$ of characteristic zero. The $A_i$ are
called the components of $A$ of degree $i$, and the
degree of a homogeneous element $a \in A$ is denoted by
$|a|$. We also denote by $A[n]$ the graded vector space
with degree shifted by $n$, namely
$A[n] = \bigoplus_{i \in \mathbb{Z}} (A[n])_i$ with
$(A[n])_i = A_{i+n}$. The tensor product of two graded
vector spaces $A$ and $B$ is again a graded vector
space whose degree $r$ component is given by
$(A \otimes B)_r = \bigoplus_{p+q=r} A_p \otimes B_q$.
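
The degree bookkeeping in these definitions can be made concrete by recording only the dimension of each homogeneous component. The sketch below represents a finite-dimensional graded vector space as a dictionary from degree to dimension; this encoding and the function names are illustrative assumptions.

```python
from collections import defaultdict


def shift(dims, n):
    """Dimensions of A[n], using (A[n])_i = A_{i+n}."""
    # A component of degree j in A sits in degree j - n in A[n].
    return {j - n: d for j, d in dims.items()}


def tensor(dims_a, dims_b):
    """Dimensions of the tensor product: degree r collects all p + q = r."""
    out = defaultdict(int)
    for p, da in dims_a.items():
        for q, db in dims_b.items():
            out[p + q] += da * db
    return dict(out)


if __name__ == "__main__":
    A = {0: 2, 1: 3}      # dim A_0 = 2, dim A_1 = 3
    print(shift(A, 1))    # components now sit in degrees -1 and 0
    print(tensor(A, A))   # degree 1 collects A_0 (x) A_1 and A_1 (x) A_0
```

Shifting by $n$ lowers every degree by $n$, and the tensor product adds degrees while multiplying dimensions.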

The symmetric and exterior algebras of a graded vector
space $A$ are defined, respectively, as
$S(A) = T(A)/I_S$ and $\Lambda(A) = T(A)/I_\Lambda$,
where $T(A) = \bigoplus_{n \geq 0} A^{\otimes n}$ is
the tensor algebra of $A$ and $I_S$ (resp.
$I_\Lambda$) is the two-sided ideal generated by
elements of the form
$a \otimes b - (-1)^{|a||b|}\, b \otimes a$ (resp.
$a \otimes b + (-1)^{|a||b|}\, b \otimes a$), with $a$
and $b$ homogeneous elements of $A$. The images of
$A^{\otimes n}$ in $S(A)$ and $\Lambda(A)$ are denoted
by $S^n(A)$ and $\Lambda^n(A)$, respectively. Notice
that there is a canonical décalage isomorphism
$S^n(A[1]) \simeq \Lambda^n(A)[n]$.


Graded Algebras and Graded Lie Algebras


We say that $A$ is a graded algebra (of degree zero) if
$A$ is a graded vector space endowed with a degree zero
bilinear associative product
$\cdot : A \otimes A \to A$. A graded algebra is graded
commutative if the product satisfies the condition


$$a \cdot b = (-1)^{|a||b|}\, b \cdot a$$


for any two homogeneous elements $a, b \in A$ of degree
$|a|$ and $|b|$, respectively.

A graded Lie algebra of degree $n$ is a graded vector
space $A$ endowed with a graded Lie bracket on $A[n]$.
Such a bracket can be seen as a degree $-n$ Lie bracket
on $A$, that is, as a bilinear operation
$\{\cdot,\cdot\} : A \otimes A \to A[-n]$ satisfying
graded antisymmetry and graded Jacobi relations:


$$\{a, b\} = -(-1)^{(|a|+n)(|b|+n)} \{b, a\}$$


$$\{a, \{b, c\}\} = \{\{a, b\}, c\} + (-1)^{(|a|+n)(|b|+n)} \{b, \{a, c\}\}$$


Graded Poisson Algebras 


We can now define the main object of interest of 
this note: 


Definition 1 A graded Poisson algebra of degree $n$, or
n-Poisson algebra, is a triple $(A, \cdot, \{\,,\})$
consisting of a graded vector space
$A = \bigoplus_{i \in \mathbb{Z}} A_i$ endowed with a
degree zero graded commutative product and with a
degree $-n$ Lie bracket. The bracket is required to be
a biderivation of the product, namely:


$$\{a, b \cdot c\} = \{a, b\} \cdot c + (-1)^{|b|(|a|+n)}\, b \cdot \{a, c\}$$


Notation. Graded Poisson algebras of degree zero 
are called Poisson algebras, while for n=1 one 
speaks of Gerstenhaber (1963) algebras or of 
Schouten algebras. 

Sometimes a $\mathbb{Z}_2$-grading is used instead of a
Z-grading. In this case, one just speaks of even and
odd Poisson algebras.


Example 1 Any graded commutative algebra can 
be seen as a Poisson algebra with the trivial Lie 
structure, and any graded Lie algebra can be seen as 
a Poisson algebra with the trivial product. 


Example 2 The most classical example of a Poisson
algebra (already considered by Poisson himself) is the
algebra of smooth functions on $\mathbb{R}^{2n}$
endowed with usual multiplication and with the Poisson
bracket
$\{f, g\} = \partial_{q^i} f\, \partial_{p_i} g - \partial_{q^i} g\, \partial_{p_i} f$,
where the $p_i$'s and the $q^i$'s, for
$i = 1, \ldots, n$, are coordinates on
$\mathbb{R}^{2n}$. The bivector field
$\partial_{q^i} \wedge \partial_{p_i}$ is induced by
the symplectic form $\omega = dp_i \wedge dq^i$. An
immediate generalization of this example is the algebra
of smooth functions on a symplectic manifold
$(\mathbb{R}^{2n}, \omega)$ with the Poisson bracket
$\{f, g\} = \omega^{ij}\, \partial_i f\, \partial_j g$,
where $\omega^{ij}\, \partial_i \wedge \partial_j$ is
the bivector field defined by the inverse of the
symplectic form
$\omega = \omega_{ij}\, dx^i \wedge dx^j$; viz.
$\omega^{ij} \omega_{jk} = \delta^i_k$.

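A further generalization is when the bracket on
$C^\infty(\mathbb{R}^m)$ is defined by
$\{f, g\} = \alpha^{ij}\, \partial_i f\, \partial_j g$,
with the matrix function $\alpha$ not necessarily
nondegenerate. The bracket is Poisson if and only if
$\alpha$ is skew-symmetric and satisfies


$$\alpha^{il} \partial_l \alpha^{jk} + \alpha^{jl} \partial_l \alpha^{ki} + \alpha^{kl} \partial_l \alpha^{ij} = 0$$


An example of this, already considered by Lie (1894),
is $\alpha^{ij}(x) = f^{ij}_k x^k$, where the
$f^{ij}_k$ are the structure constants of some Lie
algebra.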
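
The Jacobi condition on a linear bivector can be verified directly in a small case. The sketch below assumes the condition in the form "$\alpha^{il} \partial_l \alpha^{jk}$ plus cyclic permutations of $(i,j,k)$ vanishes" and takes as structure constants those of so(3), that is, the totally antisymmetric symbol; the choice of Lie algebra and the function names are illustrative.

```python
def levi_civita(i, j, k):
    """Structure constants of so(3): the totally antisymmetric symbol."""
    return (i - j) * (j - k) * (k - i) // 2


def alpha(x, i, j):
    """Linear Poisson tensor alpha^{ij}(x) = sum_k eps_{ijk} x^k."""
    return sum(levi_civita(i, j, k) * x[k] for k in range(3))


def d_alpha(l, j, k):
    """Partial derivative of alpha^{jk} along x^l; constant for a linear tensor."""
    return levi_civita(j, k, l)


def jacobiator(x, i, j, k):
    """alpha^{il} d_l alpha^{jk} plus cyclic permutations of (i, j, k)."""
    return sum(
        alpha(x, i, l) * d_alpha(l, j, k)
        + alpha(x, j, l) * d_alpha(l, k, i)
        + alpha(x, k, l) * d_alpha(l, i, j)
        for l in range(3)
    )


if __name__ == "__main__":
    x = (2, -3, 5)
    print(all(jacobiator(x, i, j, k) == 0
              for i in range(3) for j in range(3) for k in range(3)))
```

With integer coordinates all quantities are exact integers, so the vanishing of the cyclic sum is checked without rounding; it is equivalent to the Jacobi identity of the underlying Lie algebra.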


Example 3 Example 2 can be generalized to any
symplectic manifold $(M, \omega)$. To every function
$h \in C^\infty(M)$ one associates the Hamiltonian
vector field $X_h$, which is the unique vector field
satisfying $i_{X_h} \omega = dh$. The Poisson bracket
of two functions $f$ and $g$ is then defined by


$$\{f, g\} = i_{X_f}\, i_{X_g}\, \omega$$


In local coordinates, the corresponding Poisson 
bivector field is related to the symplectic form as in 
Example 2. 


A generalization is the algebra of smooth functions on
a manifold $M$ with bracket
$\{f, g\} = \langle \alpha \,|\, df \wedge dg \rangle$,
where $\alpha$ is a bivector field (i.e., a section of
$\Lambda^2 TM$) such that
$\{\alpha, \alpha\}_{SN} = 0$, where
$\{\cdot,\cdot\}_{SN}$ is the Schouten-Nijenhuis
bracket (see the first subsection in the next section
for details, and Example 2 for the local coordinate
expression). Such a bivector field is called a Poisson
bivector field and the manifold $M$ is called a Poisson
manifold. Observe that a Poisson algebra structure on
the algebra of smooth functions on a smooth manifold is
necessarily defined this way. In the symplectic case,
the bivector field corresponding to the Poisson bracket
is the inverse of the symplectic form (regarded as a
bundle map $TM \to T^*M$).

The linear case described at the end of Example 2
corresponds to $M = \mathfrak{g}^*$, where
$\mathfrak{g}$ is a (finite-dimensional) Lie algebra.
The Lie bracket $\Lambda^2 \mathfrak{g} \to \mathfrak{g}$
is regarded as an element of
$\mathfrak{g} \otimes \Lambda^2 \mathfrak{g}^* \subset \Gamma(\Lambda^2 T\mathfrak{g}^*)$
and reinterpreted as a Poisson bivector field on
$\mathfrak{g}^*$. The Poisson algebra structure
restricted to polynomial functions is described at the
beginning of the next section.


Batalin-Vilkovisky Algebras 


When $n$ is odd, a generator for the bracket of an
n-Poisson algebra $A$ is a degree $-n$ linear map from
$A$ to itself,


$$\Delta : A \to A[-n]$$


such that

$$\Delta(a \cdot b) = \Delta(a) \cdot b + (-1)^{|a|}\, a \cdot \Delta(b) + (-1)^{|a|} \{a, b\}$$


A generator $\Delta$ is called exact if and only if it
satisfies the condition $\Delta^2 = 0$, and in this
case $\Delta$ becomes a derivation of the bracket:


$$\Delta(\{a, b\}) = \{\Delta(a), b\} + (-1)^{|a|+1} \{a, \Delta(b)\}$$


Remark 1 Notice that not every odd Poisson algebra $A$
admits a generator. For instance, a nontrivial odd Lie
algebra seen as an odd Poisson algebra with trivial
multiplication admits no generator. Moreover, even if a
generator $\Delta$ for an odd Poisson algebra exists,
it is far from being unique. In fact, all different
generators are obtained by adding to $\Delta$ a
derivation of $A$ of degree $-n$.


Definition 2 An n-Poisson algebra $A$ is called an
n-Batalin-Vilkovisky algebra, if it is endowed with an
exact generator.


Notation. When n=1 it is customary to speak 
of Batalin-Vilkovisky algebras, or simply BV alge- 
bras (see Batalin-Vilkovisky Quantization; see also 
Batalin and Vilkovisky (1963), Getzler (1994), and 
Koszul (1985)). 

There exists a characterization of
n-Batalin-Vilkovisky algebras in terms of only the
product and the generator (Getzler 1994, Koszul 1985).
Suppose in fact that a graded vector space $A$ is
endowed with a degree zero graded commutative product
and a linear map $\Delta : A \to A[-n]$ such that
$\Delta^2 = 0$, satisfying the following “seven-term”
relation:


$$\begin{aligned}
&\Delta(a \cdot b \cdot c) + \Delta(a) \cdot b \cdot c + (-1)^{|a|}\, a \cdot \Delta(b) \cdot c + (-1)^{|a|+|b|}\, a \cdot b \cdot \Delta(c) \\
&\quad = \Delta(a \cdot b) \cdot c + (-1)^{|a|}\, a \cdot \Delta(b \cdot c) + (-1)^{(|a|+1)|b|}\, b \cdot \Delta(a \cdot c)
\end{aligned}$$


In other words, $\Delta$ is a derivation of order 2.
Then, if we define the bilinear operation
$\{\,,\} : A \otimes A \to A[-n]$ by


$$\{a, b\} = (-1)^{|a|} \left( \Delta(a \cdot b) - \Delta(a) \cdot b - (-1)^{|a|}\, a \cdot \Delta(b) \right)$$


we have that the quadruple
$(A, \cdot, \{\,,\}, \Delta)$ is an
n-Batalin-Vilkovisky algebra. Conversely, one easily
checks that the product and the generator of an
n-Batalin-Vilkovisky algebra satisfy the above
“seven-term” relation.


Examples 
Schouten-Nijenhuis Bracket 


Suppose $\mathfrak{g}$ is a graded Lie algebra of
degree zero. Then $A = S(\mathfrak{g}[n])$ is a
$(-n)$-Poisson algebra with its natural multiplication
(the one induced from the tensor algebra $T(A)$) and a
degree $-n$ bracket defined as follows (Koszul 1985,
Krasil'shchik 1988): the bracket on
$S^1(\mathfrak{g}[n]) = \mathfrak{g}[n]$ is defined as
the suspension of the bracket on $\mathfrak{g}$, while
on $S^k(\mathfrak{g}[n])$, for $k > 1$, the bracket,
often called the Schouten-Nijenhuis bracket, is defined
inductively by forcing the Leibniz rule


$$\{a, b \cdot c\} = \{a, b\} \cdot c + (-1)^{|b|(|a|+n)}\, b \cdot \{a, c\}$$

Moreover, when $n$ is odd, there exists a generator
defined as

$$\Delta(a_1 \cdot a_2 \cdots a_k) = \sum_{i<j} (-1)^{\epsilon} \{a_i, a_j\} \cdot a_1 \cdots \hat{a}_i \cdots \hat{a}_j \cdots a_k$$

where $a_1, \ldots, a_k \in \mathfrak{g}$ and
$\epsilon = |a_i| + (|a_i| + 1)(|a_1| + \cdots + |a_{i-1}| + i - 1) + (|a_j| + 1)(|a_1| + \cdots + \widehat{|a_i|} + \cdots + |a_{j-1}| + j - 2)$.
An easy check shows that $\Delta^2 = 0$, thus
$S(\mathfrak{g}[n])$ is an n-Batalin-Vilkovisky algebra
for every odd $n \in \mathbb{N}$. For $n = -1$ the
$\Delta$-cohomology on $\Lambda\mathfrak{g}$ is the
usual Cartan-Chevalley-Eilenberg cohomology.

In particular, one can consider the Lie algebra
$\mathfrak{g} = \mathrm{Der}(B) = \bigoplus_{j \in \mathbb{Z}} \mathrm{Der}^j(B)$
of derivations of a graded commutative algebra $B$.
More explicitly, $\mathrm{Der}^j(B)$ consists of linear
maps $\phi : B \to B$ of degree $j$ such that
$\phi(ab) = \phi(a)\, b + (-1)^{j|a|}\, a\, \phi(b)$,
and the bracket is
$\{\phi, \psi\} = \phi \circ \psi - (-1)^{|\phi||\psi|}\, \psi \circ \phi$.
The space of multiderivations
$S(\mathrm{Der}(B)[-1])$, endowed with the
Schouten-Nijenhuis bracket, is a Gerstenhaber algebra.

We can further specialize to the case when $B$ is the
algebra $C^\infty(M)$ of smooth functions on a smooth
manifold $M$; then
$\mathfrak{X}(M) = \mathrm{Der}(C^\infty(M))$ is the
space of vector fields on $M$ and
$V(M) = S(\mathfrak{X}(M)[-1])$ is the space of
multivector fields on $M$. It is a classical result by
Koszul (1985) that there is a bijective correspondence
between generators for $V(M)$ and connections on the
highest exterior power $\Lambda^{\dim M} TM$ of the
tangent bundle of $M$. Moreover, flat connections
correspond to generators which square to zero.


Lie Algebroids 


A Lie algebroid $E$ over a smooth manifold $M$ is a
vector bundle $E$ over $M$ together with a Lie algebra
structure (over $\mathbb{R}$) on the space $\Gamma(E)$ of smooth
sections of $E$, and a bundle map $\rho: E \to TM$, called
the anchor, extended to a map between sections of
these bundles, such that

\[ \{X, fY\} = f\{X,Y\} + (\rho(X)f)\,Y \]

for any smooth sections $X$ and $Y$ of $E$ and any
smooth function $f$ on $M$. In particular, the anchor
map induces a morphism of Lie algebras
$\rho_*: \Gamma(E) \to X(M)$, namely $\rho_*(\{X,Y\}) = [\rho_*(X), \rho_*(Y)]$.

The link between Lie algebroids and Gerstenhaber 
algebras is given by the following Proposition 
(Kosmann-Schwarzbach and Monterde 2002, Xu 
1999): 


Proposition 1 Given a vector bundle $E$ over $M$,
there exists a one-to-one correspondence between
Gerstenhaber algebra structures on $A = \Gamma(\Lambda(E))$
and Lie algebroid structures on $E$.


The key to the proposition is that one can extend
the Lie algebroid bracket to a unique graded
antisymmetric bracket on $\Gamma(\Lambda(E))$ such that
$[X,f] = \rho(X)f$ for $X \in \Gamma(\Lambda^1(E))$ and $f \in \Gamma(\Lambda^0(E))$,
and that for $Q \in \Gamma(\Lambda^{q+1}(E))$, $[Q,\cdot]$ is a derivation
of $\Gamma(\Lambda(E))$ of degree $q$.


Example 4 A finite-dimensional Lie algebra $g$ can
be seen as a Lie algebroid over a trivial base
manifold. The corresponding Gerstenhaber algebra
is the one of the last subsection.


Example 5 The tangent bundle TM of a smooth 
manifold M is a Lie algebroid with anchor map 
given by the identity and algebroid Lie bracket given 
by the usual Lie bracket on vector fields. In this 
case, we recover the Gerstenhaber algebra of multi- 
vector fields on M described in the last subsection. 


Example 6 If $M$ is a Poisson manifold with Poisson
bivector field $\alpha$, then the cotangent bundle $T^*M$
inherits a natural Lie algebroid structure where the
anchor map $\alpha^\sharp: T^*_pM \to T_pM$ at the point $p \in M$ is
given by $\alpha^\sharp(\xi)(\eta) = \alpha(\xi,\eta)$, with $\xi,\eta \in T^*_pM$, and the
Lie bracket of the 1-forms $\omega_1$ and $\omega_2$ is given by

\[ \{\omega_1,\omega_2\} = L_{\alpha^\sharp(\omega_1)}\omega_2 - L_{\alpha^\sharp(\omega_2)}\omega_1 - d\,\alpha(\omega_1,\omega_2) \]

The associated Gerstenhaber algebra is the de Rham
algebra of differential forms endowed with the
bracket defined by Koszul (1985). As shown in
Kosmann-Schwarzbach (1995), $\Gamma(\Lambda(T^*M))$ is indeed
a BV algebra with an exact generator $\Delta = [d, i_\alpha]$ given
by the commutator of the contraction $i_\alpha$ with the
Poisson bivector $\alpha$ and the de Rham differential $d$.
Similar results hold if $M$ is a Jacobi manifold.


It is natural to ask what additional structure on a
Lie algebroid $E$ makes the Gerstenhaber algebra
$\Gamma(\Lambda(E))$ into a BV algebra. The answer is given by
the following result, which is proved in Xu (1999).


Proposition 2 Given a Lie algebroid $E$, there is a
one-to-one correspondence between generators for the
Gerstenhaber algebra $\Gamma(\Lambda(E))$ and $E$-connections on
$\Lambda^{\mathrm{rk}\,E}E$ (where $\mathrm{rk}\,E$ denotes the rank of the vector
bundle $E$). Exact generators correspond to flat
$E$-connections, and in particular, since flat $E$-connections
always exist, $\Gamma(\Lambda(E))$ is always a BV algebra.


Lie Algebroid Cohomology 


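A Lie algebroid structure on $E \to M$ defines a
differential $\delta$ on $\Gamma(\Lambda E^*)$ by

\[ \delta f := \rho^* df, \qquad f \in C^\infty(M) = \Gamma(\Lambda^0 E^*) \]

and

\[ \langle \delta\alpha, X\wedge Y\rangle := \langle\delta\langle\alpha,X\rangle, Y\rangle - \langle\delta\langle\alpha,Y\rangle, X\rangle - \langle\alpha,\{X,Y\}\rangle, \qquad X,Y \in \Gamma(E),\ \alpha \in \Gamma(E^*) \]

where $\rho^*: \Omega^1(M) \to \Gamma(E^*)$ is the transpose of
$\rho_*: \Gamma(E) \to X(M)$ and $\langle\,,\rangle$ is the canonical pairing of
sections of $E^*$ and $E$. On $\Gamma(\Lambda^n E^*)$, with $n \ge 2$, the
differential $\delta$ is defined by forcing the Leibniz rule.

In Example 4 we get the Cartan-Chevalley-Eilenberg
differential on $\Lambda g^*$; in Example 5 we
recover the de Rham differential on $\Omega^*(M) =
\Gamma(\Lambda T^*M)$, while in Example 6 the differential on
$V(M) = \Gamma(\Lambda TM)$ is $[\alpha,\cdot]_{SN}$.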
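
The case of Example 4 can be checked by hand or by machine. The following is a minimal sketch, assuming the Lie algebra so(3) with the cross product as bracket (a choice made here for illustration, not taken from the text). Since the base is a point, the anchor vanishes and only the bracket term of the differential survives; the function names are hypothetical. The formula used on 2-cochains is the standard Chevalley-Eilenberg one with trivial coefficients.

```python
# Chevalley-Eilenberg differential for so(3), viewed as a Lie algebroid over a point.
# Cochains are plain Python functions on vectors in R^3 (lists of numbers).

def bracket(x, y):
    """Lie bracket of so(3) in the basis e1, e2, e3: the cross product."""
    return [x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0]]

def covector(coeffs):
    """Linear functional alpha on g with <alpha, e_i> = coeffs[i]."""
    return lambda x: sum(c * xi for c, xi in zip(coeffs, x))

def delta1(alpha):
    """delta on 1-cochains: the anchor is zero, so only the bracket term remains."""
    return lambda x, y: -alpha(bracket(x, y))

def delta2(beta):
    """delta on antisymmetric 2-cochains (standard Chevalley-Eilenberg formula)."""
    return lambda x, y, z: (-beta(bracket(x, y), z)
                            + beta(bracket(x, z), y)
                            - beta(bracket(y, z), x))

BASIS = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

def delta_squared_vanishes(coeffs):
    """Check delta(delta(alpha)) = 0 on all triples of basis vectors."""
    dd = delta2(delta1(covector(coeffs)))
    return all(dd(x, y, z) == 0 for x in BASIS for y in BASIS for z in BASIS)
```

Calling `delta_squared_vanishes([1, 2, 3])` tests the identity on one covector; the vanishing is exactly the Jacobi identity of the bracket.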
Lie-Rinehart Algebras


The algebraic generalization of a Lie algebroid is a
Lie-Rinehart algebra. Recall that given a commutative
associative algebra $B$ (over some ring $R$) and a
$B$-module $g$, a Lie-Rinehart algebra structure
on $(B,g)$ is a Lie algebra structure (over $R$) on $g$ and
an action of $g$ on the left on $B$ by derivations,
satisfying the following compatibility conditions:

\[ [\gamma, a\sigma] = \gamma(a)\,\sigma + a\,[\gamma,\sigma] \]
\[ (a\gamma)(b) = a\,(\gamma(b)) \]

for every $a, b \in B$ and $\gamma,\sigma \in g$.

The Lie-Rinehart structures on the pair $(B,g)$
bijectively correspond to the Gerstenhaber algebra
structures on the exterior algebra $\Lambda_B(g)$ of $g$ in
the category of $B$-modules. When $g$ is of finite rank
over $B$, generators for these structures are in turn
in bijective correspondence with $(B,g)$-connections on
$\Lambda_B^{\mathrm{rk}_B g}\, g$, and flat connections correspond to exact
generators. For additional discussions, see Gerstenhaber
and Schack (1992) and Huebschmann (1998).

Lie algebroids are Lie-Rinehart algebras in the
smooth setting. Namely, if $E \to M$ is a Lie algebroid,
then the pair $(C^\infty(M),\Gamma(E))$ is a Lie-Rinehart
algebra (with action induced by the anchor and the
given Lie bracket).


Lie-Rinehart Cohomology 


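Lie algebroid cohomology may be generalized to
every Lie-Rinehart algebra $(B,g)$. Namely, on the
complex $\mathrm{Alt}_B(g,B)$ of alternating multilinear
functions on $g$ with values in $B$, one can define a
differential $\delta$ by the rules

\[ \langle\delta a, \gamma\rangle = \gamma(a), \qquad a \in B = \mathrm{Alt}^0_B(g,B),\ \gamma \in g \]
\[ \langle\delta a, \gamma\wedge\sigma\rangle = \langle\delta\langle a,\gamma\rangle,\sigma\rangle - \langle\delta\langle a,\sigma\rangle,\gamma\rangle - \langle a,[\gamma,\sigma]\rangle, \qquad \gamma,\sigma \in g,\ a \in \mathrm{Alt}^1_B(g,B) \]

and forcing the Leibniz rule on elements of
$\mathrm{Alt}^n_B(g,B)$, $n \ge 2$.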


Hochschild Cohomology 


Let $A$ be an associative algebra with product $\mu$, and
consider the Hochschild cochain complex
$\mathrm{Hoch}(A) = \prod_{n\ge 0} \mathrm{Hom}(A^{\otimes n}, A)[-n+1]$. There are
two basic operations between two elements
$f \in \mathrm{Hom}(A^{\otimes k}, A)[-k+1]$ and $g \in \mathrm{Hom}(A^{\otimes l}, A)[-l+1]$,
namely a degree zero product

\[ f\cup g\,(a_1\otimes\cdots\otimes a_{k+l}) = (-1)^{kl} f(a_1\otimes\cdots\otimes a_k)\cdot g(a_{k+1}\otimes\cdots\otimes a_{k+l}) \]

and a degree $-1$ bracket $\{f,g\} = f\circ g - (-1)^{(k-1)(l-1)}\, g\circ f$, where

\[ f\circ g\,(a_1\otimes\cdots\otimes a_{k+l-1}) = \sum_{i=0}^{k-1} (-1)^{i(l-1)} f(a_1\otimes\cdots\otimes a_i\otimes g(a_{i+1}\otimes\cdots\otimes a_{i+l})\otimes\cdots\otimes a_{k+l-1}) \]

It is well known from Gerstenhaber (1963) that the
cohomology $\mathrm{HHoch}(A)$ of the Hochschild complex
with respect to the differential $d_{\mathrm{Hoch}} = \{\mu,\cdot\}$ has the
structure of a Gerstenhaber algebra. More generally,
there is a Gerstenhaber algebra structure on Hochschild
cohomology of differential graded associative
algebras (Loday 1998).
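
The statement that $\{\mu,\cdot\}$ is a differential rests on $\{\mu,\mu\} = 0$, which is just associativity of $\mu$. The sketch below illustrates this on 2x2 integer matrices. It assumes Gerstenhaber's usual sign convention for the circle product (insertions counted from $i = 0$); the helper class `Mat2` and all function names are hypothetical and exist only for this illustration.

```python
# Gerstenhaber's circle product on Hochschild cochains given as Python functions.
# Assumed sign convention: the i-th insertion (i = 0, ..., k-1) carries (-1)^(i(l-1)).

class Mat2:
    """2x2 integer matrix with just enough arithmetic for the check below."""
    def __init__(self, a, b, c, d):
        self.e = (a, b, c, d)
    def __add__(self, o):
        return Mat2(*(x + y for x, y in zip(self.e, o.e)))
    def __sub__(self, o):
        return Mat2(*(x - y for x, y in zip(self.e, o.e)))
    def __mul__(self, o):
        a, b, c, d = self.e
        p, q, r, s = o.e
        return Mat2(a * p + b * r, a * q + b * s, c * p + d * r, c * q + d * s)
    def __rmul__(self, scalar):
        return Mat2(*(scalar * x for x in self.e))
    def __eq__(self, o):
        return self.e == o.e

def circle(f, k, g, l):
    """Return f o g, a cochain taking k + l - 1 arguments."""
    def fg(*a):
        assert len(a) == k + l - 1
        terms = [(-1) ** (i * (l - 1)) * f(*a[:i], g(*a[i:i + l]), *a[i + l:])
                 for i in range(k)]
        total = terms[0]
        for term in terms[1:]:
            total = total + term
        return total
    return fg

def gerstenhaber_bracket(f, k, g, l):
    """{f, g} = f o g - (-1)^((k-1)(l-1)) g o f."""
    fg, gf = circle(f, k, g, l), circle(g, l, f, k)
    sign = (-1) ** ((k - 1) * (l - 1))
    return lambda *a: fg(*a) - sign * gf(*a)

ZERO = Mat2(0, 0, 0, 0)
X, Y, Z = Mat2(1, 2, 3, 4), Mat2(0, 1, 1, 0), Mat2(2, 0, 1, 1)

def mu(a, b):
    """Associative product: matrix multiplication."""
    return a * b

def commutator(a, b):
    """A non-associative bilinear operation, for contrast."""
    return a * b - b * a
```

Here `gerstenhaber_bracket(mu, 2, mu, 2)(X, Y, Z)` evaluates $2[(XY)Z - X(YZ)]$ and so returns the zero matrix, whereas `circle(commutator, 2, commutator, 2)(X, Y, Z)` does not vanish, because the commutator is not associative.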


Graded Symplectic Manifolds 


The construction of Example 3 can be extended
to graded symplectic manifolds (see Supermanifolds;
see also Alexandrov et al. (1997), Getzler (1994),
and Schwarz (1993)). Recall that a symplectic
structure of degree $n$ on a graded manifold $N$ is a
closed nondegenerate 2-form $\omega$ such that $L_E\omega = n\omega$,
where $L_E$ is the Lie derivative with respect to the
Euler field of $N$ (see Roytenberg (2002) for details).
Let us denote by $X_h$ the vector field associated to the
function $h \in C^\infty(N)$ by the formula $i_{X_h}\omega = dh$. Then
the bracket

\[ \{f,g\} = i_{X_f} i_{X_g}\omega \]

gives $C^\infty(N)$ the structure of a graded Poisson
algebra of degree $n$.

If the symplectic form has odd degree and the
graded manifold has a volume form, then it is
possible to construct an exact generator defined by

\[ \Delta(f) = \tfrac{1}{2}\,\mathrm{div}(X_f) \]

where div is the divergence operator associated to
the given volume form (Getzler 1994, Kosmann-Schwarzbach
and Monterde 2002).

An explicit characterization of graded symplectic
manifolds has been given in Roytenberg (2002). In
particular, it is proved there that every symplectic
form of degree $n$ with $n \ge 1$ is necessarily exact.
More precisely, one has $\omega = d(i_E\omega/n)$.


Shifted Cotangent Bundles 


The main examples of graded symplectic manifolds
are given by shifted cotangent bundles. If $N$ is a
graded manifold then the shifted cotangent bundle
$T^*[n]N$ is the graded manifold obtained by shifting
by $n$ the degrees of the fibers of the cotangent
bundle of $N$. This graded manifold possesses a
nondegenerate closed 2-form of degree $n$, which can
be expressed in local coordinates as

\[ \omega = \sum_i dx^i \wedge dx_i^\dagger \]

where $\{x^i\}$ are local coordinates on $N$ and $\{x_i^\dagger\}$ are
coordinate functions on the fibers of $T^*[n]N$. In
local coordinates, the bracket between two homogeneous
functions $f$ and $g$ is given by


tur OF Og 


{fg} = 一 (一 1) ax! Ox! 


— (A) f+ e+) dle Og of 
Ax! Ox! 


If in addition the graded manifold $N$ is orientable,
then $T^*[n]N$ has a volume form too; when $n$ is odd,
the exact generator $\Delta(f) = (1/2)\,\mathrm{div}\,X_f$ is written in
local coordinates as


\[ \Delta = \sum_i \frac{\partial}{\partial x^i}\frac{\partial}{\partial x_i^\dagger} \]


In the case $n = 1$, we have a natural identification
between functions on $T^*[1]N$ and multivector fields
$V(N)$ on $N$, and we recover again the Gerstenhaber
algebra of the subsection "Schouten-Nijenhuis
Bracket." Moreover, it is easy to see that, under
the above identification, $\Delta$ applied to a vector field
of $N$ is the usual divergence operator.


Examples from Algebraic Topology 


For any $n > 1$, the homology of the n-fold loop
space $\Omega^n(M)$ of a topological space $M$ has the
structure of an $(n-1)$-Poisson algebra (May 1972).
In particular, the homology of the double loop space
$\Omega^2(M)$ is a Gerstenhaber algebra, and has an exact
generator defined using the natural circle action on
this space (Getzler 1994). The homology of the free
loop space $\mathcal{L}(M)$ of a closed oriented manifold $M$ is
also a BV algebra when endowed with the "Chas-Sullivan
intersection product" and with a generator
defined again using the natural circle action on the
free loop space (Cohen and Jones 2002).


Applications 
BRST Quantization in the Hamiltonian Formalism 


The BRST procedure is a method for quantizing
classical mechanical systems or classical field theories
in the presence of symmetries (see BRST
Quantization). The starting point is a symplectic
manifold $M$ (the "phase space"), a function $H$ (the
"Hamiltonian" of the system) governing the evolution
of the system, and the "constraints" given by
several functions $g_i$ which commute with $H$ and
among each other up to a $C^\infty(M)$-linear combination
of the $g_i$'s.

Then the dynamics is constrained to the locus $V$ of
common zeros of the $g_i$'s. When $V$ is a submanifold,
the $g_i$'s are a set of generators for the ideal $I$ of
functions vanishing on $V$. Observe that $I$ is closed
under the Poisson bracket. Functions in $I$ are called
"first-class constraints." The Hamiltonian vector fields
of first-class constraints, which are by construction
tangential to $V$, are the "symmetries" of the system.

When $V$ is smooth, it is a coisotropic
submanifold of $M$ and the Hamiltonian vector fields
determined by the constraints give a foliation $\mathcal{F}$ of
$V$. In the nicest case $V$ is a principal bundle with $\mathcal{F}$
its vertical foliation, and the algebra of functions
$C^\infty(V/\mathcal{F})$ on the "reduced phase space" $V/\mathcal{F}$ (see Poisson
Reduction, and Symmetry and Symplectic Reduction)
is identified with the $I$-invariant subalgebra of
$C^\infty(M)/I$.

From a physical point of view, the points of $V/\mathcal{F}$ are
the interesting states at a classical level, and a
quantization of this system means a quantization of
$C^\infty(V/\mathcal{F})$. The BRST procedure gives a method of
quantizing $C^\infty(V/\mathcal{F})$ starting from the (known)
quantization of $C^\infty(M)$. Notice that these notions
immediately generalize to graded symplectic manifolds.

From an algebraic point of view, one starts with a
graded Poisson algebra $P$ and a multiplicative ideal $I$
which is closed under the Poisson bracket. The
algebra of functions on the "reduced phase space" is
replaced by $(P/I)^I$, the $I$-invariant subalgebra of $P/I$.
This subalgebra inherits a Poisson bracket even if
$P/I$ does not. Moreover, the pair $(B,g) = (P/I, I/I^2)$
inherits a graded Lie-Rinehart structure. The "Rinehart
complex" $\mathrm{Alt}_{P/I}(I/I^2, P/I)$ of alternating multilinear
functions on $I/I^2$ with values in $P/I$, endowed
with the differential described in the subsection
"Lie-Rinehart Cohomology," plays the role of the de
Rham complex of vertical forms on $V$ with respect
to the foliation $\mathcal{F}$ determined by the constraints.

In case $V$ is a smooth submanifold, we also have
the following geometric interpretation: let $N^*V$
denote the conormal bundle of $V$ (i.e., the annihilator
of $TV$ in $T^*_V P$). This is a Lie subalgebroid of
$T^*P$ if and only if $V$ is coisotropic. Since we may
identify $I/I^2$ with sections of $N^*V$ (by the de Rham
differential), $(P/I, I/I^2)$ is the corresponding
Lie-Rinehart pair. The Rinehart complex is then the
corresponding Lie algebroid complex $\Gamma(\Lambda(N^*V)^*)$
with differential described in the subsection "Lie
Algebroid Cohomology." The image of the anchor
map $N^*V \to TV$ is the distribution determining $\mathcal{F}$,
so by duality we get an injective chain map from the
vertical de Rham complex to the Rinehart complex.
The main point of the BRST procedure is to define
a chain complex $C^\bullet = \Lambda(\Psi^* \oplus \Psi)\otimes P$, where $\Psi$ is a
graded vector space, with a coboundary operator $D$
(the "BRST operator"), and a quasi-isomorphism
(i.e., a chain map that induces an isomorphism in
cohomology)

\[ \pi: (C^\bullet, D) \to (\mathrm{Alt}_{P/I}(I/I^2, P/I), \delta) \]

This means in particular that the zeroth cohomology
$H^0_D(C)$ gives the algebra $(P/I)^I$ of functions on the
"reduced phase space." Observe that there is a
natural symmetric inner product on $\Psi^* \oplus \Psi$ given by
the evaluation of $\Psi^*$ on $\Psi$. This inner product, as an
element of $S^2(\Psi \oplus \Psi^*) \simeq S^2(\Psi) \oplus (\Psi\otimes\Psi^*) \oplus S^2(\Psi^*)$,
is concentrated in the component $\Psi\otimes\Psi^*$, and so
it defines an element in $\Lambda^2(\Psi[1] \oplus \Psi^*[-1]) \simeq
S^2(\Psi)[2] \oplus (\Psi\otimes\Psi^*) \oplus S^2(\Psi^*)[-2]$, that is, a degree
zero bivector field on $\Psi[1] \oplus \Psi^*[-1]$. It is easy to see
that this bivector field induces a degree zero Poisson
structure on $S(\Psi^*[-1] \oplus \Psi[1])$. From another viewpoint
this is the Poisson structure corresponding to
the canonical symplectic structure on $T^*\Psi[1]$.
Finally, we have that $S(\Psi^*[-1] \oplus \Psi[1]) \otimes P$ is a
degree zero Poisson algebra. Note that the superalgebra
underlying the graded algebra $S(\Psi^*[-1] \oplus
\Psi[1]) \otimes P$ is canonically isomorphic to the complex
$C = \Lambda(\Psi^* \oplus \Psi)\otimes P$. When $P = C^\infty(M)$, we can
think of $S(\Psi^*[-1] \oplus \Psi[1]) \otimes C^\infty(M)$ as the algebra
of functions on the graded symplectic manifold
$N = (\Psi[1] \oplus \Psi^*[-1]) \times M$ (the "extended phase
space"). In physical language, coordinate functions
on $\Psi[1]$ are called "ghost fields" while coordinate
functions on $\Psi^*[-1]$ are called "ghost momenta" or,
by some authors, "antighost fields" (not to be
confused with the antighosts of the Lagrangian
functional-integral approach to quantization).

Suppose now that there exists an element
$O \in S(\Psi^*[-1] \oplus \Psi[1]) \otimes P$ such that $\{O,\cdot\} = D$, that
one can extend the "known" quantization of $P$ to a
quantization of $S(\Psi^*[-1] \oplus \Psi[1]) \otimes P$ as operators
on some (graded) Hilbert space $\mathcal{T}$, and that the
operator $\hat{O}$ which quantizes $O$ has square zero.
Then one can consider the "true space of physical
states" $H^0_{\hat{O}}(\mathcal{T})$ on which the $\mathrm{ad}_{\hat{O}}$-cohomology of
operators will act. This provides one with a
quantization of $(P/I)^I$.

For further details on this procedure, and in
particular for the construction of $D$, we refer to
Henneaux and Teitelboim (1992), Kostant and
Sternberg (1987), and Stasheff (1997), and references
therein. Observe that some authors refer to this
method as BVF (Batalin-Vilkovisky-Fradkin) and
reserve the name BRST for the case when the $g_i$'s are
the components of an equivariant moment map.
For a generalization to graded manifolds different
from $(\Psi[1] \oplus \Psi^*[-1]) \times M$ we refer to Roytenberg
(2002). There it is proved that the element $O$ exists if the
graded symplectic form has degree different from $-1$.


BV Quantization in the Lagrangian Formalism 


The BV formalism (see Batalin-Vilkovisky Quantization;
see also Batalin and Vilkovisky (1983) and
Henneaux and Teitelboim (1992)) is a procedure for
the quantization of physical systems with symmetries
in the Lagrangian formalism. As a first step, the
"configuration space" $M$ of the system is augmented
by the introduction of "ghosts." If $G$ is the group of
symmetries, with Lie algebra $g$, this means that one has to
consider the graded manifold $W = g[1] \times M$. The second step is to
double this space by introducing "antifields for fields
and ghosts," namely one has to consider the "extended
configuration space" $T^*[-1]W$, whose space of functions
is a BV algebra (see the subsection "Shifted
Cotangent Bundles"). The algebra of "observables" is by
definition the cohomology $H_\Delta(C^\infty(T^*[-1]W))$ with
respect to the exact generator $\Delta$.


Related Topics 
AKSZ 


The graded manifold $T^*[-1]W$ considered above is a
particular example of a QP-manifold, that is, of a
graded manifold $M$ endowed with an integrable (i.e.,
self-commuting) vector field $Q$ of degree 1 and a graded
$Q$-invariant symplectic structure $P$. In quantization of
classical mechanical theories, the graded symplectic
manifold of interest is the space of fields and antifields
with symplectic form of degree 1, while $Q$ is the
Hamiltonian vector field defined by the action functional
$S$; the integrability of $Q$ is equivalent to the
classical master equation $\{S,S\} = 0$ for the action
functional. Quantization of the theory is then reduced
to the computation of the functional integral
$\int_{\mathcal{L}} \exp(iS/\hbar)$, where $\mathcal{L}$ is a Lagrangian submanifold
of $M$. This functional integral actually depends only on
the homology class of the Lagrangian. Locally, a QP
manifold is a shifted cotangent bundle $T^*[-1]N$ and
a Lagrangian submanifold is the graph of an exact
1-form. In the notations of the subsection "Shifted
Cotangent Bundles," a Lagrangian submanifold $\mathcal{L}$ is
therefore locally defined by equations $x_i^\dagger = \partial\Phi/\partial x^i$, and
the function $\Phi$ is called a gauge-fixing fermion. The
action functional of interest is then the gauge-fixed
action $S|_{\mathcal{L}} = S(x^i, \partial\Phi/\partial x^i)$.

The language of QP manifolds has powerful
applications to sigma models (see Topological Sigma
Models): if $\Sigma$ is a finite-dimensional graded manifold
equipped with a volume element, and $M$ is a QP
manifold, then the graded manifold $C^\infty(\Sigma,M)$ of
smooth maps from $\Sigma$ to $M$ has a natural structure of
QP manifold which describes some field theory if one
arranges for the symplectic structure to be of degree
1. As an illustrative example, if $\Sigma = T[1]X$, for a
compact oriented three-dimensional smooth manifold
$X$, and $M = g[1]$, where $g$ is the Lie algebra of a
compact Lie group, the QP manifold $C^\infty(\Sigma,M)$ is
relevant to Chern-Simons theory on $X$. Similarly, if
$\Sigma = T[1]X$, for a compact oriented two-dimensional
smooth manifold $X$, and $M = T[1]N$ is the shifted
tangent bundle of a symplectic manifold, then the QP
structure on $C^\infty(\Sigma,M)$ is related to the A-model with
target $N$; if the symplectic manifold $N$ is of the form
$N = T^*K$ for a complex manifold $K$, then one can
endow $C^\infty(\Sigma,M)$ with a complex QP manifold
structure, which is related to the B-model with target
$K$; this shows that, in some sense, the B-model can be
obtained from the A-model by "analytic continuation"
(Alexandrov et al. 1997). If $\Sigma = T[1]X$, for a
compact oriented two-dimensional smooth manifold
$X$, and $M = T^*[1]N$ with canonical symplectic structure,
then the QP structure on $C^\infty(\Sigma,M)$ is related to
the Poisson sigma model (QP structures on $T^*[1]N$
with canonical symplectic structure are in one-to-one
correspondence with Poisson structures on $N$). The
study of QP manifolds is sometimes referred to as
"the AKSZ formalism". In Roytenberg (2002) QP
manifolds with symplectic structure of degree 2 are
studied and shown to be in one-to-one correspondence
with Courant algebroids.


Graded Poisson Algebras from Cohomology of $P_\infty$


The Poisson bracket on a Poisson manifold can be
derived from the Poisson bivector field $\alpha$ using the
Schouten-Nijenhuis bracket as follows:

\[ \{f,g\} = [[\alpha,f]_{SN}, g]_{SN} \]


This may be generalized to the case of a graded
manifold $M$ endowed with a multivector field $\alpha$ of
total degree 2 (i.e., $\alpha = \sum_{i=0}^{\infty}\alpha_i$, where $\alpha_i$ is an
$i$-vector field of degree $2-i$) satisfying the equation
$[\alpha,\alpha]_{SN} = 0$. One then has the derived multibrackets

\[ \lambda_i: A^{\otimes i} \to A, \qquad \lambda_i(a_1,\dots,a_i) = [[\cdots[[\alpha_i,a_1]_{SN},a_2]_{SN}\cdots]_{SN},a_i]_{SN} \]

with $A = C^\infty(M)$. Observe that $\lambda_i$ is a multiderivation
of degree $2-i$. The operations $\lambda_i$ define the structure
of an $L_\infty$-algebra on $A$. Such a structure is called a
$P_\infty$-algebra (P for Poisson) since the $\lambda_i$'s are
multiderivations. If $\lambda_0 = \alpha_0$ vanishes, then $\lambda_1$ is a
differential, and the $\lambda_1$-cohomology inherits a graded
Poisson algebra structure. This structure can be used
to describe the deformation quantization of coisotropic
submanifolds and to describe their deformation theory.
Introduction


Einstein's theory of general relativity states that
gravity attracts light. The deflection angle of a light
ray by an object with mass $m$ was predicted to be

\[ \tilde\alpha = \frac{4Gm}{c^2 r} \qquad [1] \]

where $c$ and $G$ are the velocity of light and the
gravitational constant, respectively, and $r$ is the
impact parameter. The quantitative measurement of
this light deflection at the solar limb during the solar
eclipse in 1919 with

\[ \tilde\alpha_\odot = \frac{4GM_\odot}{c^2 R_\odot} \approx 1.74\ \mathrm{arcsec} \qquad [2] \]

(here $m$ is replaced by the solar mass $M_\odot$ and the
impact parameter is the solar radius $R_\odot$) confirmed
Einstein's theory.

In the decades following this measurement,
various aspects of the gravitational lens effect were
explored theoretically, which include (1) the possibility
of multiple or ring-like images of background
sources, (2) the use of lensing as a gravitational
telescope on very faint and distant objects, and
(3) the possibility of determining Hubble's constant
with lensing. Only relatively recently, after the
discovery of the first doubly imaged quasar in 1979,
did gravitational lensing become an observational
science. Today gravitational lensing is a booming
part of astrophysics.
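
The solar-limb value quoted in the Introduction can be reproduced directly from the deflection formula $4Gm/(c^2 r)$. The sketch below uses rounded present-day values for the constants (an assumption of this example, not figures taken from the text), so the result agrees with the quoted 1.74 arcsec only to within about one per cent.

```python
import math

G = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8            # speed of light, m s^-1
M_SUN = 1.989e30       # solar mass, kg
R_SUN = 6.957e8        # solar radius, m
RAD_TO_ARCSEC = 180.0 / math.pi * 3600.0

def deflection_angle(mass_kg, impact_parameter_m):
    """Deflection angle 4 G m / (c^2 r), in radians."""
    return 4.0 * G * mass_kg / (C ** 2 * impact_parameter_m)

solar_limb_arcsec = deflection_angle(M_SUN, R_SUN) * RAD_TO_ARCSEC
print(f"deflection at the solar limb: {solar_limb_arcsec:.2f} arcsec")
```

Because the angle scales as $1/r$, a ray passing at two solar radii is deflected by half as much.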

Lensing has established itself as a very useful
astrophysical tool with some remarkable successes:
with the discovery of multiply imaged quasars, giant
luminous arcs, Einstein rings, and quasar and galactic
microlensing, significant new results have been obtained
in areas as different as cosmology, the physics of quasars,
and galaxy structure. In this article,
only the aspects of "strong lensing" can be treated.
More detailed studies on strong and weak lensing
can be found in the "Further reading" section.


Basics of Gravitational Lensing 


The path, the size, and the cross section of a light
bundle propagating through spacetime in principle
are affected by all the matter between the light
source and the observer. For most practical purposes,
we can assume that the lensing action is
dominated by a single matter inhomogeneity at
some location between source and observer. This is
usually called the "thin-lens approximation": all the
action of deflection is thought to take place at a
single distance. This approach is valid only if the
relative velocities of lens, source, and observer are
small compared to the velocity of light ($v \ll c$) and
if the Newtonian potential is small ($|\Phi| \ll c^2$). These
two assumptions are justified in all astronomical
cases of interest. The size of a galaxy, for example,
is of order 50 kpc; even a cluster of galaxies is not
much larger than 1 Mpc. This "lens thickness" is
small compared to the typical distances of the order
of a few Gpc between observer and lens or lens and
background quasar/galaxy, respectively. We assume
that the underlying spacetime is well described by a
perturbed Friedmann-Robertson-Walker metric:

\[ ds^2 = \left(1 + \frac{2\Phi}{c^2}\right) c^2\,dt^2 - a^2(t)\left(1 - \frac{2\Phi}{c^2}\right) d\sigma^2 \qquad [3] \]

A detailed description of optics in curved spacetimes
and a derivation of the lens equation from Einstein's
field equations can be found in Schneider et al.
(1992, chapters 3 and 4).


Lens Equation 


The basic setup for such a simplified gravitational
lens scenario involving a point source and a point
lens is displayed in Figure 1. The three ingredients in
such a lensing situation are the source S, the lens L,
and the observer O. Light rays emitted from the
source are deflected by the lens. For a point-like
lens, there will always be (at least) two images $S_1$
and $S_2$ of the source. With external shear (due to
the tidal field of objects outside but near the light
bundles) there can be more images. The observer
sees the images in directions corresponding to the
tangents to the real incoming light paths.

Figure 1 The relation between the various angles and
distances involved in the lensing setup can be derived for the
case $\tilde\alpha \ll 1$ and formulated in the lens equation [6].

In Figure 1, the corresponding angles and angular
diameter distances $D_L$, $D_S$, $D_{LS}$ are indicated. (In
cosmology, the various methods to define distance
diverge. The relevant distances for gravitational
lensing are the angular diameter distances.) In the
thin-lens approximation, the hyperbolic paths are
approximated by their asymptotes. In the circular-symmetric
case, the deflection angle is given as

\[ \tilde\alpha(\xi) = \frac{4GM(\xi)}{c^2}\,\frac{1}{\xi} \qquad [4] \]

where $M(\xi)$ is the mass inside a radius $\xi$. In this
depiction, the origin is chosen at the observer. From
the diagram, it can be seen that the following
relation holds:

\[ \theta D_S = \beta D_S + \tilde\alpha D_{LS} \qquad [5] \]

(for $\theta, \beta, \tilde\alpha \ll 1$; this condition is fulfilled in practically
all astrophysically relevant situations). With
the definition of the reduced deflection angle as
$\alpha(\theta) = (D_{LS}/D_S)\,\tilde\alpha(\theta)$, this can be expressed as

\[ \beta = \theta - \alpha(\theta) \qquad [6] \]

This relation between the positions of images and
source can easily be derived for a nonsymmetric
mass distribution as well. In that case, all angles are
vector valued. The two-dimensional lens equation
then reads

\[ \vec\beta = \vec\theta - \vec\alpha(\vec\theta) \qquad [7] \]


Einstein Radius 


For a point lens of mass $M$, the deflection angle is
given by eqn [4]. Plugging this deflection angle into
eqn [6] and using the relation $\xi = D_L\theta$ (cf. Figure 1),
one obtains

\[ \beta(\theta) = \theta - \frac{D_{LS}}{D_L D_S}\,\frac{4GM}{c^2\,\theta} \qquad [8] \]

For the special case in which the source lies exactly
behind the lens ($\beta = 0$), due to the symmetry, a ring-like
image occurs whose angular radius is called the
Einstein radius $\theta_E$:

\[ \theta_E = \sqrt{\frac{4GM}{c^2}\,\frac{D_{LS}}{D_L D_S}} \qquad [9] \]

The Einstein radius defines the angular scale for a
lens situation. For a massive galaxy with a mass of
$M = 10^{12} M_\odot$ at a redshift of $z_L = 0.5$ and a source at
redshift $z_S = 2.0$ (we used here $H = 50$ km s$^{-1}$ Mpc$^{-1}$
as the value of the Hubble constant and an Einstein-de
Sitter universe), the Einstein radius is

\[ \theta_E \approx 1.8\,\sqrt{\frac{M}{10^{12} M_\odot}}\ \mathrm{arcsec} \qquad [10] \]

(note that for cosmological distances, in general,
$D_{LS} \ne D_S - D_L$!). For a galactic microlensing scenario
in which stars in the disk of the Milky Way act
as lenses for stars close to its center, the scale
defined by the Einstein radius is

\[ \theta_E \approx 0.5\,\sqrt{\frac{M}{M_\odot}}\ \mathrm{milliarcsec} \qquad [11] \]

An application and some illustrations of the point 
lens case can be found in the section on galactic 
microlensing. 
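
The galaxy-scale figure in eqn [10] can be checked numerically. The sketch below assumes the closed-form angular diameter distance of an Einstein-de Sitter universe, $D(z_1,z_2) = \frac{2c}{H}\frac{1}{1+z_2}\left[(1+z_1)^{-1/2} - (1+z_2)^{-1/2}\right]$, which is a standard result but is not written out in the text; the function names and rounded constants are choices made for this example.

```python
import math

C_KM_S = 299792.458            # speed of light, km/s
GM_SUN_OVER_C2_M = 1476.6      # G M_sun / c^2, metres
MPC_M = 3.0857e22              # one megaparsec, metres
RAD_TO_ARCSEC = 180.0 / math.pi * 3600.0

def eds_angular_distance(z1, z2, h0=50.0):
    """Angular diameter distance (Mpc) from z1 to z2 > z1, Einstein-de Sitter."""
    hubble_distance = C_KM_S / h0   # c / H in Mpc
    return (2.0 * hubble_distance / (1.0 + z2)
            * ((1.0 + z1) ** -0.5 - (1.0 + z2) ** -0.5))

def einstein_radius_arcsec(mass_msun, z_lens, z_source, h0=50.0):
    """Einstein radius of eqn [9] for a point mass, in arcseconds."""
    d_l = eds_angular_distance(0.0, z_lens, h0)
    d_s = eds_angular_distance(0.0, z_source, h0)
    d_ls = eds_angular_distance(z_lens, z_source, h0)
    four_gm_over_c2_mpc = 4.0 * GM_SUN_OVER_C2_M * mass_msun / MPC_M
    return math.sqrt(four_gm_over_c2_mpc * d_ls / (d_l * d_s)) * RAD_TO_ARCSEC

theta_e = einstein_radius_arcsec(1e12, 0.5, 2.0)
print(f"theta_E = {theta_e:.2f} arcsec")
```

The same functions show the remark about distances: `eds_angular_distance(0.5, 2.0)` is much larger than the difference of the two distances measured from the observer.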


Critical Surface Mass Density 


In the more general case of a three-dimensional mass
distribution of an extended lens, the density $\rho(\vec r)$ can
be projected along the line of sight onto the lens
plane to obtain the two-dimensional surface mass
density distribution $\Sigma(\vec\xi)$ as

\[ \Sigma(\vec\xi) = \int_0^{D_S} \rho(\vec r)\,dz \qquad [12] \]

Here $\vec r$ is a three-dimensional vector in space, and $\vec\xi$
is a two-dimensional vector in the lens plane. The
two-dimensional deflection angle $\vec{\tilde\alpha}$ is then given as
the sum over all mass elements in the lens plane:

\[ \vec{\tilde\alpha}(\vec\xi) = \frac{4G}{c^2}\int \frac{(\vec\xi - \vec\xi')\,\Sigma(\vec\xi')}{|\vec\xi - \vec\xi'|^2}\,d^2\xi' \qquad [13] \]

For a finite circle with constant surface mass density
$\Sigma$, the deflection angle can be written as

\[ \alpha(\xi) = \frac{D_{LS}}{D_S}\,\frac{4G}{c^2}\,\frac{\Sigma\pi\xi^2}{\xi} \qquad [14] \]

With $\xi = D_L\theta$ this simplifies to

\[ \alpha(\theta) = \frac{4\pi G\Sigma}{c^2}\,\frac{D_L D_{LS}}{D_S}\,\theta \qquad [15] \]

With the definition of the critical surface mass
density $\Sigma_{\mathrm{crit}}$ as

\[ \Sigma_{\mathrm{crit}} = \frac{c^2}{4\pi G}\,\frac{D_S}{D_L D_{LS}} \qquad [16] \]

the deflection angle for such a mass distribution
can be expressed as

\[ \alpha(\theta) = \frac{\Sigma}{\Sigma_{\mathrm{crit}}}\,\theta \qquad [17] \]

The critical surface mass density can be visualized as
the lens mass $M$ "smeared out" over the area of the
Einstein ring: $\Sigma_{\mathrm{crit}} = M/(R_E^2\pi)$, where $R_E = \theta_E D_L$.
The value of the critical surface mass density is
roughly $\Sigma_{\mathrm{crit}} \approx 0.8$ g cm$^{-2}$ for lens and source redshifts
of $z_L = 0.5$ and $z_S = 2.0$, respectively. For an
arbitrary mass distribution, the condition $\Sigma > \Sigma_{\mathrm{crit}}$
at any point is sufficient to produce multiple images.
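
The "smeared out" picture of the critical density is an exact identity between the definition of the critical surface mass density and the Einstein radius, and it can be confirmed numerically. The sketch below works in SI units; the three distances are illustrative round numbers chosen for this example (they are not values given in the text), and the function names are hypothetical.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m s^-1
MPC = 3.0857e22    # one megaparsec, metres
M_SUN = 1.989e30   # solar mass, kg

def sigma_crit(d_l, d_s, d_ls):
    """Critical surface mass density c^2 D_S / (4 pi G D_L D_LS), kg m^-2."""
    return C ** 2 * d_s / (4.0 * math.pi * G * d_l * d_ls)

def einstein_radius(mass, d_l, d_s, d_ls):
    """Einstein radius sqrt(4GM/c^2 * D_LS/(D_L D_S)), radians."""
    return math.sqrt(4.0 * G * mass / C ** 2 * d_ls / (d_l * d_s))

def reduced_deflection_uniform_disc(theta, sigma, d_l, d_s, d_ls):
    """Reduced deflection alpha(theta) = (Sigma / Sigma_crit) * theta."""
    return sigma / sigma_crit(d_l, d_s, d_ls) * theta

# Illustrative (assumed) angular diameter distances and lens mass.
D_L, D_S, D_LS = 1467.0 * MPC, 1689.0 * MPC, 956.0 * MPC
MASS = 1e12 * M_SUN

r_e = einstein_radius(MASS, D_L, D_S, D_LS) * D_L      # Einstein ring radius, metres
smeared_out = MASS / (math.pi * r_e ** 2)               # M / (pi R_E^2)
print(sigma_crit(D_L, D_S, D_LS), smeared_out)
```

A disc with exactly the critical density gives $\alpha(\theta) = \theta$, so every ray is mapped to $\beta = 0$: such a lens focuses perfectly.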


Image Positions and Magnifications 


The lens equation [6] can be reformulated in the
case of a single point lens:

\[ \beta = \theta - \frac{\theta_E^2}{\theta} \qquad [18] \]

Solving this for the image positions $\theta$, one finds that
an isolated point lens always produces two
images of a background source. The positions of
the images are given by the two solutions:

\[ \theta_{1,2} = \frac{1}{2}\left(\beta \pm \sqrt{\beta^2 + 4\theta_E^2}\right) \qquad [19] \]


The magnification of an image is defined by the
ratio between the solid angles of the image and the
source, since the surface brightness is conserved.
Hence, the magnification $\mu$ is given as

\[ \mu = \frac{\theta}{\beta}\,\frac{d\theta}{d\beta} \qquad [20] \]

In the symmetric case above, the image magnification
can be written as (by using the lens equation)

\[ \mu_{1,2} = \left(1 - \left[\frac{\theta_E}{\theta_{1,2}}\right]^4\right)^{-1} = \frac{1}{2} \pm \frac{u^2+2}{2u\sqrt{u^2+4}} \qquad [21] \]

Here we defined $u$ as the "impact parameter," the
angular separation between lens and source in units
of the Einstein radius: $u = \beta/\theta_E$. The magnification
of one image (the one inside the Einstein radius) is
negative. This means it has negative parity: it is
mirror-inverted. For $\beta \to 0$ the magnification
diverges. In the limit of geometrical optics, the
Einstein ring of a point source has infinite magnification!
(Due to the fact that physical objects have a
finite size, and also because at some limit wave
optics has to be applied, in reality the magnification
stays finite.) The sum of the absolute values of the
two image magnifications is the measurable total
magnification $\mu$:

\[ \mu = |\mu_1| + |\mu_2| = \frac{u^2+2}{u\sqrt{u^2+4}} \qquad [22] \]

Note that this value is (always) larger than 1! (This
does not violate energy conservation, since this is the
magnification relative to an "empty" universe and
not relative to a "smoothed out" universe. This issue
is treated in detail in Schneider et al. (1992, chapter
4.5).) The signed "sum" of the two image magnifications is
unity:

\[ \mu_1 + \mu_2 = 1 \qquad [23] \]
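
The point-lens relations above are easy to verify numerically. The following sketch works in units of the Einstein radius by default; the function names are hypothetical, and the signed magnifications are computed from $\mu = [1 - (\theta_E/\theta)^4]^{-1}$ evaluated at the two image positions.

```python
import math

def image_positions(beta, theta_e=1.0):
    """The two solutions of beta = theta - theta_E^2 / theta."""
    root = math.sqrt(beta ** 2 + 4.0 * theta_e ** 2)
    return 0.5 * (beta + root), 0.5 * (beta - root)

def image_magnifications(beta, theta_e=1.0):
    """Signed magnifications [1 - (theta_E / theta)^4]^-1 of the two images."""
    return tuple(1.0 / (1.0 - (theta_e / theta) ** 4)
                 for theta in image_positions(beta, theta_e))

def total_magnification(u):
    """|mu_1| + |mu_2| as a function of u = beta / theta_E."""
    return (u ** 2 + 2.0) / (u * math.sqrt(u ** 2 + 4.0))

mu_outer, mu_inner = image_magnifications(0.5)
print(mu_outer, mu_inner, mu_outer + mu_inner)
```

For a source exactly one Einstein radius from the lens ($u = 1$), `total_magnification(1.0)` equals $3/\sqrt{5}$, a brightening of roughly a third.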


(Non)Singular Isothermal Sphere 


A popular model for galaxy lenses is the singular
isothermal sphere with a three-dimensional density
distribution of

\[ \rho(r) = \frac{\sigma_v^2}{2\pi G}\,\frac{1}{r^2} \qquad [24] \]

where $\sigma_v$ is the one-dimensional velocity dispersion.
Projecting the matter on a plane, one obtains the
circularly symmetric surface mass distribution

\[ \Sigma(\xi) = \frac{\sigma_v^2}{2G}\,\frac{1}{\xi} \qquad [25] \]

With $M(\xi) = \int_0^\xi \Sigma(\xi')\,2\pi\xi'\,d\xi'$ plugged into eqn [4],
one obtains the deflection angle for an isothermal
sphere, which is a constant (i.e., independent of the
impact parameter $\xi$):

\[ \tilde\alpha(\xi) = 4\pi\,\frac{\sigma_v^2}{c^2} \qquad [26] \]

In practical units for the velocity dispersion of a
galaxy, this can be expressed as

\[ \tilde\alpha(\xi) = 1.15\left(\frac{\sigma_v}{200\ \mathrm{km\,s^{-1}}}\right)^2\ \mathrm{arcsec} \qquad [27] \]

Two generalizations of this isothermal model are
commonly used: models with finite cores are more
realistic for (spiral) galaxies. In this case, the
deflection angle is modified to (core radius $\xi_c$):

\[ \tilde\alpha(\xi) = 4\pi\,\frac{\sigma_v^2}{c^2}\,\frac{\xi}{(\xi_c^2 + \xi^2)^{1/2}} \qquad [28] \]

Furthermore, a realistic galaxy lens usually is not
perfectly symmetric but is slightly elliptical. Depending
on whether one wants an elliptical mass
distribution or an elliptical potential, various
formalisms are in use.
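
The conversion of the isothermal-sphere deflection into practical units is a one-line computation. The sketch below evaluates $4\pi\sigma_v^2/c^2$ in arcseconds and, as an assumption of this example, models the finite core by the softening factor $\xi/\sqrt{\xi_c^2+\xi^2}$; the function names are hypothetical.

```python
import math

C_KM_S = 299792.458                       # speed of light, km/s
RAD_TO_ARCSEC = 180.0 / math.pi * 3600.0

def sis_deflection_arcsec(sigma_v_km_s):
    """Constant deflection 4 pi sigma_v^2 / c^2 of a singular isothermal sphere."""
    return 4.0 * math.pi * (sigma_v_km_s / C_KM_S) ** 2 * RAD_TO_ARCSEC

def cored_deflection_arcsec(xi, xi_core, sigma_v_km_s):
    """Assumed softened form: SIS value times xi / sqrt(xi_core^2 + xi^2)."""
    return sis_deflection_arcsec(sigma_v_km_s) * xi / math.sqrt(xi_core ** 2 + xi ** 2)

print(f"{sis_deflection_arcsec(200.0):.2f} arcsec for sigma_v = 200 km/s")
```

The core only matters near the centre: the cored deflection vanishes at $\xi = 0$ and approaches the singular value for $\xi \gg \xi_c$. Both radii must be given in the same (arbitrary) length unit.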


Lens Mapping 


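In the vicinity of an arbitrary point, the lens
mapping as shown in eqn [7] can be described by
its Jacobian matrix $A$:


$$A = \frac{\partial \beta}{\partial \theta} = \left(\delta_{ij} - \frac{\partial \alpha_i(\theta)}{\partial \theta_j}\right) = \left(\delta_{ij} - \frac{\partial^2 \psi(\theta)}{\partial \theta_i\,\partial \theta_j}\right) \qquad [29]$$


Here we made use of the fact that the deflection
angle can be expressed as the gradient of an effective
two-dimensional scalar potential $\psi$: $\nabla_\theta \psi = \alpha$, where


$$\psi(\theta) = \frac{D_{LS}}{D_L D_S}\,\frac{2}{c^2} \int \Phi(r)\,dz \qquad [30]$$


and $\Phi(r)$ is the Newtonian potential of the lens. The
determinant of the Jacobian $A$ is the inverse of the
magnification:


$$\mu = \frac{1}{\det A} \qquad [31]$$


Defining


$$\psi_{ij} = \frac{\partial^2 \psi}{\partial \theta_i\,\partial \theta_j} \qquad [32]$$


the Laplacian of the effective potential $\psi$ is twice the
convergence:


$$\psi_{11} + \psi_{22} = 2\kappa = \mathrm{tr}\,\psi_{ij} \qquad [33]$$


With the definitions of the components of the
external shear $\gamma$,


$$\gamma_1(\theta) = \tfrac{1}{2}(\psi_{11} - \psi_{22}) = \gamma(\theta)\cos[2\varphi(\theta)] \qquad [34]$$


and


$$\gamma_2(\theta) = \psi_{12} = \psi_{21} = \gamma(\theta)\sin[2\varphi(\theta)] \qquad [35]$$


(where the angle $\varphi$ reflects the direction of the shear-inducing
tidal force relative to the coordinate
system), the Jacobian matrix can be written as


$$A = \begin{pmatrix} 1 - \kappa - \gamma_1 & -\gamma_2 \\ -\gamma_2 & 1 - \kappa + \gamma_1 \end{pmatrix} = (1 - \kappa)\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} - \gamma \begin{pmatrix} \cos 2\varphi & \sin 2\varphi \\ \sin 2\varphi & -\cos 2\varphi \end{pmatrix} \qquad [36]$$


The magnification can now be expressed as a
function of the local convergence $\kappa$ and the local
shear $\gamma$:


$$\mu = (\det A)^{-1} = \frac{1}{(1 - \kappa)^2 - \gamma^2} \qquad [37]$$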

Locations at which $\det A = 0$ have formally infinite
magnification. They are called “critical curves” in
the lens plane. The corresponding locations in the
source plane are the “caustics.” For spherically
symmetric mass distributions, the critical curves are
circles. For a point lens, the caustic degenerates into
a point. For elliptical lenses or spherically symmetric
lenses plus external shear, the caustics consist of
cusps and folds.
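The relation between the Jacobian matrix and the magnification can be checked numerically. The sketch below builds the 2x2 matrix from a convergence, a shear, and a shear angle, and compares its determinant with the closed form (1 - kappa)^2 - gamma^2. The function names are illustrative, and the sample values kappa = 0.36, gamma = 0.44 are the ones quoted later in this article for image A of Q2237+0305.

```python
import math


def jacobian(kappa, gamma, phi):
    """Lens-mapping Jacobian for convergence kappa, shear gamma, shear angle phi."""
    g1 = gamma * math.cos(2.0 * phi)
    g2 = gamma * math.sin(2.0 * phi)
    return ((1.0 - kappa - g1, -g2),
            (-g2, 1.0 - kappa + g1))


def determinant(matrix):
    (a, b), (c, d) = matrix
    return a * d - b * c


def magnification(kappa, gamma):
    """Magnification from the closed form; undefined on a critical curve."""
    det = (1.0 - kappa) ** 2 - gamma ** 2
    if det == 0.0:
        raise ZeroDivisionError("det A = 0: point lies on a critical curve")
    return 1.0 / det


# The determinant does not depend on the shear angle phi.
print(determinant(jacobian(0.36, 0.44, 0.3)))
print(magnification(0.36, 0.44))
```

A negative result signals an image of reversed parity; the observable brightening is the absolute value.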


Time Delay and Fermat’s Theorem 


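The deflection angle is the gradient of an effective
lensing potential $\psi$. Hence, the lens equation can be
rewritten as


$$(\theta - \beta) - \nabla_\theta \psi = 0 \qquad [38]$$


or


$$\nabla_\theta \left( \tfrac{1}{2}(\theta - \beta)^2 - \psi \right) = 0 \qquad [39]$$


The term in brackets appears as well in the physical
time delay function for gravitationally lensed
images:


$$\tau(\theta, \beta) = \tau_{\rm geom} + \tau_{\rm grav} = \frac{1 + z_L}{c}\,\frac{D_L D_S}{D_{LS}} \left( \tfrac{1}{2}(\theta - \beta)^2 - \psi(\theta) \right) \qquad [40]$$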


This time delay surface is a function of the image
geometry $(\theta, \beta)$, the gravitational potential $\psi$, and
the distances $D_L$, $D_S$, and $D_{LS}$. The first part, the
geometrical time delay $\tau_{\rm geom}$, reflects the extra path
length compared to the direct line between observer
and source. The second part, the gravitational time
delay $\tau_{\rm grav}$, is the retardation due to the gravitational
potential of the lensing mass (known and confirmed
as Shapiro delay in the solar system). From eqns [39]
and [40], it follows that the gravitationally lensed
images appear at locations that correspond to
extrema in the light travel time, which reflects
Fermat’s principle in gravitational-lensing optics.
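Fermat's principle can be illustrated with a small numerical sketch. It assumes a point-mass lens with angles measured in units of the Einstein radius, for which the effective potential is psi = ln|theta| and the two images lie at theta = (beta ± sqrt(beta^2 + 4))/2; these standard point-lens expressions and all function names are assumptions of the example, and the constant prefactor of the time delay is dropped. The check is that the derivative of the one-dimensional time delay function vanishes at both image positions.

```python
import math


def time_delay(theta, beta):
    """Dimensionless time delay of a point lens (prefactor omitted)."""
    return 0.5 * (theta - beta) ** 2 - math.log(abs(theta))


def image_positions(beta):
    """The two image positions of a point lens, in Einstein radii."""
    root = math.sqrt(beta ** 2 + 4.0)
    return (beta + root) / 2.0, (beta - root) / 2.0


def slope(theta, beta, step=1e-6):
    """Central finite-difference derivative of the time delay along theta."""
    upper = time_delay(theta + step, beta)
    lower = time_delay(theta - step, beta)
    return (upper - lower) / (2.0 * step)


beta = 0.5
for theta in image_positions(beta):
    # Both slopes are zero up to finite-difference error.
    print(theta, slope(theta, beta))
```

The difference of the time delay function between the two image positions, multiplied by the prefactor of eqn [40], is the measurable time delay between the images.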

The (angular-diameter) distances that appear in 
eqn [40] depend on the value of the Hubble 
constant. Therefore, it is possible to determine the 
latter by measuring the time delay between different 
images and using a good model for the effective 
gravitational potential $\psi$ of the lens.


Lensing Phenomena 


Strong lensing phenomena involve multiple images, 
caustics, critical lines, usually a significant magnifi- 
cation, and large distortions if extended sources are 
involved. Below we discuss the most frequent strong 
lensing phenomena. 


Galactic Microlensing 


The conceptually simplest strong lensing scenario is 
a foreground star acting as a lens on a background 
star. Since stars in the Milky Way move relative to 
each other, this can be observed as a time-variable 
situation: due to the relative motion between 
observer, lensing star, and source star, the projected 
impact parameter between lens and source changes 
with time and produces a time-dependent magnification.
If the impact parameter is smaller than an
Einstein radius ($u < 1$), then the magnification is
$\mu_{\min} > 1.34$ (cf. eqn [22]).

For an extended source, a sequence of image
configurations with decreasing impact parameter
is illustrated in Figure 2 for five instants of time.
The separation of the two images is of order 2
Einstein radii when they are of comparable
magnification, which corresponds to only about
1 milliarcsec in a realistic situation in the Milky
Way. Hence, the two images cannot be resolved
individually; we can only observe the combined
brightness of the image pair. This is illustrated in
Figures 3 and 4, which show the relative tracks
and the respective light curves for five values of
the minimum impact parameter $u_{\min}$.


Figure 2 Five snapshots of a gravitational lens situation with a
point lens and an extended source: from left to right the
alignment between lens and source gets better and better, until
it is perfect in the rightmost panel. This results in the image of an
“Einstein ring.”


Figure 3 Five relative tracks between background star and
foreground lens (indicated as the central star) parametrized by
the impact parameter $u_{\min}$. The dashed line indicates the
Einstein ring for the lens.


[Figure 4 axes: $\Delta m$ (in magnitudes) versus time (in units of $t_E = R_E/v_\perp$)]


Figure 4 Five microlensing light curves for the tracks indicated
in Figure 3, parametrized by the impact parameter $u_{\min}$. The
vertical axis is the magnification in astronomical magnitudes
relative to the unlensed case, the horizontal axis displays the
time in “normalized” units.


Quantitatively, the total magnification $\mu = |\mu_1| + |\mu_2|$
of the two images (cf. eqn [22]) entirely depends
on the impact parameter $u(t) = r(t)/R_E$ between the
lensed star and the lensing object, measured in the
lens plane (here $R_E$ is the Einstein radius of the lens,
i.e., the radius at which a circular image appears for
perfect alignment between source, lens, and observer,
cf. Figure 2, rightmost panel):


$$\mu(u) = \frac{u^2 + 2}{u\sqrt{u^2 + 4}} \qquad [41]$$
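A microlensing light curve of the kind shown in Figure 4 follows directly from eqn [41]. The sketch below assumes a straight relative track, so that u(t) = sqrt(u_min^2 + (t/t_E)^2) with time measured in units of the Einstein-radius crossing time; this parametrization and the function names are assumptions of the example rather than formulas quoted from the text.

```python
import math


def total_magnification(u):
    """Combined magnification of both images of a point lens, eqn [41]."""
    return (u * u + 2.0) / (u * math.sqrt(u * u + 4.0))


def light_curve(u_min, times):
    """Magnitude change (negative = brighter) along a straight track.

    `times` are measured from closest approach in units of t_E.
    """
    curve = []
    for t in times:
        u = math.sqrt(u_min ** 2 + t ** 2)
        curve.append(-2.5 * math.log10(total_magnification(u)))
    return curve


times = [0.1 * k for k in range(-20, 21)]
for u_min in (0.1, 0.5, 1.0):
    peak = min(light_curve(u_min, times))
    print(u_min, round(peak, 2))
```

The curve is symmetric about the time of closest approach, and its peak depends only on u_min, which is why the curves in Figure 4 are labeled by that single parameter.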


The timescale of such a “microlensing event” is
defined as the time it takes for the source to cross
the Einstein radius. With realistic values for distances
and relative velocity, this can be expressed as


$$t_E = \frac{R_E}{v_\perp} = 0.214\ \mathrm{yr}\ \sqrt{\frac{M}{M_\odot}}\,\sqrt{\frac{D_L}{10\ \mathrm{kpc}}}\,\sqrt{1 - \frac{D_L}{D_S}}\,\left(\frac{v_\perp}{200\ \mathrm{km\,s^{-1}}}\right)^{-1} \qquad [42]$$


(here $v_\perp$ is the (relative) transverse velocity of the
lens; we applied the simple relation $D_{LS} = D_S - D_L$,
which is valid here).

Note that from eqn [42] it is obvious that it is not
possible to determine the mass of the lens from one
individual microlensing event. The duration of an
event is determined by three unknown parameters:
the mass of the lens $M$, the transverse velocity $v_\perp$,
and the distance of the lens $D_L$ (assuming we know
the distance to the source). It is impossible to
disentangle these for individual events. Only with a
model for the spatial and velocity distribution of the
lensing stars in the Milky Way can one obtain
approximate information about the masses of the
lensing objects.
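The degeneracy can be made concrete with a short sketch that computes the Einstein-radius crossing time from first principles, using R_E = sqrt(4GM/c^2 · D_L D_LS / D_S) with D_LS = D_S - D_L. The rounded physical constants and the function name are assumptions of the example.

```python
import math

G = 6.674e-11        # m^3 kg^-1 s^-2
C = 2.998e8          # m/s
M_SUN = 1.989e30     # kg
KPC = 3.0857e19      # m
YEAR = 3.156e7       # s


def einstein_time_years(mass_msun, d_lens_kpc, d_source_kpc, v_km_s):
    """Einstein-radius crossing time of a Galactic microlensing event."""
    d_l = d_lens_kpc * KPC
    d_s = d_source_kpc * KPC
    d_ls = d_s - d_l                      # valid for Galactic distances
    r_e = math.sqrt(4.0 * G * mass_msun * M_SUN / C ** 2 * d_l * d_ls / d_s)
    return r_e / (v_km_s * 1.0e3) / YEAR


# Two quite different lenses produce the same event duration:
print(einstein_time_years(1.0, 4.0, 8.0, 200.0))
print(einstein_time_years(4.0, 4.0, 8.0, 400.0))
```

Quadrupling the lens mass while doubling the transverse velocity leaves the duration unchanged, so a single measured timescale cannot separate the two.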

In 1986, Bohdan Paczyński suggested using this
microlensing method as an observational test for
potential dark matter candidates in the halo of the
Milky Way. If the dark matter is in the form of
astrophysical objects (such as brown dwarfs, neutron
stars, or black holes, sometimes called “MACHOs”
for MAssive Compact Halo Objects), then they
should occasionally act as lenses on stars in the
neighboring galaxy, the Large Magellanic Cloud. It
turned out that too few such microlensing events
were observed to explain the dark matter
this way.

However, this method produced more than 2000
microlensing events by ordinary stars in the direction
of the center of the Milky Way. Two of these
events provide convincing evidence for a planet 
accompanying the lensing star. It is likely that 
gravitational microlensing will provide a statistically 
very valuable sample of extrasolar planets, because 
in contrast to most other methods these planets are 
pre-selected by their host stars. Furthermore, micro- 
lensing is sensitive to masses as low as a few Earth 
masses. 


Multiply-Imaged Quasars 


The first gravitationally lensed double quasar was 
discovered in 1979: two images of the same quasar, 
separated by about 6 arcsec. This led to the field of 
gravitational lensing as an observational science. By 
now, more than 120 multiply imaged quasars are 
known, mostly double and quadruple images. They 


span image separations from 0.3 arcsec to almost 30 
arcsec. 

Gravitationally lensed quasar systems are studied 
individually in great detail to get a better under- 
standing of both lens and source. The lens systems 
are analyzed statistically as well, in order to get 
information about the population of lenses (and 
quasars) in the universe, their distribution in 
distance (i.e., cosmic time) and mass, and hence 
about the cosmological model. 


Time delay and Hubble constant As stated above,
the signals from a gravitational lens system reach us
with a certain “time delay” $\Delta t$, so that the measured
fluxes as functions of time, $I_A(t)$ and $I_B(t)$, can be
described as: $I_B(t) = \mathrm{const.} \times I_A(t + \Delta t)$. Any intrinsic
fluctuation of the quasar shows up in both
images, in general with an overall offset in apparent
magnitude and an offset in time.
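The relation between the two fluxes suggests a simple way to estimate the delay from sampled light curves. The following sketch is only a toy illustration on synthetic, evenly sampled, noise-free data: it tries integer shifts and keeps the one for which the magnitude difference between the two curves is most nearly constant. The estimator, the synthetic signal, and all names are assumptions of the example, not the method used for any real lens system.

```python
import math


def estimate_delay(flux_a, flux_b, max_shift):
    """Shift s (in samples) for which flux_b[i] / flux_a[i + s] is most nearly constant."""
    best_shift, best_score = 0, float("inf")
    for shift in range(max_shift + 1):
        n = min(len(flux_b), len(flux_a) - shift)
        diffs = [math.log(flux_b[i]) - math.log(flux_a[i + shift]) for i in range(n)]
        mean = sum(diffs) / n
        score = sum((d - mean) ** 2 for d in diffs) / n   # variance of the offset
        if score < best_score:
            best_shift, best_score = shift, score
    return best_shift


def intrinsic(t):
    """Hypothetical intrinsic quasar variability (always positive)."""
    return 1.0 + 0.3 * math.sin(0.07 * t) + 0.2 * math.sin(0.23 * t + 1.0)


TRUE_DELAY = 17
flux_a = [intrinsic(t) for t in range(300)]
flux_b = [0.6 * intrinsic(t + TRUE_DELAY) for t in range(200)]   # I_B(t) = const * I_A(t + dt)

print(estimate_delay(flux_a, flux_b, max_shift=50))
```

Real monitoring data are irregularly sampled and noisy, and microlensing adds uncorrelated variability to each image, so practical delay estimates need considerably more care.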

Q0957+561 is the first lens system in which the
time delay was firmly established:


$$\Delta t_{\rm Q0957+561} = (417 \pm 3)\ \mathrm{days} \qquad [43]$$


With a model of the lens system, the time delay 
can be used to determine the Hubble constant. (This 
can be seen very simply: imagine a certain lens 
situation like the one displayed in Figure 1. If now 
all length scales are reduced by a factor of 2 and at 
the same time all masses are reduced by a factor of 
2, then for an observer, the angular configuration in 
the sky would appear exactly identical. But the total 
length of the light path is reduced by a factor of 2. 
Now, since the time delay between the two paths is 
the same fraction of the total lengths in either 
scenario, a measurement of this fractional length 
allows us to determine the total length, and hence 
the Hubble constant, the constant of proportionality 
between distance and redshift.) The resulting value
of $H_0$ is


$$H_0 = (67 \pm 13)\ \mathrm{km\,s^{-1}\,Mpc^{-1}} \qquad [44]$$


where the uncertainty corresponds to the 95% confidence
level. To date, about a dozen quasar lens
systems have measured time delays. The derived
values of the Hubble constant are “lowish,” if we
assume the best astrophysically motivated lens
models.


Quasar Microlensing 


Light bundles from “lensed” quasars are split by 
intervening galaxies. Usually the quasar light bundle 
passes through the galaxy and/or the galaxy halo. 
Galaxies consist at least partly of stars, and galaxy 
halos consist possibly of compact objects as well.


Figure 5 “Microimages”: the top left panel shows an assumed
“unlensed” source profile of a quasar. The other three panels
illustrate the microimage configuration as it would be produced 
by stellar objects in the foreground. The surface mass density of 
the lenses is 20% (top right), 50% (bottom left), and 80% 
(bottom right) of the critical density (cf. eqn [16]). 


Each of these stars (or other compact objects like 
black holes, brown dwarfs, or planets) acts as a 
“compact lens” or “microlens” and produces at least 
one additional microimage of the source. This 
means the “macroimage” consists of many “micro- 
images” (Figure 5). But because the image splitting is 
proportional to the square root of the lens mass, 
these microimages are only of order a microarcse- 
cond apart and cannot be resolved. Various aspects 
of microlensing have been addressed after the first 
double quasar had been discovered. 

The microlenses produce a complicated two- 
dimensional magnification distribution in the source 
plane. It consists of many caustics, locations that 
correspond to formally infinitely high magnification. 
An example of such a magnification pattern is shown
in Figure 6. It is determined with the parameters of
image A of the quadruple quasar Q2237+0305
(surface mass density $\kappa = 0.36$; external shear
$\gamma = 0.44$). Gray scale indicates the magnification.


Figure 6 Magnification pattern in the source plane, produced
by a dense field of stars in the lensing galaxy. The gray scale
reflects the magnification as a function of the quasar position.
Light curves taken along the straight tracks are shown in
Figure 7. The microlensing parameters were chosen according
to a model for image A of the quadruple quasar Q2237+0305:
$\kappa = 0.36$, $\gamma = 0.44$.

Due to the relative motion between observer, lens, 
and source, the quasar changes its position relative 
to this arrangement of caustics, that is, the apparent 
brightness of the quasar changes with time. A one- 
dimensional cut through such a magnification 
pattern, convolved with a source profile of the 
quasar, results in a microlensed light curve. Exam- 
ples for microlensed light curves taken along the 
straight lines in Figure 6 can be seen in Figure 7 for 
two different quasar sizes. 
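The effect of the source size on a microlensed light curve can be sketched with a toy one-dimensional cut. The example below does not use a real magnification pattern: it assumes a single fold-caustic-like spike that falls off as one over the square root of the distance on one side of the caustic, and a Gaussian source profile. Both choices, and all names, are assumptions made for illustration.

```python
import math


def gaussian_kernel(sigma, half_width):
    """Normalized Gaussian source profile sampled on integer pixels."""
    weights = [math.exp(-0.5 * (x / sigma) ** 2)
               for x in range(-half_width, half_width + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def convolve_same(signal, kernel):
    """Convolve, clamping indices at the edges so the output keeps its length."""
    half = len(kernel) // 2
    last = len(signal) - 1
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = min(max(i + k - half, 0), last)
            acc += weight * signal[j]
        out.append(acc)
    return out


def toy_caustic_cut(n=400, caustic_at=200, strength=3.0):
    """Magnification along a track: baseline 1 plus a spike inside the caustic."""
    cut = []
    for x in range(n):
        d = x - caustic_at
        spike = strength / math.sqrt(d + 0.5) if d >= 0 else 0.0
        cut.append(1.0 + spike)
    return cut


cut = toy_caustic_cut()
small_source = convolve_same(cut, gaussian_kernel(sigma=2.0, half_width=10))
large_source = convolve_same(cut, gaussian_kernel(sigma=10.0, half_width=50))
print(round(max(small_source), 2), round(max(large_source), 2))
```

The larger source produces a lower and broader peak, which is the qualitative difference between the solid and dashed curves described for Figure 7.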

In particular, when the quasar track crosses a
caustic (the sharp lines in Figure 6 for which the
magnification formally is infinite, because the determinant
of the Jacobian vanishes, cf. eqn [31]), a
pair of highly magnified microimages newly appears
or merges and disappears. Such a microlensing
event can easily be detected as a strong peak in
the light curve of the quasar image.

Microlens-induced fluctuations in the observed 
brightness of quasars contain information both 
about the light-emitting source (size of continuum 
region or broad line region of the quasar, brightness 
profile of quasar) and about the lensing objects 
(masses, density, transverse velocity). Hence, from a 
comparison between observed and simulated quasar 
microlensing, one can draw conclusions about the 
density and mass scale of the microlenses. So far the 
“best” example of a microlensed quasar is the
quadruple quasar Q2237+0305.


Figure 7 Microlensing light curve for the straight lines in
Figure 6. The solid and dashed lines indicate relatively small and 
large quasar sizes. The time axis is in units of Einstein radii 
divided by unit velocity. 


Einstein Rings 


If a point source lies exactly behind a point lens, a ring- 
like image occurs. Theorists had recognized early on 
that such a symmetric lensing arrangement would result 
in a ring image, the so-called “Einstein ring.” There are
two necessary requirements for the observability of 
Einstein rings: the mass distribution of the lens needs to 
be approximately axially symmetric, as seen from the 
observer, and the source must lie exactly on top of the 
resulting degenerate pointlike caustic. Such a geometric 
arrangement is highly unlikely for pointlike sources. But 
astrophysical sources in the real universe have a finite 
extent, and it is enough if a part of the source covers the 
point caustic (or the complete astroid caustic in the case 
of a not quite axially symmetric mass distribution) in 
order to produce such an annular image. 

In 1988, the first example of an “Einstein ring”
was discovered. With high-resolution radio observations,
the extended radio source MG1131+0456
turned out to be a ring with a diameter of about
1.75 arcsec. The source was identified as a radio
lobe at a redshift of $z_S = 1.13$, whereas the lens is a
galaxy at $z_L = 0.85$. By now more than a dozen cases
have been found that qualify as Einstein rings. Their 
diameters vary between 0.33 and about 2 arcsec. 


Giant Luminous Arcs and Arclets 


Fritz Zwicky had pointed out the potential use of 
galaxies and galaxy clusters as gravitational lenses in 
the 1930s. With background galaxies as sources, the 
apparent lensing consequences for them would be 
far more dramatic than for quasars: galaxies should 
be heavily deformed once they are strongly lensed. 
Rich clusters of galaxies at redshifts beyond $z \approx 0.2$
with masses of order $10^{14} M_\odot$ are very effective
lenses if they are centrally concentrated. Their
Einstein radii are of the order of 20 arcsec.

In 1986, the following gravitational lensing phe- 
nomenon was discovered: magnified, distorted, and 
strongly elongated images of background galaxies 
which happen to lie behind foreground clusters of 
galaxies, the so-called giant luminous arcs. The giant 
arcs can be exploited in two ways, as is typical for 
many lens phenomena. Firstly, they provide us with 
strongly magnified galaxies at (very) high redshifts. 
These galaxies would be too faint to be detected or 
analyzed in their unlensed state. Hence, with the 
lensing boost, we can study these galaxies in their early 
evolutionary stages, possibly as infant or protoga- 
laxies, relatively shortly after the big bang. The other 
practical application of the arcs is to take them as tools 
to study the potential and mass distribution of the 
lensing galaxy cluster. In the simplest model of a 
spherically symmetric mass distribution for the cluster, 


giant arcs form very close to the critical curve, which 
marks the Einstein ring. So with the redshifts of the 
cluster and the arc, it is easy to determine a rough 
estimate of the lensing mass by just determining the 
radius of curvature and interpreting it as the Einstein 
radius of the lens system. 
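The rough mass estimate described above can be sketched as follows. The example inverts the standard Einstein-radius relation theta_E^2 = (4GM/c^2) · D_LS/(D_L D_S); the relation is quoted from general lensing theory rather than from this section, and the three angular-diameter distances used below are purely hypothetical round numbers (in practice they follow from the two redshifts and a cosmological model).

```python
import math

G = 6.674e-11            # m^3 kg^-1 s^-2
C = 2.998e8              # m/s
M_SUN = 1.989e30         # kg
MPC = 3.0857e22          # m
ARCSEC = math.pi / (180.0 * 3600.0)   # radians per arcsec


def einstein_mass_msun(theta_e_arcsec, d_l_mpc, d_s_mpc, d_ls_mpc):
    """Mass inside the Einstein radius, in solar masses."""
    theta = theta_e_arcsec * ARCSEC
    d_eff = d_l_mpc * d_s_mpc / d_ls_mpc * MPC
    return theta ** 2 * C ** 2 / (4.0 * G) * d_eff / M_SUN


# Hypothetical cluster: arc radius 20 arcsec, illustrative distances in Mpc.
mass = einstein_mass_msun(20.0, d_l_mpc=1000.0, d_s_mpc=2000.0, d_ls_mpc=1200.0)
print(f"{mass:.2e} solar masses")
```

With these illustrative inputs the result is of order 10^14 solar masses, consistent with the cluster masses and Einstein radii quoted in this section.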


Weak Lensing/Statistical Lensing/Cosmic Shear 


In contrast to the phenomena that were mentioned 
here, “weak lensing” deals with effects of light 
deflection that cannot be measured individually, but 
rather in a statistical way only. No caustics, critical 
lines, or multiple images are involved. As was 
discussed above, “strong lensing” (usually defined as
the regime that involves multiple images, high magnifications,
and caustics in the source plane) is a rare
phenomenon. Weak lensing, on the other hand, is much
more common. In principle, weak lensing acts along
each line of sight in the universe, since each light 
bundle’s path is affected by matter inhomogeneities 
along or near its path. It is just a matter of how 
accurately we can measure. In recent years, many 
teams started impressive and ambitious observational 
programs to determine the slight distortion of tens of 
thousands of background galaxies by foreground 
galaxy clusters and/or by the large-scale structure in 
the universe, the so-called cosmic shear. It is beyond 
the scope of this article to discuss these applications of 
weak gravitational lensing. The interested reader is 
referred to the “Further reading” section, in particular 
to Bartelmann and Schneider (2001).


See also: Cosmology: Mathematical Aspects; General
Relativity: Experimental Tests; General Relativity: 
Overview; Newtonian Limit of General Relativity; 
Singularity and Bifurcation Theory.


Introduction


Let a number, N, of particles interact classically
through Newton's laws of motion and Newton's
inverse-square law of gravitation. Then the equations
of motion are


$$\ddot{\mathbf r}_i = -G \sum_{j=1,\ j \neq i}^{N} m_j\,\frac{\mathbf r_i - \mathbf r_j}{|\mathbf r_i - \mathbf r_j|^3} \qquad [1]$$


where $\mathbf r_j$ is the position vector of the jth particle
relative to some inertial frame, G is the universal
constant of gravitation, and $m_j$ is the mass of the jth
particle. These equations provide an approximate
mathematical model with numerous applications in
astrophysics, including the motion of the Moon and 
other bodies in the solar system (planets, asteroids, 
comets, and meteor particles); stars in stellar systems 
ranging from binary and other multiple stars to star 
clusters and galaxies; and the motion of dark-matter 
particles in cosmology. For N = 1 and N = 2, the
equations can be solved analytically. The case N = 3
provides one of the richest of all unsolved dynamical
problems: the general three-body problem. For
problems dominated by one massive body, as in 
many planetary problems, approximate methods 
based on perturbation expansions have been devel- 
oped. In stellar dynamics, astrophysicists have 
developed numerous numerical and theoretical 
approaches to the problem for larger values of N, 
including treatments based on the Boltzmann equa- 
tion and the Fokker-Planck equation; such N-body 
systems can also be modeled as self-gravitating
gases, and thermodynamic insights underpin much
of our qualitative understanding. 
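The equations of motion [1] translate directly into a direct-summation routine. The sketch below is a minimal illustration with G = 1 and plain Python lists; the function name and the sample configuration are assumptions of the example. Because the forces are pairwise and equal and opposite, the mass-weighted sum of the accelerations vanishes, which gives a simple consistency check.

```python
import math


def accelerations(masses, positions, G=1.0):
    """Acceleration of every particle from eqn [1], by direct summation."""
    n = len(masses)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            sep = [positions[i][k] - positions[j][k] for k in range(3)]
            dist_cubed = math.sqrt(sum(s * s for s in sep)) ** 3
            for k in range(3):
                acc[i][k] -= G * masses[j] * sep[k] / dist_cubed
    return acc


masses = [1.0, 2.0, 3.0]
positions = [[0.0, 0.0, 0.0], [1.0, 0.5, 0.0], [-0.3, 2.0, 1.0]]
acc = accelerations(masses, positions)

# Total force on the system: zero up to rounding error.
total_force = [sum(m * a[k] for m, a in zip(masses, acc)) for k in range(3)]
print(total_force)
```

The cost grows as N^2 per evaluation, which is one reason why the approximate treatments mentioned above become attractive for large N.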


Few-Body Problems


The Two-Body Problem


For N = 2, the motion of the two bodies can
be reduced to the force-free motion of the center of
mass and the problem of the relative motion. If
$\mathbf r = \mathbf r_1 - \mathbf r_2$, then


$$\ddot{\mathbf r} = -G(m_1 + m_2)\,\frac{\mathbf r}{|\mathbf r|^3} \qquad [2]$$


often called the Kepler problem. It represents motion
of a particle of unit mass under a central inverse-square
force of attraction. Energy and angular momentum are
constant, and the motion takes place in a plane passing
through the origin. Using plane polar coordinates $(r, \theta)$
in this plane, the equations for the energy and angular
momentum reduce to


$$E = \tfrac{1}{2}\left(\dot r^2 + r^2 \dot\theta^2\right) - \frac{G(m_1 + m_2)}{r} \qquad [3]$$

$$L = r^2 \dot\theta \qquad [4]$$


(Note that these are not the energy and angular
momentum of the two-body problem, even in the
barycentric frame of the center of mass; E and L must
be multiplied by the reduced mass $m_1 m_2/(m_1 + m_2)$.)
Using eqns [3] and [4], the problem is reduced to
quadratures. The solution shows that the motion is on
a conic section (ellipse, circle, straight line, parabola,
or hyperbola), with the origin at one focus.
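The type of conic follows from the two constants E and L. The sketch below evaluates eqns [3] and [4] for a given state and then uses the standard Kepler-problem results for the eccentricity, e = sqrt(1 + 2EL^2/mu^2), and the semi-latus rectum, p = L^2/mu, where mu = G(m1 + m2). These two formulas are textbook results not written out in the text above, and the function names are illustrative.

```python
import math


def energy_and_angular_momentum(r, r_dot, theta_dot, mu=1.0):
    """E and L per unit mass, eqns [3] and [4], with mu = G (m1 + m2)."""
    energy = 0.5 * (r_dot ** 2 + (r * theta_dot) ** 2) - mu / r
    ang_mom = r ** 2 * theta_dot
    return energy, ang_mom


def conic(energy, ang_mom, mu=1.0, tol=1e-9):
    """Return (name, eccentricity, semi-latus rectum) of the relative orbit."""
    if ang_mom == 0.0:
        return "straight line", 1.0, 0.0
    e = math.sqrt(max(0.0, 1.0 + 2.0 * energy * ang_mom ** 2 / mu ** 2))
    p = ang_mom ** 2 / mu
    if e < tol:
        name = "circle"
    elif e < 1.0 - tol:
        name = "ellipse"
    elif e <= 1.0 + tol:
        name = "parabola"
    else:
        name = "hyperbola"
    return name, e, p


for theta_dot in (1.0, 1.2, math.sqrt(2.0), 2.0):
    E, L = energy_and_angular_momentum(r=1.0, r_dot=0.0, theta_dot=theta_dot)
    print(theta_dot, conic(E, L))
```

Starting at unit distance with purely tangential velocity, the orbit changes from circle to ellipse to parabola to hyperbola as the speed increases through the circular and escape values.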


This reduction depends on the existence of integrals
of the equations of motion, and these in turn depend
on symmetries of the underlying Lagrangian or
Hamiltonian. Indeed, eqns [1] yield ten first integrals:
six yield the rectilinear motion of the center of mass,
three the total angular momentum, and one the energy.
Furthermore, eqn [2] may be transformed, via the
Kustaanheimo-Stiefel (KS) transformation, to a four-
dimensional simple harmonic oscillator. This reveals
further symmetries, corresponding to further invariants:
the three components of the Lenz vector.
Another manifestation of the abundance of symmetries
of the Kepler problem is the fact that there exist
action-angle variables in which the Hamiltonian
depends only on one action, that is, $H = H(L)$.
Another application of the KS transformation is one
that has practical importance: it removes the singularity
of (i.e., regularizes) the Kepler problem at $r = 0$,
which is troublesome numerically.

To illustrate the character of the KS transformation,
we consider briefly the planar case, which can
be handled with a complex variable obeying the
equation of motion $\ddot z = -z/|z|^3$ (after scaling
eqn [2]). By introducing the Levi-Civita transformation
$z = Z^2$ and Sundman's transformation of the
time, that is, $dt/d\tau = |z|$, the equation of motion
transforms to $Z'' = hZ/2$, where $h = |\dot z|^2/2 - 1/|z|$ is
the constant of energy. The KS transformation is a
very similar exercise using quaternions.
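The planar regularization can be checked numerically. For h < 0 the regularized equation is a harmonic oscillator, so the sketch below takes one of its solutions, Z = a cos(ωτ) + i b sin(ωτ) with ω = sqrt(-h/2), maps it back with z = Z^2, and verifies that the physical energy equals h at every sample. The particular form of the solution and the normalization a^2 + b^2 = -1/h (which follows from the energy relation) are assumptions worked out for this example; all names are illustrative.

```python
import math


def levi_civita_orbit(h, a, n_steps=400):
    """Sample a bound Kepler orbit through the regularized oscillator solution.

    Returns a list of (z, energy) pairs, where z = Z**2 is the physical
    position and the energy is |dz/dt|^2 / 2 - 1 / |z|.
    """
    if h >= 0.0 or a <= 0.0 or a * a >= -1.0 / h:
        raise ValueError("need h < 0 and 0 < a^2 < -1/h")
    b = math.sqrt(-1.0 / h - a * a)
    omega = math.sqrt(-h / 2.0)
    samples = []
    for k in range(n_steps):
        phase = 2.0 * math.pi * k / n_steps                  # omega * tau
        Z = complex(a * math.cos(phase), b * math.sin(phase))
        dZ = omega * complex(-a * math.sin(phase), b * math.cos(phase))
        z = Z * Z
        z_dot = 2.0 * Z * dZ / abs(Z) ** 2                   # uses dt/dtau = |z|
        energy = 0.5 * abs(z_dot) ** 2 - 1.0 / abs(z)
        samples.append((z, energy))
    return samples


orbit = levi_civita_orbit(h=-0.5, a=1.2)
radii = [abs(z) for z, _ in orbit]
print(max(abs(e + 0.5) for _, e in orbit))   # deviation of the energy from h
print(min(radii) + max(radii))               # major axis, equal to -1/h
```

The physical orbit is traversed twice while Z completes one oscillation, a characteristic feature of the squaring map z = Z^2.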


The Restricted Three-Body Problem 


The simplest three-body problem is given by the
motion of a test particle in the gravitational field of
two particles, of positive mass $m_1$, $m_2$, in circular
Keplerian motion. This is called the circular
restricted three-body problem, and the two massive
particles are referred to as primaries. In a rotating
frame of reference, with origin at the center of mass
of these two particles, which are at rest at positions
$\mathbf r_1$, $\mathbf r_2$, the equation of motion is


$$\ddot{\mathbf r} + 2\boldsymbol\Omega \times \dot{\mathbf r} + \boldsymbol\Omega \times (\boldsymbol\Omega \times \mathbf r) = -G m_1\,\frac{\mathbf r - \mathbf r_1}{|\mathbf r - \mathbf r_1|^3} - G m_2\,\frac{\mathbf r - \mathbf r_2}{|\mathbf r - \mathbf r_2|^3} \qquad [5]$$


where $\mathbf r$ is the position of the massless particle and $\boldsymbol\Omega$
is the angular velocity of the frame.

This problem has three degrees of freedom but only 
one known integral: it is the Hamiltonian in the 
rotating frame, and equivalent to the Jacobi integral, J. 
One consequence is that Liouville's theorem is not 
applicable, and more elaborate arguments are required 
to decide its integrability. Certainly, no general 
analytical solution is known. 

There are five equilibrium solutions, discovered
by Euler and Lagrange (see Figure 1). They lie at
critical points of the effective potential in the
rotating frame, and demarcate possible regions of
motion.


Figure 1 The equilibrium solutions of the circular restricted
three-body problem. A rotating frame of reference is chosen in
which two particles are at rest on the x-axis. The massless
particle is at equilibrium at each of the five points shown. Five
similar configurations exist for the general three-body problem;
these are the “central” configurations.
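One of the collinear equilibria can be located numerically. The sketch below assumes the usual normalized units (total mass, separation of the primaries, G, and the angular velocity all equal to 1), with mass parameter mu = m2/(m1 + m2) and the primaries at x = -mu and x = 1 - mu. In these units the equilibrium condition on the x-axis balances the centrifugal term against the two attractions. The normalization, the bisection method, and the names are choices made for this example.

```python
def collinear_residual(x, mu):
    """Net x-acceleration of a particle at rest on the x-axis in the rotating frame."""
    d1 = x + mu            # offset from the primary of mass 1 - mu
    d2 = x - 1.0 + mu      # offset from the primary of mass mu
    return x - (1.0 - mu) * d1 / abs(d1) ** 3 - mu * d2 / abs(d2) ** 3


def find_inner_point(mu, tol=1e-13):
    """Equilibrium between the two primaries, found by bisection.

    The residual increases monotonically from -infinity to +infinity
    between the primaries, so there is exactly one root.
    """
    lo, hi = -mu + 1e-9, 1.0 - mu - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if collinear_residual(mid, mu) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# Mass parameter roughly appropriate for the Earth-Moon system.
x_inner = find_inner_point(0.01215)
print(x_inner, collinear_residual(x_inner, 0.01215))
```

For equal primaries the point lies at the center of mass by symmetry; for a small secondary it moves close to the lighter body, which is the regime treated by Hill's problem.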

Throughout the twentieth century, much numerical 
effort was used in finding and classifying periodic 
orbits, and in determining their stability and bifurca- 
tions. For example, there are families of periodic orbits 
close to each primary; these are perturbed Kepler 
orbits, and are referred to as satellite motions. Other 
important families are the series of Liapounov orbits 
starting at the equilibrium points. 

Some variants of the restricted three-body pro- 
blem include the following: 


1. The elliptic restricted three-body problem, in 
which the primaries move on an elliptic Kepler- 
ian orbit; in suitable coordinates the equation of 
motion closely resembles eqn [5], except for a 
factor on the right side which depends explicitly 
on the independent variable (transformed time); 
this system has no first integral. 

2. Sitnikov’s problem, which is a special case of the 
elliptic problem, in which $m_1 = m_2$ and the
motion of the massless particle is confined to 
the axis of symmetry of the Keplerian motion; 
this is still nonintegrable, but simple enough to 
allow extensive analysis of such fundamental 
issues as integrability and stochasticity. 

3. Hill’s problem, which is a scaled version suitable 
for examining motions close to one primary; its 
importance in applications began with studies of 
the motion of the moon, and it remains vital for 
understanding the motion of asteroids. 


The General Three-Body Problem 


Exact solutions When all three particles have
nonzero masses, the equations of motion become


$$m_i \ddot{\mathbf r}_i = -\nabla_i W$$


where the potential energy is


$$W = -G \sum_{1 \le i < j \le 3} \frac{m_i m_j}{|\mathbf r_i - \mathbf r_j|}$$


Then the exact solutions of Euler and Lagrange
survive in the form of homographic solutions. In
these solutions, the configuration remains geometrically
similar, but may rotate and/or pulsate in the
same way as in the two-body problem.

Let us represent the position vector $\mathbf r_i$ in the
planar three-body problem by the complex number
$z_i$. Then, it is easy to see that we have a solution of
the form $z_i(t) = z(t)\,z_{0i}$, provided that


$$\ddot z = -C\,\frac{z}{|z|^3}$$


and


$$m_i C z_{0i} = \nabla_i W(z_{01}, z_{02}, z_{03})$$


for some constant C. Thus, $z(t)$ may take the form of
any solution of the Kepler problem, while the
complex numbers $z_{0i}$ must correspond to what is
called a central configuration. These are in fact
critical points of the scale-free function $W^2 I$, where
I (the “moment of inertia of the system”) is given by
$I = \sum_{i=1}^{3} m_i r_i^2$; and $C = -W/I$.
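The central-configuration condition can be verified numerically for Lagrange's equilateral triangle. The sketch below places three unequal masses at the vertices of an equilateral triangle, shifts the origin to the center of mass, and checks that the gradient of W at each body equals C m_i z_0i with C = -W/I. Representing the planar gradient as a complex number, setting G = 1, and the function names are assumptions of the example.

```python
import math


def potential_and_gradient(masses, z, G=1.0):
    """Potential energy W and its gradient with respect to each planar position.

    Positions are complex numbers; the gradient with respect to z_i is
    returned as the complex number dW/dx_i + i dW/dy_i.
    """
    n = len(masses)
    W = 0.0
    grad = [0j] * n
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            sep = z[i] - z[j]
            if i < j:
                W -= G * masses[i] * masses[j] / abs(sep)
            grad[i] += G * masses[i] * masses[j] * sep / abs(sep) ** 3
    return W, grad


def equilateral_configuration(masses, side=1.0):
    """Equilateral triangle with the center of mass at the origin."""
    raw = [0j, complex(side, 0.0), complex(side / 2.0, side * math.sqrt(3.0) / 2.0)]
    com = sum(m * p for m, p in zip(masses, raw)) / sum(masses)
    return [p - com for p in raw]


masses = [1.0, 2.0, 3.0]
z0 = equilateral_configuration(masses)
W, grad = potential_and_gradient(masses, z0)
inertia = sum(m * abs(p) ** 2 for m, p in zip(masses, z0))
C = -W / inertia
residual = max(abs(g - C * m * p) for g, m, p in zip(grad, masses, z0))
print(C, residual)
```

For a triangle of unit side the constant comes out as the total mass (with G = 1), independently of how the mass is distributed among the three bodies.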

The existence of other important classes of
periodic solutions can be proved analytically, even
though it is not possible to express the solution in
closed form. Examples include hierarchical three-
body systems, in which two masses $m_1$, $m_2$ exhibit
nearly elliptic relative motion, while a third mass
orbits the barycenter of $m_1$ and $m_2$ in another nearly
elliptic orbit. In the mathematical literature, this is
referred to as motion of elliptic-elliptic type. More
surprisingly, the existence of a periodic solution in 
which the three bodies travel in succession along the 
same path, shaped like a figure 8 (cf. Figure 2), was 
established by Chenciner and Montgomery (2000), 
following its independent discovery by Moore using 
numerical methods. Another interesting periodic 
motion that was discovered numerically, by 
Schubart, is a solution of the collinear three-body 
problem, and so collisions are inevitable. In this 
motion, the body in the middle alternately encoun- 
ters the other two bodies. 


Figure 2 A rare example of a scattering encounter between 
two binaries (which approach from upper right and lower left) 
which leads to a permanently bound triple system describing the 
“figure-8” periodic orbit. A fourth body escapes at the bottom. 
Note the differing scales on the two axes. (Reproduced with 
permission from Heggie DC (2000) A new outcome of binary- 
binary scattering. Monthly Notices of the Royal Astronomical 
Society 318(4): L61-L63; © Blackwell Publishing Ltd.)


Singularities As Schubart’s solution illustrates,
two-body encounters can occur in the three-body
problem. Such singularities can be regularized just as
in the pure two-body problem. Triple collisions
cannot be regularized in general, and this singularity
has been studied by the technique of “blowup.” This
has been worked out most thoroughly in the collinear 
three-body problem, which has only two degrees of 
freedom. The general idea is to transform to two 
variables, of which one (denoted by r, say) deter- 
mines the scale of the system, while the other (s) 
determines the configuration (e.g., the ratio of 
separations of the three masses). By scaling the 
corresponding velocities and the time, one obtains a 
system of three equations of motion for s and the 
two velocities which are perfectly regular in the limit 
$r \to 0$. In this limit, the energy integral restricts the
solutions of the system to a manifold (called the 
collision manifold). Exactly the same manifold 
results for zero-energy solutions, which permits a 
simple visualization. Equilibria on the collision 
manifold correspond to the Lagrangian collinear 
solutions in which the system either expands to 
infinity or contracts to a three-body collision. 


Qualitative ideas Reference has already been made 
to motion of elliptic-elliptic type. In a motion of 
elliptic-hyperbolic type, there is again an “inner” 
pair of bodies describing nearly Keplerian motion, 
while the relative motion of the third body is nearly 
hyperbolic. In applications, this is referred to as a 
kind of scattering encounter between a binary and a 
third body. When the encounter is sufficiently close, 
it is possible for one member of the binary to be 
exchanged with the third body. One of the major 
historical themes of the general three-body problem 
is the classification of connections between these 
different types of asymptotic motion. It is possible to 
show, for instance, that the measure of initial 
conditions of hyperbolic-elliptic type leading asymp- 
totically to elliptic-elliptic motion (or any other type 
of permanently bound motion) is zero. Much of the 
study of such problems has been carried out
numerically.

There are many ways in which the stability of 
three-body motions may be approached. One exam- 
ple is furnished by the central configurations already 
referred to. They can be used to establish sufficient 
conditions for ensuring that exchange is impossible, 
and similar conclusions. 

A powerful tool for qualitative study of three-
body motions is Lagrange’s identity, which is now
thought of as the reduction to three bodies of the
virial theorem. Let the size of the system be
characterized by the “moment of inertia” I. Then it
is easy to show that


$$\frac{d^2 I}{dt^2} = 4T + 2W$$


where T, W are, respectively, the kinetic and
potential energies of the system. Usually, the barycentric
frame is adopted. Since $E = T + W$ is constant,
the right-hand side equals $2T + 2E$, and since
$T > 0$ it follows that the system is not
bounded for all $t > 0$ unless $E < 0$.


Perturbation theory The question of the integr- 
ability of the general three-body problem has 
stimulated much research, including the famous 
study by Poincaré which established the nonexis- 
tence of integrals beyond the ten classical ones. 
Poincaré’s work was an important landmark in the 
application to the three-body problem of perturbation
methods. If one mass dominates, that is, $m_1 \gg m_2$
and $m_1 \gg m_3$, then the motion of $m_2$ and $m_3$
relative to $m_1$ is a mildly perturbed two-body
motion, unless $m_2$ and $m_3$ are close together. Then
it is beneficial to describe the motion of $m_2$ relative
to $m_1$ by the parameters of Keplerian motion. These
would be constant in the absence of m3, and vary 
slowly because of the perturbation by m3. This was 
the idea behind Lagrange's very general method of 
variation of parameters for solving systems of 
differential equations. Numerous methods were 
developed for the iterative solution of the resulting 
equations. In this way, the solution of such a three- 
body problem could be represented as a type of 
trigonometric series in which the arguments are the 
angle variables describing the two approximate 
Keplerian motions. These were of immense value in 
solving problems of celestial mechanics, that is, the 
study of the motions of planets, their satellites, 
comets, and asteroids. 
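
As an illustration of "the parameters of Keplerian motion", the sketch below computes two of them, the semi-major axis and the eccentricity, from an instantaneous relative position and velocity. The function name and the choice mu = G(m1 + m2) for the relative two-body problem are assumptions made for this example; they are not taken from the text. Evaluated repeatedly along a perturbed orbit, these quantities would be the slowly varying parameters described above.

```python
import math

def kepler_elements(r, v, mu):
    """Return (semi-major axis, eccentricity) of the instantaneous Kepler orbit.

    r, v : relative position and velocity as 3-tuples
    mu   : gravitational parameter, assumed here to be G * (m1 + m2)
    """
    rn = math.sqrt(sum(x * x for x in r))        # |r|
    v2 = sum(x * x for x in v)                   # |v|^2
    rv = sum(a * b for a, b in zip(r, v))        # r . v
    a = 1.0 / (2.0 / rn - v2 / mu)               # vis-viva equation
    # Eccentricity (Laplace-Runge-Lenz) vector; its length is e.
    e_vec = [((v2 - mu / rn) * ri - rv * vi) / mu for ri, vi in zip(r, v)]
    e = math.sqrt(sum(x * x for x in e_vec))
    return a, e

# Illustrative state: start at apocentre with less than circular speed.
a, e = kepler_elements((1.0, 0.0, 0.0), (0.0, 0.8, 0.0), mu=1.0)
print(f"a = {a:.4f}, e = {e:.4f}")
```

In the absence of a third body these two numbers stay fixed along the orbit; a perturber makes them drift, which is what the variation-of-parameters equations track.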

A major step forward was the introduction of
Hamiltonian methods. A three-body problem of the
type considered here has a Hamiltonian of the form

$$H = H_2(L_2) + H_3(L_3) + R$$

where $H_i$, i = 2, 3, are the Hamiltonians describing
the interaction between $m_i$ and $m_1$, and R is the
“disturbing function.” It depends on all the variables,
but is small compared with the $H_i$. Now
perturbation theory reduces to the task of performing
canonical transformations which simplify R as
much as possible.

Poincaré’s major contribution in this area was to 
show that the series solutions produced by perturbation
methods are not, in general, convergent, but asymptotic.
Thus, they were of practical rather than
theoretical value. For example, nothing could be 
proved about the stability of the solar system using 
perturbation methods. It took the further analytic 
development of KAM theory to rescue this aspect of 
perturbation theory. This theory can be used to 
show that, provided that two of the three masses are 
sufficiently small, then for almost all initial condi- 
tions the motions remain close to Keplerian for all 
time. Unfortunately, now it is the practical aspect of 
the theory which is missing; though we have 
introduced this topic in the context of the three- 
body problem, it is extensible to any N-body system 
with $N - 1$ small masses in nearly Keplerian motion
about $m_1$, but to be applicable to the solar system
the masses of the planets would have to be 
ridiculously small. 


Numerical methods Numerical integrations of the 
three-body problem were first carried out near the 
beginning of the twentieth century, and are now 
commonplace. For typical scattering events, or other 
short-lived solutions, there is usually little need to go 
beyond common Runge-Kutta methods, provided 
that automatic step-size control is adopted. When 
close two-body approaches occur, some regulariza- 
tion based on the KS transformation is often 
exploited. In cases of prolonged elliptic-elliptic 
motion, an analytic approximation based on Kepler- 
ian motion may be adequate. Otherwise (as in 
problems of planetary motion, where the evolution 
takes place on an extremely long timescale), meth- 
ods of very high order are often used. Symplectic 
methods, which have been developed in the context 
of Hamiltonian problems, are increasingly adopted 
for problems of this kind, as their long-term error 
behavior is generally much superior to that of 
methods which ignore the geometrical properties of 
the equations of motion. 
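
The difference in long-term error behavior can be seen in a small experiment. The sketch below is an illustration, not a production integrator: it integrates a circular Kepler orbit (units with GM = 1) with two first-order schemes of identical cost, explicit Euler and symplectic (semi-implicit) Euler, and records the largest relative energy error. The step size, number of steps, and function names are assumptions chosen for the example.

```python
import math

def specific_energy(x, y, vx, vy):
    """Energy per unit mass for the Kepler problem with GM = 1."""
    return 0.5 * (vx * vx + vy * vy) - 1.0 / math.hypot(x, y)

def max_energy_error(symplectic, dt=0.01, steps=10_000):
    """Integrate a circular Kepler orbit; return the largest relative energy error."""
    x, y, vx, vy = 1.0, 0.0, 0.0, 1.0          # circular orbit of radius 1
    e0 = specific_energy(x, y, vx, vy)
    worst = 0.0
    for _ in range(steps):
        r3 = math.hypot(x, y) ** 3
        ax, ay = -x / r3, -y / r3
        if symplectic:
            # Symplectic Euler: kick first, then drift with the updated velocity.
            vx += ax * dt
            vy += ay * dt
            x += vx * dt
            y += vy * dt
        else:
            # Explicit Euler: position and velocity both advance from old values.
            x, y, vx, vy = x + vx * dt, y + vy * dt, vx + ax * dt, vy + ay * dt
        worst = max(worst, abs((specific_energy(x, y, vx, vy) - e0) / e0))
    return worst

err_symplectic = max_energy_error(symplectic=True)
err_explicit = max_energy_error(symplectic=False)
print(f"symplectic Euler: {err_symplectic:.2e}   explicit Euler: {err_explicit:.2e}")
```

With these settings the explicit scheme gains energy steadily and the orbit spirals outward, while the energy error of the symplectic scheme stays small and bounded. Practical codes use higher-order symplectic schemes, but the qualitative contrast is the same.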


Four- and Five-Body Problems 


Many of the foregoing remarks, on central config- 
urations, numerical methods, KAM theory, etc., 
apply equally to few-body problems with N > 3. 
Of special interest from a theoretical point of view is 
the occurrence of a new kind of singularity, in which 
motions become unbounded in finite time. For 
N = 4, the known examples also require two-body
collisions, but noncollision orbits exhibiting finite-time
blowup are known for N = 5.

One of the practical (or, at least, astronomical) 
applications is again to scattering encounters, this 
time involving the approach of two binaries on a 
hyperbolic relative orbit. Numerical results show 
that a wide variety of outcomes is possible, including
even the creation of the figure-8 periodic orbit of
the three-body problem, while a fourth body escapes 
(Figure 2). 


Many-Body Problems 


Many of the concepts already introduced, such as 
the virial theorem, apply equally well to the many- 
body classical gravitational problem. This section 
refers mainly to the new features which arise when 
N is not small. In particular, statistical descriptions 
become central. The applications also have a 
different emphasis, moving from problems of plane- 
tary dynamics (celestial mechanics) to those of 
stellar dynamics. Typically, N lies in the range
$10^{2}$ to $10^{12}$.


Evolution of the Distribution Function 


The most useful statistical description is obtained if
we neglect the correlations and focus on the one-particle
distribution function f(r, v, t), which can be
interpreted as the number density at time t at the
point in phase space corresponding to position r and 
velocity v. Several processes contribute to the 
evolution of f. 


Collective effects When the effects of near neighbors
are neglected, the dynamics is described by the
Vlasov-Poisson system

$$\frac{\partial f}{\partial t} + \mathbf{v}\cdot\frac{\partial f}{\partial \mathbf{r}} - \frac{\partial \phi}{\partial \mathbf{r}}\cdot\frac{\partial f}{\partial \mathbf{v}} = 0 \qquad [6]$$

$$\nabla^2 \phi = 4\pi G m \int f(\mathbf{r}, \mathbf{v}, t)\, d^3 v \qquad [7]$$

where $\phi$ is the gravitational potential and m is the
mass of each body. Obvious extensions are necessary
if not all bodies have the same mass.

Solutions of eqn [6] may be found by the method
of characteristics, which is most useful in cases
where the equation of motion $\ddot{\mathbf{r}} = -\nabla\phi$ is
integrable, for example, in stationary, spherical potentials.
An example is the solution

$$f = |E|^{7/2} \qquad [8]$$

where E is the specific energy of a body, that is,
$E = v^2/2 + \phi$. This satisfies eqn [6] provided that $\phi$
is static. Equation [7] is satisfied provided that $\phi$
satisfies a case of the Lane-Emden equation, which
is easy to solve in this case.
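
A brief check of why a function of E alone solves eqn [6] in a static potential may be useful. The symbol F below is introduced only for this illustration and stands for any differentiable function of the specific energy:

$$
\begin{aligned}
f &= F(E), \qquad E = \tfrac12 v^2 + \phi(\mathbf r),\\
\frac{\partial f}{\partial t} &= F'(E)\,\frac{\partial \phi}{\partial t} = 0, \qquad
\frac{\partial f}{\partial \mathbf r} = F'(E)\,\nabla\phi, \qquad
\frac{\partial f}{\partial \mathbf v} = F'(E)\,\mathbf v,\\
\frac{\partial f}{\partial t} + \mathbf v\cdot\frac{\partial f}{\partial \mathbf r} - \nabla\phi\cdot\frac{\partial f}{\partial \mathbf v}
&= F'(E)\,\bigl(\mathbf v\cdot\nabla\phi - \nabla\phi\cdot\mathbf v\bigr) = 0 .
\end{aligned}
$$

The condition that the potential be static enters only through the first term; the remaining two terms cancel identically.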

The solution just referred to is an example of an 
equilibrium solution. In an equilibrium solution, the 
virial theorem takes the form 4T + 2W = 0, where
T, W are appropriate mean-field approximations for
the kinetic and potential energy, respectively. It
follows that E = −T, where E = T + W is the total
energy. An increase in E causes a decrease in T, 
which implies that a self-gravitating N-body system 
exhibits a negative specific heat. 
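
The sign can be made explicit in one line. As an illustration only (this definition is an assumption, not taken from the text), assign a kinetic temperature $\Theta$ through $T = \tfrac32 N k \Theta$, with k the Boltzmann constant:

$$E = -T = -\tfrac32 N k\,\Theta \quad\Longrightarrow\quad C \equiv \frac{dE}{d\Theta} = -\tfrac32 N k < 0 .$$

Adding energy to the system therefore lowers its kinetic temperature, and removing energy raises it.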

There is little to choose between one equilibrium 
solution and another, except for their stability. In 
such an equilibrium, the bodies orbit within the 
potential on a timescale of the crossing time, which 
is conventionally defined to be

$$t_{cr} = \frac{G M^{5/2}}{(2|E|)^{3/2}}$$

where M is the total mass and E the total energy of the system.

The most important evolutionary phenomenon of
collisionless dynamics is violent relaxation. If f is not
time independent then $\phi$ is time dependent in
general. Also, from the equation of motion of one
body, E varies according to $dE/dt = \partial\phi/\partial t$, and so
energy is exchanged between bodies, which leads to 
an evolution of the distribution of energies. This 
process is known as violent relaxation. 
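
The energy equation used here follows in one step from the chain rule. A short worked version, writing $\phi$ for the potential, E for the specific energy $v^2/2 + \phi$ of one body, and using the equation of motion $\dot{\mathbf v} = -\nabla\phi$:

$$\frac{dE}{dt} = \mathbf v\cdot\dot{\mathbf v} + \frac{\partial\phi}{\partial t} + \mathbf v\cdot\nabla\phi
= -\mathbf v\cdot\nabla\phi + \frac{\partial\phi}{\partial t} + \mathbf v\cdot\nabla\phi
= \frac{\partial\phi}{\partial t} .$$

Only the explicit time dependence of the potential changes the energy of a body; motion through a static potential does not.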

Two other relaxation processes are of importance: 




1. Relaxation is possible on each energy hypersur- 
face, even in a static potential, if the potential is 
nonintegrable. 

2. The range of collective phenomena becomes 
remarkably rich if the system exhibits ordered 
motions, as in rotating systems. Then an impor- 
tant role is played by resonant motions, espe- 
cially resonances of low order. The 
corresponding theory lies at the basis of the 
theory of spiral structure in galaxies, for instance. 


Collisional effects The approximations of colli- 
sionless stellar dynamics suppress two important 
processes: 


1. The exponential divergence of stellar orbits, 
which takes place on a timescale of order $t_{cr}$.
Even in an integrable potential, therefore, f 
evolves on each energy hypersurface. 

2. Two-body relaxation. It operates on a timescale of 
order $(N/\ln N)\,t_{cr}$, where N is the number of
particles. Although this two-body relaxation timescale,
$t_r$, is much longer than any other timescale
we have considered, this process leads to evolution
of f(E), and it dominates the long-term evolution
of large N-body systems. It is usually modeled by 
adding a collision term of Fokker-Planck type on 
the right-hand side of eqn [6]. 


In this case, the only equilibrium solutions 
in a steady potential are those in which
$f(E) \propto \exp(-\beta E)$, where $\beta$ is a constant. Then eqn [7]
becomes Liouville's equation, and for the case of
spherical symmetry the relevant solutions are those 
corresponding to the isothermal sphere. 
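
To give a feel for the separation between the crossing and two-body relaxation timescales, the following sketch evaluates the order-of-magnitude scaling N / ln N. The numerical prefactor is deliberately omitted, and the function name and the sample values of N are assumptions chosen for illustration.

```python
import math

def relaxation_to_crossing_ratio(n):
    """Order-of-magnitude ratio t_r / t_cr ~ N / ln N (prefactor omitted)."""
    if n < 2:
        raise ValueError("need at least two bodies")
    return n / math.log(n)

for n in (100, 65_536, 10**11):
    ratio = relaxation_to_crossing_ratio(n)
    print(f"N = {n:>12d}   t_r / t_cr ~ {ratio:.3g}")
```

For small systems the two timescales differ by little more than an order of magnitude, whereas for galaxy-sized N the ratio is so large that collisionless dynamics is an excellent approximation over many crossing times.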


Collisional Equilibrium 


We consider the collisional evolution of an N-body 
system further in a later subsection and here develop 
fundamental ideas about the isothermal model. The 
isothermal model has infinite mass, and much has 
been learned by considering a model confined within 
an adiabatic boundary or enclosure. There is a series 
of such models, characterized by a single dimension- 
less parameter, which can be taken to be the ratio 
between the central density and the density at the 
boundary, $\rho_0/\rho_e$ (Figure 3).

These models are extrema of the Boltzmann
entropy $S = -k \int f \ln f \, d^6\tau$, where k is the Boltzmann
constant, and the integration is taken over all
available phase space. Their stability may be
determined by evaluating the second variation of S.
It is found that it is negative definite, so that S is a
local maximum and the configuration is stable, only
if $\rho_0/\rho_e < 709$ approximately. A physical explanation
for this is the following. In the limit when
$\rho_0/\rho_e \simeq 1$, the self-gravity (which causes the spatial
inhomogeneity) is weak, and the system behaves like
an ordinary perfect gas. When $\rho_0/\rho_e \gg 1$, however,
the system is highly inhomogeneous, consisting of a 
core of low mass and high density surrounded by an 
extensive halo of high mass and low density. 
Consider a transfer of energy from the deep interior 
to the envelope. In the envelope, which is restrained 
by the enclosure, the additional energy causes a rise 
in temperature, but this is small, because of the very 
large mass of the halo. Extraction of energy from 
around the core, however, causes the bodies there to 
sink and accelerate, and, because of the negative
specific heat of a self-gravitating system, they gain
more kinetic energy than they lost in the original
transfer. Now the system is hotter in the core than in
the halo, and the transfer of energy from the interior
to the exterior is self-sustaining, in a gravothermal
runaway. The isothermal model with large density
contrast is therefore unstable.

Figure 3 The density profile of the nonsingular isothermal
model, with conventional scalings.

The negative specific heat, and the lack of an 
equilibrium which maximizes the entropy, are two 
examples of the anomalous thermodynamic beha- 
vior of the self-gravitating N-body problem. They 
are related to the long-range nature of the gravita- 
tional interaction, the importance of boundary 
terms, and the nonextensivity of the energy. Another 
consequence is the inequivalence of canonical and 
microcanonical ensembles. 


Numerical Methods 


The foregoing considerations are difficult to extend 
to systems without a boundary, although they are a 
vital guide to the behavior even in this case. Our 
knowledge of such systems is due largely to 
numerical experiments, which fall into several 
classes: 


1. Direct N-body calculations. These minimize the 
number of simplifying assumptions, but are 
expensive. Special-purpose hardware is readily 
available, which greatly accelerates the necessary 
calculations. Great care has to be taken in the 
treatment of few-body configurations, which 
otherwise consume almost all resources. 

2. Hierarchical methods, including tree methods, 
which shorten the calculation of forces by 
grouping distant masses. They are mostly used 
for collisionless problems. 

3. Grid-based methods, which are used for colli- 
sionless problems. 

4. Fokker-Planck methods, which usually require a 
theoretical knowledge of the statistical effects of 
two-, three- and four-body interactions. Other- 
wise they can be very flexible, especially in the 
form of Monte Carlo codes. 

5. Gas codes. The behavior of a self-gravitating 
system is simulated surprisingly well by modeling 
it as a self-gravitating perfect gas, rather like a 
star. 
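
The first class of method in the list above rests on a direct pairwise force summation, which can be sketched as follows. This is a minimal illustration of the O(N²) force loop only, with no time integration and no special treatment of close encounters; the function name, the units (G = 1 by default), and the optional softening length eps are assumptions of the example.

```python
import math

def direct_accelerations(masses, positions, G=1.0, eps=0.0):
    """Pairwise (O(N^2)) gravitational accelerations on every body.

    masses    : list of N masses
    positions : list of N 3-tuples
    eps       : optional softening length (an assumption of this sketch)
    """
    n = len(masses)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = [positions[j][k] - positions[i][k] for k in range(3)]
            r2 = d[0] ** 2 + d[1] ** 2 + d[2] ** 2 + eps ** 2
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                acc[i][k] += G * masses[j] * d[k] * inv_r3   # pull of j on i
                acc[j][k] -= G * masses[i] * d[k] * inv_r3   # pull of i on j
    return acc

# Two unit masses a distance 2 apart attract each other with |a| = 1/4.
acc = direct_accelerations([1.0, 1.0], [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)])
print(acc)
```

Each pair is visited once and its contribution applied to both bodies, which halves the work but leaves the cost quadratic in N. Tree and grid methods exist precisely to avoid this scaling.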


Collisional Evolution 


Consider an isolated N-body system, which is 
supposed initially to be given by a spherically 
symmetric equilibrium solution of eqns [6] and [7], 
such as eqn [8]. The temperature decreases with 
increasing radius, and a gravothermal runaway 
causes the "collapse" of the core, which reaches 
extremely high density in finite time. (This collapse
takes place on the long two-body relaxation timescale,
and so it is not the rapid collapse, on a free-fall
timescale, which the name rather suggests.)

Figure 4 Gravothermal oscillations in an N-body system with
N = 65 536. The central density is plotted as a function of time in
units such that $t_{cr} = 2\sqrt{2}$. (Source: Baumgardt H, Hut P,
and Makino J, with permission.)

At sufficiently high densities, the timescale of 
three-body reactions becomes competitive. These 
create bound pairs, the excess energy being removed 
by a third body. From the point of view of the one- 
particle distribution function, f, these reactions are 
exothermic, causing an expansion and cooling of the 
high-density central regions. This temperature inver- 
sion drives the gravothermal runaway in reverse, 
and the core expands, until contact with the cool 
envelope of the system restores a normal tempera- 
ture profile. Core collapse resumes once more, and 
leads to a chaotic sequence of expansions and 
contractions, called gravothermal oscillations 
(Figure 4). 

The monotonic addition of energy during the 
collapsed phases causes a secular expansion of the 
system, and a general increase in all timescales. In 
each relaxation time, a small fraction of the masses 
escape, and eventually (it is thought) the system 
consists of a dispersing collection of mutually 
unbound single masses, binaries, and (presumably) 
stable higher-order systems. 

It is very remarkable that the long-term fate of 
the largest self-gravitating N-body system appears 
to be intimately linked with the three-body 
problem. 


Further Reading 


Arnol’d VI, Kozlov VV, and Neishtadt AI (1993) Mathematical 
Aspects of Classical and Celestial Mechanics, 2nd edn. Berlin:
Springer. 

Barrow-Green J (1997) Poincaré and the Three Body Problem. 
Providence: American Mathematical Society. 

Binney J and Tremaine S (1988) Galactic Dynamics. Princeton: 
Princeton University Press. 


Gravitational Waves 


G González and J Pullin, Louisiana State University, 
Baton Rouge, LA, USA 


© 2006 Elsevier Ltd. All rights reserved. 


In elementary physics presentations, one learns about 
electricity and magnetism, and also about gravity. 
There appear striking similarities between Newton’s 
law of gravitational attraction and Coulomb’s law of 
attraction between charges. There are also obvious 
differences, the most immediate one being that in 
gravitation all masses are positive and always attract 
each other, whereas in electromagnetism charges may 
attract or repel, depending on their signs. We also 
know today that Newton’s theory of gravity is not 
considered an entirely correct description of the 
gravitational field, particularly when fields are time 
dependent and intense. The currently accepted theory 
of gravity is Einstein’s theory of general relativity. 

The similarity between electromagnetism and 
gravitation also holds to a certain extent when the 
fields depend on time. This is usually not discussed 
in elementary treatments since a full description of 
time-dependent gravitational fields requires the use 
of general relativity. It is true, however, that if the 
fields are weak, there exist several similarities 
between gravitation and electromagnetism. In parti- 
cular, one can have waves in the gravitational field 
that are able to carry energy from a source to a 
receptor. 

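If one assumes that the metric of spacetime is
close to the flat Minkowski metric $\eta_{\mu\nu}$, that is,
$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$ with $|h_{\mu\nu}| \ll 1$ in Cartesian
coordinates, the Einstein equations of general relativity,
expanding to linear order in $h_{\mu\nu}$, become

$$0 = R_{\mu\nu} - \tfrac{1}{2}\eta_{\mu\nu}R = \tfrac{1}{2}\left(\partial_\mu\partial_\alpha h^{\alpha}{}_{\nu} + \partial_\nu\partial_\alpha h^{\alpha}{}_{\mu} - \partial_\mu\partial_\nu h - \Box h_{\mu\nu} - \eta_{\mu\nu}\partial_\alpha\partial_\beta h^{\alpha\beta} + \eta_{\mu\nu}\Box h\right) \qquad [1]$$

where $h = \eta^{\alpha\beta}h_{\alpha\beta}$ is the trace and indices are raised
with the Minkowski metric.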


These do not look like wave equations. However,
if one chooses “harmonic coordinates,” $\Box x^{\mu} = 0$,
where $\Box$ is the d’Alembertian constructed with the
full metric and then linearized, the vacuum Einstein
equations become

$$\Box h_{\mu\nu} = 0 \qquad [2]$$

where $\Box$ is the d’Alembertian computed in the flat
Minkowski metric.
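
A sketch of how the coordinate choice turns eqn [1] into eqn [2] is given below. It uses the standard linearized form of the harmonic condition, stated here as an assumption since the text quotes only the condition on the coordinates, and writes $h = \eta^{\alpha\beta}h_{\alpha\beta}$ for the trace:

$$
\begin{aligned}
\partial_\alpha h^{\alpha}{}_{\mu} = \tfrac12\,\partial_\mu h
\;&\Longrightarrow\;
\partial_\mu\partial_\alpha h^{\alpha}{}_{\nu} + \partial_\nu\partial_\alpha h^{\alpha}{}_{\mu} = \partial_\mu\partial_\nu h,
\qquad
\partial_\alpha\partial_\beta h^{\alpha\beta} = \tfrac12\,\Box h,\\
0 &= \tfrac12\left(-\Box h_{\mu\nu} + \tfrac12\,\eta_{\mu\nu}\,\Box h\right),\\
\text{trace:}\quad \Box h = 0
\;&\Longrightarrow\; \Box h_{\mu\nu} = 0 .
\end{aligned}
$$

The first line removes the mixed-derivative terms of eqn [1], the second line is what remains, and taking its trace shows that the trace term vanishes separately in vacuum.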

Just as in electromagnetism the motion of charges 
produces waves, the motion of masses produces 
waves in the gravitational field. In the above wave 
equations, one would have nonzero right-hand sides 
if masses were present. In electromagnetism, the 
conservation of charge implies that the lowest order 
of "structure" a source must have to produce 
electromagnetic waves is that of a time-dependent 
dipole. In the gravitational field, the conservation of 
momentum implies that the lowest multipolar order 
of a source of gravitational waves must be a 
quadrupole. Moreover, gravity is a weaker force 
than electromagnetism when one considers usually 
available situations. One can exert forces of the
order of fractions of a newton with electromagnetic
charges easily collected in tabletop experiments. To
produce similar amounts of gravitational force, one 
needs large quantities of mass. This last fact, 
coupled with the quadrupolar nature of the sources 
of gravitational waves, makes their production quite 
challenging in experimental terms. The luminosity of 
a gravitational wave source is given by the celebrated
Einstein quadrupole formula,

$$L = \frac{G}{5c^5}\,\dddot{I}_{ij}\,\dddot{I}^{ij} \qquad [3]$$

where G is Newton’s gravitational constant, c is the
speed of light, and $\dddot{I}_{ij}$ is the third-order time
derivative of the traceless part of the quadrupole
mass moment of the source.
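
To see how small the resulting luminosities are in everyday terms, the quadrupole formula can be evaluated for two point masses on a circular Newtonian orbit, for which it reduces to the standard closed form L = (32/5) G⁴ (m1 m2)² (m1 + m2) / (c⁵ a⁵). The sketch below is illustrative: the function name is hypothetical, the constants are rounded, and the Sun-Earth input values are assumptions chosen only to show orders of magnitude.

```python
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8        # speed of light, m/s (rounded)

def circular_binary_luminosity(m1, m2, a):
    """Gravitational-wave power (W) from two point masses on a circular orbit.

    m1, m2 : masses in kg
    a      : orbital separation in m
    Newtonian orbit, leading-order quadrupole approximation.
    """
    return (32.0 / 5.0) * G**4 * (m1 * m2) ** 2 * (m1 + m2) / (C**5 * a**5)

# Rounded, illustrative values for the Sun-Earth system.
power = circular_binary_luminosity(1.989e30, 5.972e24, 1.496e11)
print(f"{power:.0f} W")
```

The result is of the order of a couple of hundred watts for a system of planetary scale, which illustrates why only compact, massive, rapidly orbiting sources are of practical interest.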

Gravity is, however, a dominant force if one 
considers the universe at large (say, at least 
planetary) scales. There one would expect gravita- 
tional waves to play some role in the dynamics of 
the systems. In such systems, the presence of 
gravitational waves has indeed been experimentally 
confirmed. We know of a system of two pulsars in 
mutual orbit, PSR 1913+16, whose orbit has been
tracked with enough accuracy via radioastronomy 
to make the influence of gravitational waves 
observable. The motion of the pulsars makes the 
system an emitter of gravitational waves. Since the 
waves carry away energy, the orbit of the system 
decreases in radius and the orbital period
decreases. The system has now been tracked for over
20 years, and the prediction of the emitted amount 
of energy in gravitational waves due to general 
relativity has been confirmed with a very significant 
degree of accuracy. Penrose was the first to notice 
that if one considers how accurately Newton's 
theory plus the corrections due to general relativity 
predict the positions of the pulsars in their orbit, this 
is in fact the most accurately verified physical 
prediction ever. 

Technically, even the existence of gravitational
waves at a conceptual mathematical level was an
open problem for many years. Since the correct
description of the waves is through the general
theory of relativity, a “gravitational wave” should
really be viewed as a “ripple in spacetime.” Disentangling
whether such ripples are a true physical effect or a
time-dependent coordinate transformation that propagates,
to use the words of Eddington, “with the
speed of thought” took quite a bit of technical
development within the general theory of relativity.
It was only in the 1960s that a clear enough
conceptual picture was developed to determine that
gravitational waves were indeed a true physical
phenomenon akin to electromagnetic waves, and in
particular that one can unambiguously characterize
them as transporting energy, momentum, and
angular momentum from a source to an observer.

Gravitational waves are as difficult to detect as 
they are to produce. Since all masses fall in the same 
way in a gravitational field, one needs to couple to 
the gradients of the field to detect gravitational 
waves, which diminishes the efficiency. Attempting 
to produce gravitational waves via mechanical 
means in the lab (e.g., by rotating a bar of metal) 
produces too little luminosity, and in addition, the 
relatively low frequency implies that the wave zone 
is far away, which further decreases the chances for 
detection. To date, no one has succeeded in
producing a Hertz-like experiment for gravitational
waves, and the jury is still out on whether future
technologies (e.g., the use of superconductors to 
produce waves of gigahertz frequency) will ever 
allow such an experiment. 

Efforts to attempt to detect gravitational waves 
produced by astrophysical phenomena started in the 
1960s with pioneering work by Weber. The initially 
proposed technology for detection was the construc- 
tion of large (~1 ton) resonant bars. The idea was 
to use sensitive technology to measure the resonance 
of the bar as gravitational waves of astrophysical 
origin impact on it. Gravitational waves manifest 
themselves as a stretching and contraction of 
lengths. The contraction or stretching is propor- 
tional to the length of the object considered and is 
therefore characterized by a dimensionless number, 
the “strain” $\Delta L/L$, usually called “h.” Conservative
current estimates of possible astrophysical sources 
state that on Earth one should not expect strains 
larger than $10^{-21}$ for events that repeat more
frequently than a few times every year. Detectors 
with bar technology are approaching their fundamental
quantum limits at minimum detectable strains that appear
to be too large for detection to be ensured. This led to the
proposal of a new technology, the use of Michelson- 
type interferometers to detect the waves. Currently, 
several interferometric detectors are being built in 
the US, Europe, Japan, and Australia that expect to 
achieve enough sensitivity for detection within a few 
years. Unlike the bars, which are quintessentially
narrow-band detectors (most bars operate at
~900 Hz with a bandwidth of ~10 Hz), interferometric
detectors are broadband. Current detectors
have a sensitivity curve limited by various sources of 
noise that make them suitable for detection within 
the 10 Hz-1 kHz band. The broadband nature of the 
detectors opens several opportunities for the use of 
data analysis techniques that can allow the detection 
of gravitational waves that have strains even lower 
than the noise of the detectors. Moreover, several of 
the candidate events “evolve” in frequency as they
emit gravitational waves (in the case of the binary 
pulsar, for instance, the frequency “sweeps up” as the 
system loses energy), and such evolution could be 
monitored with interferometric detectors. This 
would allow several insights into the physics of the 
observed systems. 
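
Because the strain is a fractional length change, the absolute displacement that a detector must resolve follows from a single multiplication. The sketch below is purely illustrative: the strain value and the 4 km baseline are assumed inputs, not measured figures, and the function name is hypothetical.

```python
def length_change(strain, length_m):
    """Change in length (m) of a baseline of length_m under a strain h = dL / L."""
    return strain * length_m

# Illustrative inputs (assumptions): h = 1e-21 and a 4 km baseline.
dl = length_change(1e-21, 4.0e3)
print(f"{dl:.1e} m")
```

A displacement of a few times 10⁻¹⁸ m is far smaller than an atomic nucleus, which gives a sense of the measurement challenge that both bar and interferometric detectors face.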

An important limitation of any type of detector 
based on Earth is that the seismic noise increases 
quite significantly below 10 Hz. Even if seismic 
isolation allowed sensitivities below 10 Hz, gravity 
gradients due to Earth's seismic motion and due to 
clouds would limit ground-based detectors to 1 Hz 
and above. The frequency at which a system emits 
gravitational waves is inversely proportional to the 
system's mass (a simple way to see this is to realize 
that larger systems move more slowly in proportion to
their size). However, larger systems generically have
more mass and consequently emit larger
amounts of energy in gravitational waves. This 
suggests that setting up detectors in space, free of 
the constraints of seismic noise, would offer sig- 
nificant promise in detecting gravitational waves. 
Currently, there is a proposal for a space-borne 
gravitational wave detector consisting of three 
satellites in a solar orbit that trails that of Earth. 
Laser beams would be sent between the satellites, which
would be separated by 5 million kilometers, to track
their relative positions. Such a detector would be
sensitive in the frequency range $10^{-4}$ to $10^{-1}$ Hz. In such a
frequency band, one expects that compact objects 
plunging into supermassive black holes and other 
sources will be readily available. Detection of 
gravitational waves on Earth is considered marginal, 
in the sense that conservative current estimates 
cannot guarantee that there will be enough events 
to make the detection successful at significant event 
rates. Conversely, for the detectors in space, detec- 
tion should be guaranteed at high event rates. 
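
The inverse scaling of frequency with mass can be illustrated with one characteristic frequency. The sketch below uses the gravitational-wave frequency at the innermost stable circular orbit of a non-spinning black hole, f = c³ / (6^{3/2} π G M), as a rough marker of where an inspiral ends. This particular choice of characteristic frequency, the function name, and the sample masses are assumptions made for the example and are not taken from the text.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8          # speed of light, m/s (rounded)
M_SUN = 1.989e30     # solar mass, kg (rounded)

def isco_gw_frequency(total_mass_solar):
    """Gravitational-wave frequency (Hz) at the innermost stable circular
    orbit of a non-spinning black hole of the given mass; this is twice
    the orbital frequency at that radius."""
    m = total_mass_solar * M_SUN
    return C**3 / (6.0**1.5 * math.pi * G * m)

for mass in (2.8, 20.0, 1.0e6):
    freq = isco_gw_frequency(mass)
    print(f"M = {mass:9.3g} M_sun  ->  f ~ {freq:.3g} Hz")
```

Stellar-mass systems end their inspiral at hundreds to thousands of hertz, within reach of ground-based instruments, whereas a system of a million solar masses ends at a few millihertz, which is accessible only from space.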

Possible sources of gravitational waves to be 
detected by the Earth-based interferometric detec- 
tors are: 


1. Binary systems of compact objects. As the system 
orbits, it emits gravitational waves, which makes 
the orbit shrink in size and the orbiting period 
shorter with the objects eventually merging 
together. Potential systems include black hole 
binaries, neutron star binaries and mixed black 
hole/neutron star binaries. As the system sweeps 
up in orbital speed towards the merger, so does 
the frequency of the gravitational waves emitted. 
For binaries of neutron stars, which usually have 
masses slightly larger than the mass of the Sun, 
the last few minutes of the binary inspiral will be
detectable by the current generation of gravitational
wave detectors, up to a distance of several
megaparsecs for the initial detectors, increasing
to a few hundred megaparsecs for improvements
planned for the next few years. For black
hole binaries, since the masses can be larger, one 
expects larger signal-to-noise ratio for the same 
distance or to be able to detect at larger 
distances. 


2. Spinning neutron stars that develop “mountains”
or other irregularities in the surface would
produce gravitational waves of small amplitude 
but of a very regular periodic nature. This makes 
them prime candidates for data analysis techni- 
ques that could exhibit the presence of the wave 
even though it is weaker than the background 
noise of the interferometers. Integration periods 
of several months may be needed for detection, 
depending on the size of the asymmetries in the 
neutron stars. 

3. Supernovas or other violent events are obviously 
possible sources of gravitational waves. How- 
ever, the quadrupole nature of the waves requires 
the events to be asymmetric in order to produce 
gravitational waves. Current numerical models of 
supernovas are not accurate enough to predict
clearly the level of asymmetry, and hence to make
reliable predictions of how frequently and at
what intensity these types of sources could be
detected.

4. The primordial background of gravitational 
waves produced in the big bang is not expected 
to be detectable by the Earth-based detectors. 
The precise amplitude of the background is 
unknown, depending on details of cosmological 
models. The detectors are likely to be able to 
constrain some of the models that predict large 
amplitudes for the gravitational wave 
background. 


For the space-based detectors, the situation is 
more favorable, since there exist sources of gravita- 
tional waves that are guaranteed to be detected. 
Potential sources of gravitational waves are: 


1. Merger of the supermassive black holes at the 
centers of two galaxies. Given the large amounts 
of mass involved, they would be easily detected 
and very precise measurements of the system’s 
parameters and of various general relativistic 
behaviors could be possible. Such systems should 
be detectable all across the universe, although it 
is not expected that such systems form for 
redshifts larger than 30. 

2. Inspiral of compact objects into the supermassive 
black holes at the centers of galaxies (neutron
stars, white dwarfs, solar-sized black holes).
These processes will allow the usage of gravita- 
tional waves to map precisely the gravitational 
field of the supermassive object. 

3. White-dwarf binaries and low-mass X-ray bin- 
aries. There exist about a dozen such systems 
optically observable with gravitational wave 
frequencies above 0.1 mHz that the space-based 
detectors should be able to detect. There is likely 
to be a large population of other systems that are 
also detectable and are not optically visible. In 
fact, there may be so many of these sources that 
time resolution would be impossible, and they 
would form a random background. 

4. Collapse of supermassive stars. The formation 
mechanism for the supermassive black holes in 
the centers of galaxies is still uncertain. One 
possibility is that they stem from the collapse of 
supermassive stars, and in that case a potentially 
significant emission of gravitational waves could 
take place. 

5. Primordial background of gravitational waves. 
Unfortunately, the abundance of white-dwarf 
binaries as a source is expected to cloud the ability 
of the space detectors to observe primordial 
gravitational waves in an important portion of the 
spectrum of the instrument, although it appears 
possible at low frequencies, where it could compete 
with the bounds set by pulsar timing. 


The current Earth-based gravitational wave pro- 
jects include the LIGO project in the US, funded by 
the National Science Foundation and jointly oper- 
ated by Caltech and MIT and a consortium of 
institutions known as the LIGO Scientific Collaboration.
LIGO consists of two 4 km long Fabry-Perot
recycled Michelson interferometers, one in Hanford,
WA, and one in Livingston, LA. In Europe, the
GEO600 project is a 600 m dual-recycled interferometer
near Hanover in Germany, and the Virgo
project is a 3 km interferometer near Pisa in Italy,
operated by a French-Italian consortium, with an
optical configuration similar to that of LIGO. TAMA300 is
a 300 m interferometer in Japan, also with the same
configuration as LIGO. When all these detectors are
in operation, sources seen in coincidence could be
localized by triangulation. TAMA is now operating
close to design sensitivity; GEO600 and LIGO are
likely to operate at design sensitivity in 2006, with
Virgo following close behind. The space-based
interferometer project is called the LISA project and 
is planned as a joint NASA/ESA project. ESA has 
approved a launching date for 2015, but it is 
plausible that the mission could be launched at an 
earlier date. 

A direct detection of gravitational waves would be a
breakthrough in experimental science, as well as a 
confirmation of the dynamic nature of gravity in 
general relativity. Once the detection of gravitational 
waves becomes a routine matter, one can imagine a 
revolution in astronomy as one uses gravitational 
waves to “see” the universe. Since they are so hard to 
produce and interfere with, gravitational waves 
become an excellent type of “light” to look at the 
universe with. Gravitational waves will be produced by 
important concentrations of mass, correlating well 
with "interesting" astronomical processes, and are not
expected to be affected by the presence of dust or other 
interfering objects that could easily obscure electro- 
magnetic waves. In addition to this, one has several 
“standard candles” for gravitational waves (e.g., most 
neutron stars have masses that differ by a few percent 
from 1.4 solar masses). This could allow, for instance,
the Hubble constant to be determined with a high degree
of accuracy. Gravitational waves will also provide insight
into the nuclear equation of state that holds in the 
interior of compact objects like neutron stars. Contrary 
to ordinary electromagnetic radiation, which 
“decoupled” from matter only when the universe 
became cool enough after the big bang, gravitational 
waves could be used to probe the universe further into 
the past. The detection could also prove that gravita- 
tional waves travel at the speed of light, a prediction of 
general relativity and other theories. 

An interesting observation is that most astrophysical 
objects that are quite visible in the electromagnetic 
spectrum are unlikely to be visible in terms of 
gravitational waves, and vice versa. This makes the 
information we will gather from gravitational wave 
astronomy complementary to what we learn from 
optical (electromagnetic) astronomy. Moreover, it 
should be noted that wavelengths of electromagnetic 
waves are typically very small compared to the size of 
the astronomical objects they depict. This is due to the 
fact that the waves are really not produced by the 
objects themselves but by atoms on the surface of the 
objects or in regions nearby, usually very hot and in 
gaseous form. In contrast, gravitational waves are 
produced by the bulk matter of astronomical objects 
and their wavelengths are expected to be long as 
compared to the objects that produce them. They are 
more akin to a sound than to light in this respect, 
another reason to suspect that the information we will 
get from them is unlike any information obtained 
electromagnetically. 

Gravitational waves are likely to bring great 
surprises. Every time a new window has been 
opened on the universe — for instance, the use of 
radio waves — our view of the universe has been 
revolutionized. Given how differently they operate 
at a detailed level with respect to radio waves, the
surprises from gravitational waves used as tools to 
view the universe are potentially even greater. 


See also: Asymptotic Structure and Conformal 
Infinity; Computational Methods in General Relativity: 
The Theory; General Relativity: Experimental Tests; 
General Relativity: Overview. 
Introduction


Probability distributions coming from random matrix 
theory (RMT), RMT laws, occur in different contexts, 
notably in quantum physics and in number theory. 
RMT laws are also seen in certain local random growth 
models and related problems in discrete probability, 
random permutations, exclusion processes, and ran- 
dom tilings (dimer models). In these models limit laws 
for height/shape fluctuations are given by limit laws 
from RMT, in particular the largest eigenvalue or 
Tracy-Widom distributions. These models belong to 
the Kardar-Parisi-Zhang (KPZ) universality class. 
Models in this class have two universal exponents, 1/3
describing the interface fluctuations and 2/3 describing
the correlations in the transversal direction. By a
local random growth model, we mean a model where 
the random growth mechanism is local in that it does 
not depend on the global geometry as in diffusion 
limited aggregation (DLA). Typically there is also some 
smoothing mechanism. The connection with RMT can 
only be established for special exactly solvable models. 
Below we discuss a basic model based on a last-passage 
percolation problem, which translates into a poly- 
nuclear growth (PNG) process. Other models that can 
be treated are in a sense variations of this model. Point 
processes with determinantal correlation functions 
play a central role in RMT and in the analysis of the 
basic model and we start by discussing these. The basic 
model has several different interpretations that will be
outlined. Another basic tool, which can be formulated 
in different ways, is the Robinson-Schensted-Knuth 
(RSK) correspondence well known in combinatorics. 
One approach is in terms of nonintersecting paths 
which translates into a multilayer PNG process. Limit 
theorems can be formulated for the height above a 
fixed location, and also for the whole height function in 
terms of the Airy process which extends the Hermitian 
Tracy-Widom distribution F2. It is expected that 
several results should generalize to a broader class of 
models. There is a natural universality problem of 
extending the validity of the RMT laws. 


Determinantal Processes 


Point processes with determinantal correlation func- 
tions play an important role in the exactly solvable 
models. We consider probability measures on
$A^n$, $A \subset \mathbb{R}$, of the form

$$\frac{1}{Z_n} \det(\phi_i(x_j))_{i,j=1}^n \, \det(\psi_i(x_j))_{i,j=1}^n \, d^n\mu(x) \qquad [1]$$

which can be thought of as describing random points
in $A$ at positions $x_1, \dots, x_n$. Here, $\mu$ is a reference
measure on $A$, for example, Lebesgue or counting
measure, $Z_n$ a normalization constant, and $\phi_i$, $\psi_i$ given
functions. A measure of this form has determinantal
correlation functions in the sense that the density, with
respect to $d^m\mu(y)$, of particles at $y_1, \dots, y_m$ is

$$\rho_m(y_1, \dots, y_m) = \det(K_n(y_i, y_j))_{i,j=1}^m \qquad [2]$$

There is an explicit formula for the correlation
kernel $K_n$ in terms of the functions $\phi_i$, $\psi_i$.


The eigenvalue measures in the basic random
matrix ensembles have the form

$$\frac{1}{Z_n} |\Delta_n(x)|^{\beta} \prod_{j=1}^n w(x_j) \, d^n\mu(x) \qquad [3]$$

where $\Delta_n(x) = \det(x_i^{j-1})_{i,j=1}^n$ is Vandermonde's
determinant, $x \in A^n$ and $x_1, \dots, x_n$ are the eigenvalues.
For the Gaussian unitary ensemble (GUE$_n$),
$Z_n^{-1} \exp(-\mathrm{tr}\, M^2)\, dM$ of $n \times n$ Hermitian matrices $M$,
we have $\beta = 2$, $A = \mathbb{R}$, $w(x) = \exp(-x^2)$ and $\mu$ the
Lebesgue measure. For the Laguerre unitary ensemble
(LUE$_{n,\nu}$) of complex covariance matrices, $M^*M$,
where $M$ is an $(n+\nu) \times n$ matrix with standard
complex Gaussian elements, we have $w(x) = x^{\nu} e^{-x}$,
$\nu \ge 0$, $\beta = 2$, $A = [0, \infty)$ and $\mu$ the Lebesgue
measure. The $\beta = 2$ case of [3] can be put into the
form [1] and hence has determinantal correlation
functions. In this case the correlation kernel can be
expressed in terms of the normalized orthogonal
polynomials $p_k(x)$ with respect to $w(x)\, d\mu(x)$ on $A$.
Because of this, when $\beta = 2$ the ensemble [3] is
referred to as an orthogonal polynomial ensemble
(OPE). The kernel is given by

$$K_n(x, y) = \sum_{k=0}^{n-1} p_k(x) p_k(y) \, (w(x) w(y))^{1/2} \qquad [4]$$
A consequence of [2] is that the probability of
finding no particle in a set $J \subset A$ is given by a
Fredholm determinant,

$$P[\text{no particle in } J] = \det(I - K_n)_{L^2(J, \mu)} \qquad [5]$$

In particular the distribution function $F(\xi)$ of the largest
eigenvalue or rightmost particle $x_{\max} = \max_{1 \le j \le n} x_j$
in an OPE is given by [5] with $K_n$ as in [4] and
$J = (\xi, \infty)$.


A Basic Model 


Let $(w(i,j))_{(i,j) \in \mathbb{Z}_+^2}$ be independent geometric random
variables with parameter $a_i b_j$,

$$P[w(i,j) = k] = (1 - a_i b_j)(a_i b_j)^k \qquad [6]$$

$k \ge 0$ and $0 \le a_i b_j < 1$. As a limiting case we can
obtain exponential random variables. Consider the
last-passage time

$$G(M, N) = \max_{\pi} \sum_{(i,j) \in \pi} w(i,j) \qquad [7]$$

where the maximum is over all up/right paths $\pi$
from $(1,1)$ to $(M,N)$, that is, $\pi = \{(i_1, j_1), \dots, (i_m, j_m)\}$
with $(i_{k+1}, j_{k+1}) - (i_k, j_k) = (1,0)$ or $(0,1)$, $(i_1, j_1) = (1,1)$
and $(i_m, j_m) = (M,N)$, $m = M + N - 1$. We can
also think of this as a zero-temperature directed
polymer, by thinking of the $w(i,j)$ as (minus)
energies and $\pi$ as random walk paths.

As will be explained in some more detail below,
if the $w(i,j)$ are exponential with mean 1 and
$M \ge N$, then $G(M,N) = \lambda_{\max}$ in distribution,
where $\lambda_{\max}$ is the largest eigenvalue in LUE$_{N, M-N}$.
Hence, in this case $G(M,N)$ behaves exactly like a
largest eigenvalue. If the $w(i,j)$ are geometric with
parameter $q$, then $G(M,N)$ has the same distribution
as the rightmost particle in an OPE, namely [3],
with $\beta = 2$, $w(x) = \binom{M-N+x}{x} q^x$ and $\mu$ the counting
measure on $A = \mathbb{N}$, called the Meixner ensemble.
Since in this case the relevant orthogonal polynomials
are discrete, the ensemble is referred to as a discrete
OPE.

The random variables $\{G(M,N)\}_{(M,N) \in \mathbb{Z}_+^2}$ have two
interpretations related to random growth. It follows
from [7] that

$$G(M,N) = \max(G(M-1,N), G(M,N-1)) + w(M,N) \qquad [8]$$


This can be thought of as a growth rule. We change
variables by letting $G(M,N) = h(M-N, M+N-1)$
and $w(M,N) = w(M-N, M+N-1)$ with $w(M,N) = 0$
if $(M,N) \notin \mathbb{Z}_+^2$. Then

$$h(x, t+1) = \max(h(x-1,t), h(x,t), h(x+1,t)) + w(x, t+1) \qquad [9]$$

$x \in \mathbb{Z}$, $t \in \mathbb{N}$, $h(x,0) = 0$, and $w(x,t) = 0$ if $|x| \ge t$ or if
$x - t$ is even. We can extend it to the whole real line
by letting $h(x,t) = h(\lfloor x \rfloor, t)$. The growth rule [9] is a
discrete polynuclear growth (PNG) model. Up-steps
in the interface, $x \mapsto h(x,t)$ (see the top curve in
Figure 1), move at unit speed to the left and down-steps
move at unit speed to the right and they merge
at collision. On top of this smoothing mechanism,
we have random deposition given by $w(x,t)$. Looking
at the definition of $w$, we see that all deposition
up to time $t$ is on top of a basic layer $(-t, t)$. The
asymptotic shape will look like a droplet and this
setting of PNG is called the droplet geometry. We
see that height fluctuations are directly related to
fluctuations of $G(M,N)$.
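
The recursion [8] is a dynamic program over the quadrant, so $G(M,N)$ can be tabulated in $O(MN)$ steps. The following Python sketch is an illustration only: it assumes that all parameters satisfy $a_i b_j = q$, and the helper names `geometric` and `last_passage` are hypothetical and do not come from the text.

```python
import random


def geometric(q, rng):
    """Sample k >= 0 with P[k] = (1 - q) * q**k, i.e. [6] with a_i * b_j = q."""
    k = 0
    while rng.random() < q:
        k += 1
    return k


def last_passage(w):
    """Table of last-passage times for the weight matrix w.

    w[i][j] is the weight at site (i + 1, j + 1). The table is filled with the
    recursion [8], using the convention G = 0 outside the quadrant.
    """
    rows, cols = len(w), len(w[0])
    G = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            below = G[i - 1][j] if i > 0 else 0
            left = G[i][j - 1] if j > 0 else 0
            G[i][j] = max(below, left) + w[i][j]
    return G


rng = random.Random(0)  # fixed seed so that the run is reproducible
weights = [[geometric(0.5, rng) for _ in range(6)] for _ in range(8)]
table = last_passage(weights)
print(table[-1][-1])  # one sample of G(8, 6) for q = 1/2
```

Each entry of `table` is one value $G(M,N)$, so a single pass produces the whole growth history used in the PNG and corner growth pictures.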

We get another growth model, the corner growth 
model, somewhat similar to the classical Eden 
growth model, by considering the random shape 
(see Figure 2), 


$$\Omega(n) = \{(M,N) \in \mathbb{Z}_+^2 \,;\, G(M,N) + M + N - 1 \le n\} + [-1, 0]^2 \qquad [10]$$

Figure 1 Multilayer PNG model.

Figure 2 Corner growth model at time n=5. The crosses are
the possible growth sites.


The complement of this set in Z^ has a boundary 
B(n) which we can think of as an interface. By [8] 
and the lack of memory property of the geometric/ 
exponential distribution, the region Q(n) grows by 
adding new squares independently at each corner of 
B(n) with geometric/exponential waiting times (see 
Figure 2). If we look at $B(n)$ in a coordinate system
with $M = N$ as vertical axis and write a 1 for every
unit down-step on $B(n)$ and a 0 for every unit up-step
(see Figure 2), the corner growth dynamics
translates into the totally asymmetric simple exclusion
process (TASEP), in discrete or continuous
time, with initial configuration ...1111000... . As
shown by Jockusch, Propp, and Shor, $\Omega(n)$ also
occurs in a uniform random domino tiling of a
region called the Aztec diamond (see Figure 3). The
shape $\Omega(n)$, when $q = 1/2$, has the same law as the
completely regular (frozen) North Polar Region
(NPR) in the tiling and hence the boundary
fluctuations of the NPR are related to the fluctuations
of $G(M,N)$. The NPR in Figure 3 has the same
shape as $\Omega(5)$ in Figure 2. This connects the models
considered here with dimer or tiling problems in
two-dimensional equilibrium statistical mechanics.

Figure 3 Domino tiling of an Aztec diamond of size n=5.
Dominos marked by dots form the NPR.

Consider a (Poissonized) random permutation $\sigma$
from $S_N$, where $N$ is a Poisson($\alpha$) random variable.
Let $L(\sigma)$ denote the length of the longest increasing
subsequence in $\sigma$, for example, $\sigma = 316452$ has
$L = 3$. By thinking of the representation of a
permutation by its permutation matrix, we see that
$G(N,N)$ with $w(i,j)$ geometric with parameter
$q = \alpha/N^2$ converges to $L(\sigma)$ in distribution as
$N \to \infty$. We call this limit the Poisson limit. Taking
this limit in the PNG process yields the Prähofer-Spohn
continuous time PNG (cont-PNG) model,
which is similar to the discrete PNG defined above
but where all steps have unit size and we have
continuous time dynamics with deposition events
according to a two-dimensional spacetime Poisson
process. The study of $L(\sigma)$, and its de-Poissonization
when $N$ is nonrandom, is known as Ulam's problem
in combinatorial probability.
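
The length of a longest increasing subsequence can be computed in $O(n \log n)$ time by patience sorting. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, and the Poisson variable is sampled with Knuth's multiplication method, which is adequate only for moderate values of the Poisson parameter (here called `alpha`).

```python
import math
import random
from bisect import bisect_left


def lis_length(seq):
    """Length of a longest strictly increasing subsequence (patience sorting)."""
    tails = []  # tails[k]: smallest possible last element of an increasing run of length k + 1
    for x in seq:
        pos = bisect_left(tails, x)
        if pos == len(tails):
            tails.append(x)
        else:
            tails[pos] = x
    return len(tails)


def poisson(alpha, rng):
    """Sample a Poisson(alpha) variable by Knuth's multiplication method."""
    threshold = math.exp(-alpha)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def poissonized_lis(alpha, rng):
    """Sample L for a uniform permutation of a Poisson(alpha) number of symbols."""
    n = poisson(alpha, rng)
    sigma = list(range(1, n + 1))
    rng.shuffle(sigma)
    return lis_length(sigma)


print(lis_length([3, 1, 6, 4, 5, 2]))  # the example 316452 from the text; prints 3

rng = random.Random(1)
samples = [poissonized_lis(100.0, rng) for _ in range(200)]
# Empirical mean next to the leading-order value 2 * sqrt(alpha)
print(sum(samples) / len(samples), 2 * math.sqrt(100.0))
```

For moderate `alpha` the empirical mean sits somewhat below $2\sqrt{\alpha}$, which is the expected sign of the fluctuation correction.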


The RSK Correspondence 


The mapping of the last-passage problem [7] into a
determinantal process is based on the RSK correspondence.
This correspondence maps the integer
matrix $(w(i,j))_{1 \le i,j \le M}$ bijectively to a pair of
semi-standard Young tableaux $(P, Q)$ with common
shape $\lambda$, which is a partition $\lambda = (\lambda_1, \lambda_2, \dots)$ of
$\sum_{1 \le i,j \le M} w(i,j)$. This map has the property that
$G(M,N) = \lambda_1$, the length of the first row in the
Young diagram. From the combinatorial definition
of the Schur polynomials $s_\lambda$ it follows that the
measure [6] on the integer matrix is mapped to a
probability measure on partitions, the Schur measure,
given by

$$P_{\mathrm{Schur}}[\lambda] = \frac{1}{Z} \, s_\lambda(a_1, \dots, a_M) \, s_\lambda(b_1, \dots, b_M) \qquad [11]$$
This measure has determinantal correlation functions
if we think of $x_i = \lambda_i - i$ as the positions of particles
in $\mathbb{Z}$. If we use $x_i = \lambda_i + N - i$ as variables and
specialize to $a_1 = \dots = a_M = b_1 = \dots = b_N = \sqrt{q}$
and $b_j = 0$ for $j > N$ we get the Meixner ensemble.
The case of exponential random variables, for
example the relation to LUE discussed above, is
obtained from the Meixner ensemble by taking an
appropriate limit. In the Poisson limit we get the
Poissonized Plancherel measure,

$$P^{\alpha}_{\mathrm{Plan}}[\lambda] = \sum_{N=0}^{\infty} \frac{e^{-\alpha} \alpha^N}{N!} \, P_{\mathrm{Plan},N}[\lambda] \qquad [12]$$

where $P_{\mathrm{Plan},N}[\lambda] = (\dim \lambda)^2 / N!$ if $\lambda$ is a partition of
$N$ and 0 otherwise. Here $\dim \lambda$ is the dimension
of the irreducible representation of $S_N$ labeled by $\lambda$.
In the work of Borodin and Olshanski in representation
theory various measures on partitions with
determinantal correlation functions occur naturally.
Also Okounkov and co-workers have used the
Plancherel and Schur measures in Gromov-Witten
theory. The Plancherel measure represented as the
point process $(x_i)_{i \ge 1}$ in $\mathbb{Z}$ with $x_i = \lambda_i - i$ has
the correlation kernel, called the discrete Bessel kernel,

$$B^{\alpha}(x, y) = \frac{\sqrt{\alpha}}{x - y} \left( J_x(2\sqrt{\alpha}) J_{y+1}(2\sqrt{\alpha}) - J_{x+1}(2\sqrt{\alpha}) J_y(2\sqrt{\alpha}) \right) \qquad [13]$$

where $J_n$ is the ordinary Bessel function. The
random variable $L(\sigma)$ has the same distribution as
$\max x_i + 1$. Hence, by [5],

$$P[L(\sigma) \le n] = \det(I - B^{\alpha})_{\ell^2(\{n, n+1, \dots\})} \qquad [14]$$

The random variable $L(\sigma)$ also gives the height
above the origin in cont-PNG.
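
For small $N$ the Plancherel measure on partitions of $N$, with weights $(\dim \lambda)^2/N!$, can be tabulated exactly. The Python sketch below is an illustration only; it computes $\dim \lambda$ with the hook length formula, a standard fact that is not stated in the text, and the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def partitions(n, largest=None):
    """Yield all partitions of n as non-increasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def dim(shape):
    """Dimension of the irreducible representation labeled by shape (hook length formula)."""
    n = sum(shape)
    # Column lengths of the Young diagram (the conjugate partition)
    columns = [sum(1 for r in shape if r > c) for c in range(shape[0])] if shape else []
    hook_product = 1
    for i, row in enumerate(shape):
        for j in range(row):
            arm = row - j - 1
            leg = columns[j] - i - 1
            hook_product *= arm + leg + 1
    return factorial(n) // hook_product


def plancherel(shape):
    """Plancherel weight (dim shape)^2 / N! with N = |shape|."""
    return Fraction(dim(shape) ** 2, factorial(sum(shape)))


N = 5
print(sum(plancherel(lam) for lam in partitions(N)))  # the weights sum to 1

# Law of the first row, which is the law of L for a uniform permutation in S_N
first_row_law = {}
for lam in partitions(N):
    first_row_law[lam[0]] = first_row_law.get(lam[0], 0) + plancherel(lam)
print(first_row_law)
```

Mixing these laws over $N$ with Poisson weights gives the Poissonized measure described above.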

There is a geometric interpretation of RSK going
back to Viennot. The pair $(P, Q)$ is represented as a
family of nonintersecting paths in a directed graph.
These paths can be obtained by running a multilayer
version of the PNG process where the sizes of
collisions are deposited as growth in lower layers,
which evolve according to the same PNG dynamics.
The information lost in the collisions is recorded in
the lower layers. This can be done also for
$(w(i,j))_{1 \le i,j \le M}$ and leads to a multilayer version of
[9], $(h_{-j}(x,t))_{j \ge 0}$, where $h_{-j}(x,0) = -j$, $h_{-j}(\pm t, t) = -j$,
and $h(x,t) = h_0(x,t)$ is the top path (see Figure 1).
The Karlin-McGregor theorem or the Gessel-Viennot
method says that the weight (probability)
of a family of nonintersecting paths with fixed initial
and final positions on a weighted directed acyclic
graph is given by a determinant. It follows that the
probability of a certain configuration $\{h_{-j}(0,t)\}_{j \ge 0}$ is
given by a product of two determinants and hence
has the form [1]. In Figure 1, the weights of the
horizontal line segments will be 1, whereas each unit
vertical step has weight $a_i$ or $b_j$ as indicated in the
figure. This leads to [11] using the Jacobi-Trudi
formula for the Schur polynomial.
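
The property that the first row of the RSK shape equals the last-passage time can be checked numerically. The sketch below is an illustration, not the construction used in the text: it assumes the usual row-insertion form of RSK, reads the integer matrix as a word in which the column index $j$ is repeated $w(i,j)$ times with sites taken in lexicographic order, and uses hypothetical function names. It works for rectangular matrices as well.

```python
import random
from bisect import bisect_right


def rsk_shape(w):
    """Shape of the tableaux produced by RSK row insertion from the integer matrix w."""
    rows = []
    for line in w:
        for j, count in enumerate(line):
            for _ in range(count):
                x = j
                for row in rows:
                    pos = bisect_right(row, x)  # first entry strictly larger than x
                    if pos == len(row):
                        row.append(x)
                        break
                    row[pos], x = x, row[pos]  # bump that entry into the next row
                else:
                    rows.append([x])  # bumped out of every row: start a new one
    return [len(row) for row in rows]


def last_passage_time(w):
    """G(M, N) from the recursion [8], with G = 0 outside the quadrant."""
    M, N = len(w), len(w[0])
    G = [[0] * (N + 1) for _ in range(M + 1)]
    for i in range(1, M + 1):
        for j in range(1, N + 1):
            G[i][j] = max(G[i - 1][j], G[i][j - 1]) + w[i - 1][j - 1]
    return G[M][N]


rng = random.Random(2)
w = [[rng.randrange(3) for _ in range(5)] for _ in range(4)]
shape = rsk_shape(w)
print(shape, last_passage_time(w))  # the first entry of shape equals G(4, 5)
```

The remaining rows of the shape carry the information that the multilayer PNG picture records in its lower layers.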


Limit Theorems 


The existence of a limit shape in a model like [6] 
with w(i,j) independent random variables, and in 
related problems follows by a subadditivity argu- 
ment, although explicit shapes are only known in a 
few cases. The formalism described above makes it 
possible to get more detailed results about the 
fluctuations around the limit shape, like a central- 
limit theorem, but with a non-normal limit law. We 
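know that $G(M,N)$ has the same distribution as
$x_{\max} - N + 1$, where $x_{\max}$ is the rightmost particle in
the Meixner ensemble. This, together with [4], [5],
and an asymptotic analysis of the Meixner polynomials,
gives

$$P[G(M,N) \le \omega(\gamma, q) N + \xi \sigma(\gamma, q) N^{1/3}] \to F_2(\xi) \qquad [15]$$

as $N \to \infty$, $M \to \infty$, $M/N \to \gamma \ge 1$, where

$$\omega(\gamma, q) = \frac{(1 + \sqrt{q\gamma})^2}{1 - q} - 1 \qquad [16]$$

and

$$\sigma(\gamma, q) = \frac{q^{1/6} \gamma^{-1/6}}{1 - q} \, (\sqrt{\gamma} + \sqrt{q})^{2/3} (1 + \sqrt{q\gamma})^{2/3} \qquad [17]$$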
The limiting distribution function $F_2$ is the Tracy-Widom
distribution given by

$$F_2(\xi) = \det(I - A)_{L^2(\xi, \infty)} \qquad [18]$$

where

$$A(x, y) = \frac{\mathrm{Ai}(x) \mathrm{Ai}'(y) - \mathrm{Ai}'(x) \mathrm{Ai}(y)}{x - y} \qquad [19]$$

is the Airy kernel. It is also the limiting largest
eigenvalue distribution for GUE$_n$,

$$P_{\mathrm{GUE}_n}\left[ \frac{\sqrt{2n}\, \lambda_{\max} - 2n}{n^{1/3}} \le \xi \right] \to F_2(\xi) \qquad [20]$$

as $n \to \infty$. The function $F_2$ can also be expressed in terms of a
Painlevé II function. The limit theorem [15]
translates into a fluctuation result for the height
function in the corner growth and the PNG models,
saying that the height fluctuations above a fixed
location at time $t$ are of order $t^{1/3}$ and given by the
$F_2$-distribution. Here we see the KPZ exponent 1/3.
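
To get a feeling for the scales in the limit theorem, the centering and scaling constants can be evaluated numerically. The Python sketch below assumes that the constants take the form $\omega(\gamma,q) = (1+\sqrt{q\gamma})^2/(1-q) - 1$ and $\sigma(\gamma,q) = q^{1/6}\gamma^{-1/6}(1-q)^{-1}(\sqrt{\gamma}+\sqrt{q})^{2/3}(1+\sqrt{q\gamma})^{2/3}$; the function names are hypothetical.

```python
from math import sqrt


def omega(gamma, q):
    """Assumed centering constant: G(M, N) is approximately omega * N to leading order."""
    return (1 + sqrt(q * gamma)) ** 2 / (1 - q) - 1


def sigma(gamma, q):
    """Assumed scale constant: fluctuations of G(M, N) are of size sigma * N**(1/3)."""
    return (
        q ** (1 / 6) * gamma ** (-1 / 6) / (1 - q)
        * (sqrt(gamma) + sqrt(q)) ** (2 / 3)
        * (1 + sqrt(q * gamma)) ** (2 / 3)
    )


q, N = 0.5, 1000
print(omega(1.0, q) * N)           # leading-order size of G(N, N)
print(sigma(1.0, q) * N ** (1 / 3))  # typical size of the fluctuations around it

# On the diagonal (gamma = 1) the constants simplify:
print(omega(1.0, q), 2 * sqrt(q) / (1 - sqrt(q)))
print(sigma(1.0, q), q ** (1 / 6) * (1 + sqrt(q)) ** (4 / 3) / (1 - q))
```

With $N = 1000$ the fluctuation scale is a few percent of the mean, which is why the exponent 1/3 is visible only in fairly large simulations.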

For the length $L(\sigma)$ of a longest
increasing subsequence in a random permutation or
the height above the origin in the cont-PNG, [14]
and asymptotics for Bessel functions yield

$$P[(L(\sigma) - 2\sqrt{\alpha}) / \alpha^{1/6} \le \xi] \to F_2(\xi) \qquad [21]$$

as $\alpha \to \infty$. This result was first proved by Baik,
Deift, and Johansson using a Toeplitz determinant 
formula (Gessel's formula) for the left-hand side of 
[14] and the Deift-Zhou nonlinear steepest descent 
method for oscillatory Riemann-Hilbert problems. 
The above limit theorems can be extended to limit 
theorems for the whole point process rescaled 
around the rightmost point. This results in a 
limiting determinantal point process given by the 
Airy kernel [19]. 


The Airy Process 


From the point of view of the growth processes, for
example, the PNG process [7], it is natural to consider
a scaling limit of the whole height function $x \mapsto h(x,t)$
as $t \to \infty$. Looking at the height configuration in the
multilayer growth process $x \mapsto \{h_{-j}(x,t)\}_{j \ge 0}$ at different
locations $x_1 < \dots < x_r < x_{r+1} < \dots < x_{M-1}$ leads, via the
Karlin-McGregor or Gessel-Viennot method, to
probability measures of the form

$$\frac{1}{Z} \prod_{r=0}^{M-1} \det\left( \phi_{r,r+1}(y_i^r, y_j^{r+1}) \right)_{i,j=1}^n \qquad [22]$$

with $y^0$ and $y^M$ fixed configurations. Here, in the
discrete PNG model, $\phi_{r,r+1}(x,y)$ is the transition
probability (weight) to go from height $x$ to height $y$
between positions $x_r$ and $x_{r+1}$. This measure
generalizes [1] and it also has determinantal correlation
functions. Measures of this form also arise in
multimatrix models and in Dyson's Brownian
motion model, $t \mapsto M(t)$, $t \in \mathbb{R}$, for Hermitian
matrices, which is a Gaussian multimatrix model.
The elements of the time-dependent Hermitian
matrix $M(t)$ evolve according to independent
Ornstein-Uhlenbeck processes and we have the
transition kernel

$$\frac{1}{Z} \exp\left[ -\mathrm{tr}\,(M(t) - qM(0))^2 / (1 - q^2) \right] \qquad [23]$$


where $q = \exp(-t)$. This process has GUE as its
stationary distribution. The Harish-Chandra/Itzykson-Zuber
integral can be used to show that the joint
eigenvalue measure for $M(t_1), \dots, M(t_{M-1})$ has the
form [22] and hence has determinantal correlation
functions. The correlation kernel is the extended
Hermite kernel,

$$K_n(\tau, x; \sigma, y) = \begin{cases} \displaystyle \sum_{k=1}^{\infty} e^{k(\tau - \sigma)} p_{n-k}(x) p_{n-k}(y) \, e^{-(x^2 + y^2)/2} & \text{if } \tau \ge \sigma \\ \displaystyle -\sum_{k=-\infty}^{0} e^{k(\tau - \sigma)} p_{n-k}(x) p_{n-k}(y) \, e^{-(x^2 + y^2)/2} & \text{if } \tau < \sigma \end{cases} \qquad [24]$$

with $p_k$ the normalized Hermite polynomials, $p_k = 0$
if $k < 0$. Notice that this reduces to the Hermite
kernel [4] when $\tau = \sigma$. This machinery can be used
to show that the largest eigenvalue process
$t \mapsto \lambda_{\max}(t)$ induced by $M(t)$ converges in the sense
of finite-dimensional distributions to a limiting
process, the Airy process,

$$\left( \sqrt{2n}\, \lambda_{\max}(n^{-1/3} t) - 2n \right) / n^{1/3} \to A(t) \qquad [25]$$


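as $n \to \infty$. The Airy process $A(t)$, which is a
stationary process, can be viewed as the top curve
of a multilayer process $t \mapsto (A_{-j}(t))_{j \ge 0}$, $A(t) = A_0(t)$,
such that the point process $\{A_{-j}(t_r)\}_{1 \le r \le m,\, j \ge 0}$ has
determinantal correlation functions with correlation
kernel

$$A(\tau, \xi; \tau', \xi') = \begin{cases} \displaystyle \int_0^{\infty} e^{-\lambda(\tau - \tau')} \mathrm{Ai}(\xi + \lambda) \mathrm{Ai}(\xi' + \lambda) \, d\lambda & \text{if } \tau \ge \tau' \\ \displaystyle -\int_{-\infty}^{0} e^{-\lambda(\tau - \tau')} \mathrm{Ai}(\xi + \lambda) \mathrm{Ai}(\xi' + \lambda) \, d\lambda & \text{if } \tau < \tau' \end{cases} \qquad [26]$$

the extended Airy kernel, which reduces to the
ordinary Airy kernel [19] when $\tau = \tau'$. The Airy
process can be viewed as an extension of the Tracy-Widom
distribution $F_2$. For the PNG model above,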
the multilayer process is described by an extended
kernel, which in the cont-PNG is an extended
version of the discrete Bessel kernel [13]. In a
suitable scaling limit, this extended kernel converges
to the extended Airy kernel. For the PNG process
[7], this leads to the limit law

$$(d N^{1/3})^{-1} \left( h\left( 2 \frac{1 + \sqrt{q}}{1 - \sqrt{q}} \, d^{-1} N^{2/3} \tau, \, 2N - 1 \right) - \frac{2\sqrt{q}}{1 - \sqrt{q}} N \right) \to A(\tau) - \tau^2 \qquad [27]$$

as $N \to \infty$, where $d = (1 - q)^{-1} (\sqrt{q})^{1/3} (1 + \sqrt{q})^{4/3}$.
Notice the exponent 2/3 which is the second KPZ
exponent. This exponent can also be seen in the
transversal fluctuations of the maximal paths in [6]
for $G(N,N)$. These are superdiffusive: they have
fluctuations of order $N^{2/3}$ around the diagonal,
compared to $N^{1/2}$ for random walk paths between
the same points. A fluctuation result like [27] can 
also be proved for the corner growth model and 
hence also for the Aztec diamond. The boundary of 
the NPR suitably rescaled converges to the Airy 
process. 


Variations 


Above we discussed one possible geometry, the 
droplet, for the PNG process. If we start with 
h(x,0) = 0 and allow random depositions along the 
whole line, we get an interface that is macroscopi- 
cally flat, and not curved as in the droplet case. In 
this case, the height fluctuations above a fixed
location at time $t$ are again of size $t^{1/3}$ and described
by the Gaussian Orthogonal Ensemble largest
eigenvalue distribution. This law comes from the
scaling limit of the rightmost particle in [3] with
$\beta = 1$, $w(x) = \exp(-x^2)$, $A = \mathbb{R}$, and $\mu$ the Lebesgue
measure. In this case, the correlation functions are
not determinants but rather pfaffians. The result for
flat PNG follows from the Baik-Rains analysis of 
symmetrized last-passage or permutation problems. 
In the PNG model we can also consider an interface 
in equilibrium. This can be put into the last-passage 
percolation picture by suitable boundary conditions, 
different parameters for w(i,j) when i or j equals 1 
or extra Poisson points on the axes in the Poisson 
limit. Results by Baik and Rains show that in the 
cont-PNG in equilibrium the height fluctuations are 
given by a relative of the Tracy-Widom distribution,
$F_0$. In these last two cases, the scaling limit of the
whole height profile is not known. 

The types of results discussed above can only be 
obtained for very special models. However, it is 
expected that many of the results (in particular the 
KPZ exponents 1/3 and 2/3, and also the fluctuation 
laws, including the Airy process) should generalize 
to many other models. The different interpretations
of [6] mentioned above suggest different general- 
izations, various local growth models, directed 
polymers, asymmetric exclusion processes, and 
dimer/tiling problems. RMT laws are natural limit 
laws for which the domain of attraction is not 
understood. 


See also: Combinatorics: Overview; Determinantal 
Random Fields; Dimer Problems; Integrable Systems in 
Random Matrix Theory; Random Walks in Random 
Environments; Random Matrix Theory in Physics; 
Random Partitions. 
Introduction


The ideal fluid description is one in which viscosity or
other phenomenological terms are neglected. Thus, as is
the case for systems governed by Newton's second law
without dissipation, such fluid descriptions possess
Lagrangian and Hamiltonian descriptions. In fact, in
the eighteenth century, Lagrange himself discussed
what is in essence the action principle for the 
incompressible fluid. The subsequent history of action 
functional and Hamiltonian formulations of the ideal 
fluid is long and convoluted with contributions from 
Clebsch in the nineteenth century, and the likes of L 
Landau and V Arnol’d in the mid-twentieth century. 
In the early 1980s, there was a flurry of activity on 
the noncanonical Poisson bracket formulation, and 
this formulation is the focus of the present treatment, 
which is motivated by the work of the author, D 
Holm, J Marsden, T Ratiu, A Weinstein, and others. 


Noncanonical Hamiltonian Structure 


The traditional arena for Hamiltonian dynamics is the
cotangent bundle $M := T^*Q$, the phase space, which is
naturally a symplectic manifold with a closed
nondegenerate 2-form. In coordinates, the 2-form is given
by $\omega_c = dq \wedge dp$, where $q$ denotes the configuration
coordinate for the base space manifold $Q$ and $p$
denotes the corresponding canonical momenta that
arise from Legendre (convex) transformation. The
2-form $\omega_c$ provides a natural identification at a point
$z = (q, p) \in M$ of $T_z M$ with $T_z^* M$, and because of
nondegeneracy its inverse, the cosymplectic form,
provides the map $J_c : T_z^* M \to T_z M$. Thus, for a
Hamiltonian $H : M \to \mathbb{R}$ we have the Hamiltonian
system of ordinary differential equations $\dot{z} = J_c \, dH$,
which in canonical coordinates has the familiar form

$$\dot{q}^i = \partial H / \partial p_i, \qquad \dot{p}_i = -\partial H / \partial q^i \qquad [1]$$

with $i = 1, 2, \dots, N$, where $N$ is the number of
degrees of freedom.

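Hamilton's equations can also be written in terms
of the Poisson bracket $[f, g] := \omega_c(J_c \, df, J_c \, dg)$, where
$f, g : M \to \mathbb{R}$ are smooth phase-space functions. In
terms of $z = (q, p)$, Hamilton's equations are

$$\dot{z}^{\alpha} = J_c^{\alpha\beta} \frac{\partial H}{\partial z^{\beta}} = [z^{\alpha}, H] \qquad [2]$$

where the Poisson bracket is

$$[f, g] = \frac{\partial f}{\partial z^{\alpha}} J_c^{\alpha\beta} \frac{\partial g}{\partial z^{\beta}} \qquad [3]$$

with

$$(J_c^{\alpha\beta}) = \begin{pmatrix} 0_N & I_N \\ -I_N & 0_N \end{pmatrix} \qquad [4]$$

Note, repeated indices are to be summed with
$\alpha, \beta = 1, 2, \dots, 2N$. In [4], $0_N$ is an $N \times N$ matrix
of zeros and $I_N$ is the $N \times N$ unit matrix.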
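
The canonical bracket can be evaluated numerically for a concrete Hamiltonian. The Python sketch below is an illustration only: it uses central finite differences for the gradients, takes the harmonic oscillator $H = (p^2 + q^2)/2$ as an assumed example, and all function names are hypothetical.

```python
def canonical_matrix(n):
    """The 2n x 2n canonical cosymplectic matrix, with blocks 0, I, -I, 0."""
    J = [[0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        J[i][n + i] = 1
        J[n + i][i] = -1
    return J


def gradient(f, z, step=1e-5):
    """Central-difference approximation of the gradient of f at the point z."""
    grad = []
    for a in range(len(z)):
        up = list(z)
        down = list(z)
        up[a] += step
        down[a] -= step
        grad.append((f(up) - f(down)) / (2 * step))
    return grad


def bracket(f, g, z, J):
    """Poisson bracket [f, g] = (df/dz^a) J^{ab} (dg/dz^b) evaluated at z."""
    df, dg = gradient(f, z), gradient(g, z)
    n = len(z)
    return sum(df[a] * J[a][b] * dg[b] for a in range(n) for b in range(n))


def hamiltonian(z):
    """Harmonic oscillator with z = (q, p); an assumed example, not from the text."""
    q, p = z
    return 0.5 * (p * p + q * q)


def coordinate(index):
    """The phase-space function z -> z[index]."""
    return lambda z: z[index]


J = canonical_matrix(1)
z = [0.3, -1.2]
q_dot = bracket(coordinate(0), hamiltonian, z, J)  # equals dH/dp = p
p_dot = bracket(coordinate(1), hamiltonian, z, J)  # equals -dH/dq = -q
print(q_dot, p_dot)
print(bracket(coordinate(0), coordinate(1), z, J))  # canonical relation [q, p] = 1
```

Replacing `canonical_matrix` by a $z$-dependent antisymmetric matrix gives the same formula for a noncanonical bracket.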


Noncanonical Poisson Brackets 


The canonical Poisson bracket description of [2]-[4]
suggests a generalization, with antecedents in the work
of S Lie and others, that was termed noncanonical
Hamiltonian form in the fluid mechanics context by
P Morrison and J Greene (1980):


A system has noncanonical Hamiltonian form if it
can be written as $\dot{z} = [z, H]$, where the noncanonical
Poisson bracket $[\,,]$ is a Lie product for a realization of
a Lie enveloping algebra on phase-space functions.


Recall a Lie enveloping algebra $\mathfrak{a}$ is a Lie algebra,
with the usual product $[\,,]$ that is bilinear, antisymmetric,
and satisfies the Jacobi identity, which in
addition has a product $\mathfrak{a} \times \mathfrak{a} \to \mathfrak{a}$ that satisfies
the Leibniz identity $[fg, h] = f[g, h] + [f, h]g$ for all
$f, g, h \in \mathfrak{a}$.

The geometric description of noncanonical Hamiltonian
form has evolved into a structure called the
Poisson manifold, a differential manifold $Z$
endowed with the binary bracket operation $[\,,]$
defined on smooth functions, say, $f, g : Z \to \mathbb{R}$.
Poisson manifolds differ from symplectic manifolds
because the nondegeneracy condition is removed. In
coordinates, $[\,,]$ is given by

$$[f, g] = \frac{\partial f}{\partial z^{\alpha}} J^{\alpha\beta}(z) \frac{\partial g}{\partial z^{\beta}}, \qquad \alpha, \beta = 1, 2, \dots, M \qquad [5]$$

where $M = \dim Z$. Note that $J$ need not have the
form of [4], may depend upon the coordinate
$z$, and may have vanishing determinant. Bilinearity,
antisymmetry, $[f, g] = -[g, f]$ for all $f, g$, and the Jacobi
identity, $[f, [g, h]] + [g, [h, f]] + [h, [f, g]] = 0$, for
all $f, g, h$, imply that the cosymplectic matrix
satisfies $J^{\alpha\beta} = -J^{\beta\alpha}$ and

$$J^{\alpha\delta} \frac{\partial J^{\beta\gamma}}{\partial z^{\delta}} + J^{\beta\delta} \frac{\partial J^{\gamma\alpha}}{\partial z^{\delta}} + J^{\gamma\delta} \frac{\partial J^{\alpha\beta}}{\partial z^{\delta}} = 0 \qquad [6]$$

respectively, for $\alpha, \beta, \gamma, \delta = 1, 2, \dots, M$.

The local structure of $Z$ is elucidated by the Darboux-Lie
theorem, which states that in a neighborhood of a
point $z \in Z$, for which rank $J = 2N$, there exist coordinates
in which $J$ has the following form:

$$(J) = \begin{pmatrix} 0_N & I_N & 0 \\ -I_N & 0_N & 0 \\ 0 & 0 & 0_{M-2N} \end{pmatrix} \qquad [7]$$

From [7] it is clear that in the right coordinates, the 
system looks like a canonical N-degree-of-freedom 
Hamiltonian system with some extraneous coordi- 
nates, M—2N in fact. Through any point of the 
M-dimensional phase space Z, there exists a local 
foliation by symplectic leaves of dimension 2N. 

A consequence of the degeneracy is that there
exists a special class of invariants called Casimir
invariants that is built into the phase space. Since
the rank of $J$ is $2N$, there exist possibly $M - 2N$
independent null eigenvectors. A consequence of the
Darboux-Lie theorem is that the independent null
eigenvectors exist and, moreover, the null space can
in fact be spanned by the gradients of the Casimir
invariants, which satisfy $J^{\alpha\beta} \, \partial C^{(a)} / \partial z^{\beta} = 0$, where
$a = 1, 2, 3, \dots, M - 2N$. That the Casimir invariants
are constants of motion follows from

$$\dot{C}^{(a)} = \frac{\partial C^{(a)}}{\partial z^{\alpha}} J^{\alpha\beta} \frac{\partial H}{\partial z^{\beta}} = 0 \qquad [8]$$

Thus, Casimir invariants are constants of motion for any
Hamiltonian. The symplectic leaves of dimension $2N$
are the intersections of the $M - 2N$ surfaces defined by
$C^{(a)} = $ constant. Dynamics generated by any $H$ that
begins on a particular symplectic leaf remains there. The
structure of Poisson manifolds has now been widely
studied, but we will not pursue this further here.
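
A small finite-dimensional example makes the role of the Casimir invariants concrete. The Python sketch below is an illustration that is not taken from the text: it assumes the three-dimensional bracket with $J^{\alpha\beta}(z) = -\epsilon_{\alpha\beta\gamma} z^{\gamma}$ (the free rigid body bracket), for which $M = 3$, the rank of $J$ is 2 away from the origin, and $C = |z|^2/2$ is the single Casimir invariant. All names are hypothetical.

```python
from fractions import Fraction


def eps(a, b, c):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2


def cosymplectic(z):
    """J^{ab}(z) = -eps(a, b, c) z^c, an assumed example of a degenerate J."""
    return [[-sum(eps(a, b, c) * z[c] for c in range(3)) for b in range(3)] for a in range(3)]


def apply(J, v):
    """Matrix-vector product J v."""
    return [sum(J[a][b] * v[b] for b in range(3)) for a in range(3)]


def d_cosymplectic(b, c, d):
    """Derivative of J^{bc} with respect to z^d; constant because J is linear in z."""
    return -eps(b, c, d)


def jacobi_defect(z):
    """Largest absolute value of the cyclic sum J dJ + J dJ + J dJ over all index triples."""
    J = cosymplectic(z)
    worst = 0
    for a in range(3):
        for b in range(3):
            for c in range(3):
                total = sum(
                    J[a][d] * d_cosymplectic(b, c, d)
                    + J[b][d] * d_cosymplectic(c, a, d)
                    + J[c][d] * d_cosymplectic(a, b, d)
                    for d in range(3)
                )
                worst = max(worst, abs(total))
    return worst


z = [1, 2, 3]
J = cosymplectic(z)
grad_casimir = list(z)  # gradient of C = |z|^2 / 2
print(apply(J, grad_casimir))  # null vector of J: prints [0, 0, 0]
print(jacobi_defect(z))        # the Jacobi condition holds exactly: prints 0

inertia = [1, 2, 3]  # assumed moments of inertia for H = sum of z_i^2 / (2 I_i)
grad_h = [Fraction(z[i], inertia[i]) for i in range(3)]
z_dot = apply(J, grad_h)
print(sum(grad_casimir[i] * z_dot[i] for i in range(3)))  # dC/dt = 0
```

Changing `inertia` changes the motion but not the last output, which reflects the statement that Casimir invariants are conserved for any Hamiltonian.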

Let us turn to infinite-dimensional systems, field
theories such as those that govern ideal fluids, where
the governing equations are partial differential
equations. Although the level of rigor does not
match that achieved for the finite systems described
above, formally one can parody most of the steps
and, consequently, the finite theory provides cogent
imagery and serves as a beacon for shedding light. In
infinite dimensions, an analog of [5] is given by

$$\{F, G\} = \int d\mu \, \frac{\delta F}{\delta \psi^i} \mathcal{J}^{ij} \frac{\delta G}{\delta \psi^j} = \left\langle \frac{\delta F}{\delta \psi}, \mathcal{J} \frac{\delta G}{\delta \psi} \right\rangle \qquad [9]$$

where $F$ and $G$ are functionals of the functions $\psi^i(\mu, t)$,
which are functions of $\mu = (\mu_1, \dots, \mu_n)$, independent
variables of some kind, $\delta F / \delta \psi^i$ denotes the functional
(variational) derivative, and $\langle\,,\rangle$ is a pairing between a
vector (function) space and its dual. The $\psi^i$, $i = 1, \dots, n$,
are $n$ field components, and now $\mathcal{J}$ is a cosymplectic
operator. To be noncanonically Hamiltonian requires
antisymmetry, $\{F, G\} = -\{G, F\}$, and the Jacobi identity,
$\{F, \{G, H\}\} + \{G, \{H, F\}\} + \{H, \{F, G\}\} = 0$, for all
functionals $F$, $G$, and $H$. Antisymmetry requires $\mathcal{J}$ to be
skew-symmetric, that is, $\langle f, \mathcal{J} g \rangle = \langle \mathcal{J}^{\dagger} f, g \rangle = -\langle g, \mathcal{J} f \rangle$.
The Jacobi identity for infinite-dimensional systems has
a condition analogous to [6]; it can be shown that one
need only consider variations of $\mathcal{J}$ when calculating, for
example, $\{F, \{G, H\}\}$.


Lie-Poisson Brackets 


As noted in the Introduction, the usual variables of 
fluid mechanics are not a set of canonical variables, 
and, consequently, the Hamiltonian description in 
terms of these variables is noncanonical. There is a 
special general form that the Poisson bracket takes 
for equations that describe media in terms of 
Eulerian-like variables, the so-called Lie-Poisson 
brackets, a special form of noncanonical Poisson 
bracket. Lie-Poisson brackets describe essentially 
every fundamental equation that describes classical 
media. In addition to the equations for the ideal 
fluid, they describe Liouville’s equation for the 
dynamics of the phase-space density of a collection 
of particles, the various hierarchies of kinetic theory,
the Vlasov equation of plasma physics, and various 
approximations thereof, and magnetized and other 
more complicated fluids. 

Both finite- and infinite-dimensional Lie-Poisson
brackets are intimately associated with a Lie group $\mathfrak{G}$.
We use the pairing between a vector space and its
dual, $\langle\,,\rangle$, where the second slot is reserved for
elements of the Lie algebra $\mathfrak{g}$ of $\mathfrak{G}$ and the first slot
for elements of its dual $\mathfrak{g}^*$. Thus, $\langle\,,\rangle : \mathfrak{g}^* \times \mathfrak{g} \to \mathbb{R}$. In
terms of the pairing, noncanonical Lie-Poisson
brackets have the following compact form:

$$\{F, G\} = \langle \chi, [F_{\chi}, G_{\chi}] \rangle \qquad [10]$$

where we suppose the dynamical variable $\chi \in \mathfrak{g}^*$, $[\,,]$ is
the Lie algebra product, which takes $\mathfrak{g} \times \mathfrak{g} \to \mathfrak{g}$, and we
have introduced the shorthand $F_{\chi} := \delta F / \delta \chi$. The
quantities $F_{\chi}$ and $G_{\chi}$ are, of course, in $\mathfrak{g}$. We refer to
$\{\,,\}$ as the “outer” bracket of the realization enveloping
algebra and $[\,,]$ as the “inner” bracket of the Lie algebra
$\mathfrak{g}$. The binary operator $[\,,]^{\dagger}$ is defined as follows:

$$\langle \chi, [f, g] \rangle =: \langle [\chi, g]^{\dagger}, f \rangle \qquad [11]$$

where evidently $\chi \in \mathfrak{g}^*$, $g, f \in \mathfrak{g}$, and $[\,,]^{\dagger} : \mathfrak{g}^* \times \mathfrak{g} \to \mathfrak{g}^*$.
The operator $[\,,]^{\dagger}$, which defines the coadjoint orbit,
is necessary for obtaining the equations of motion
from a Lie-Poisson bracket.

For finite-dimensional systems, the group $\mathfrak{G}$ must be
a finite-parameter Lie group, the variable $\chi$ corresponds
to $w$, and the cosymplectic form in coordinates
is given by $J_{ab} = c_{ab}^{\,c} w_c$, where the $c_{ab}^{\,c}$ are the structure
constants for the Lie algebra $\mathfrak{g}$, which satisfy

$$c_{ab}^{\,c} = -c_{ba}^{\,c}, \qquad c_{ab}^{\,e} c_{ec}^{\,d} + c_{bc}^{\,e} c_{ea}^{\,d} + c_{ca}^{\,e} c_{eb}^{\,d} = 0 \qquad [12]$$

relations that imply [10] satisfies the antisymmetry
condition and the Jacobi identity.
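
As a concrete finite-dimensional illustration, the sketch below takes the Lie algebra so(3), whose structure constants are the Levi-Civita symbol, and checks the two relations of [12] numerically before evaluating a Lie-Poisson bracket. The choice of so(3), the function names, and the storage convention `c[a, b, k]` for $c_{ab}^{\,k}$ are assumptions made for this example only; they do not come from the article.

```python
import numpy as np

def levi_civita():
    """Structure constants of so(3), stored as c[a, b, k] = c_ab^k."""
    eps = np.zeros((3, 3, 3))
    eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1.0
    eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1.0
    return eps

def cosymplectic(c, w):
    """Cosymplectic matrix J_ab = c_ab^k w_k."""
    return np.einsum("abk,k->ab", c, w)

def jacobi_residual(c):
    """c_ab^e c_ec^d + c_bc^e c_ea^d + c_ca^e c_eb^d, zero for a Lie algebra."""
    return (np.einsum("abe,ecd->abcd", c, c)
            + np.einsum("bce,ead->abcd", c, c)
            + np.einsum("cae,ebd->abcd", c, c))

def lie_poisson_bracket(grad_f, grad_g, c, w):
    """{f, g} = (df/dw_a) J_ab (dg/dw_b) evaluated at the point w."""
    return grad_f @ cosymplectic(c, w) @ grad_g

c = levi_civita()
w = np.array([1.0, 2.0, 3.0])
J = cosymplectic(c, w)

# Bracket of the coordinate functions w_1 and w_2 (gradients are unit vectors).
bracket_w1_w2 = lie_poisson_bracket(np.array([1.0, 0.0, 0.0]),
                                    np.array([0.0, 1.0, 0.0]), c, w)
print(bracket_w1_w2)  # equals w_3 at this point
```

With these structure constants the bracket of two coordinate functions returns the third coordinate, which is the familiar angular-momentum (free rigid body) bracket.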

For infinite-dimensional systems, the group $\mathfrak{G}$
must be an infinite-parameter Lie group and the
cosymplectic operator has the form $\mathcal{J}_{ij} = \mathcal{C}_{ij}^{\,k} \chi_k$,
where $\mathcal{C}_{ij}^{\,k}$ are structure operators. The meaning of
these structure operators will be clarified when we
consider brackets for fluid mechanics.


The Fluid State 


Fluid mechanics has a long history, and thus it 
comes as no surprise that the fluid state has been 
described in many ways. Because the Hamiltonian 
structure depends on the state variables, some of 
these ways are described below, beginning with 
the Lagrangian variable description.


Lagrangian Variables 


The description of a fluid that is most like that of 
particle mechanics occurs in terms of variables usually 
referred to as Lagrangian variables. This description 
dates to the eighteenth century. The idea behind the use 
of these variables is a simple one: if a fluid is described 
as a continuum collection of fluid particles, also called 
fluid parcels or elements, then its motion is governed by 
an equation that is an infinite-dimensional version of 
Newton's second law and, consequently, as we will see, 
both the Hamiltonian and the Lagrangian descriptions 
are infinite degree-of-freedom generalizations of those 
of ordinary particle mechanics. 


The position of a fluid element, referred to a fixed
rectangular coordinate system, is given by $q = q(a, t)$,
where $q = (q^1, q^2, q^3)$ and $a = (a^1, a^2, a^3)$ is a continuum
label that replaces the index $i$ of [1]. In
practice, the label can be any quantity that identifies
a fluid particle, but it is often taken to be the position
of the fluid particle at time $t = 0$ in rectangular
coordinates. The quantities $q^i(a, t)$ are coordinates
for the configuration space $Q$, which is in fact a
function space because in addition to the three indices
there is the continuum label $a$. We assume that $a$
varies over a fixed domain, $D \subset \mathbb{R}^3$, which is
completely filled with fluid, and that the function
$q : D \to D$ is one-to-one and onto. We will assume that
as many derivatives of $q$ with respect to $a$ as needed
exist, but we will not say more about $Q$; in fact, not
much is known about the solution function space for
the 3D fluid equations in Lagrangian variables. Often
in the Hamiltonian context the functions $q = q(a, t)$ are
assumed to be diffeomorphisms and their collection is
referred to as the diffeomorphism group.

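In the sequel several manipulations are needed and
so we record here some identities for later use. Viewing
the map $a \mapsto q$ at fixed $t$ as a coordinate change, the
Jacobian matrix $\partial q^i/\partial a^j =: q^i_{,j}$ has an inverse given by


$$\frac{\partial a^j}{\partial q^i} = \frac{A_i^j}{\mathcal{J}} \qquad [13]$$

where $A_i^j$ is the cofactor of $q^i_{,j}$ and $\mathcal{J}$ is its
determinant. A convenient expression for $A_i^j$ is
given by

$$A_i^j = \frac{1}{2}\,\epsilon_{ikl}\,\epsilon^{jmn}\,\frac{\partial q^k}{\partial a^m}\frac{\partial q^l}{\partial a^n} \qquad [14]$$

where $\epsilon_{ijk}$ ($= \epsilon^{ijk}$) is the skew-symmetric tensor
(density). Evidently, $\partial \mathcal{J}/\partial q^i_{,j} = A_i^j$ follows from [13].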
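
The cofactor expression [14] and the inverse relation [13] can be checked numerically for any invertible 3 x 3 Jacobian matrix. The following is a minimal sketch under assumed conventions: `dq[k, m]` stands for $\partial q^k/\partial a^m$, `cofactor` returns `A[i, j]` for $A_i^j$, and the sample matrix is arbitrary test data, not taken from the article.

```python
import numpy as np

def levi_civita():
    """Skew-symmetric symbol eps[i, j, k] in three dimensions."""
    eps = np.zeros((3, 3, 3))
    eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1.0
    eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1.0
    return eps

def cofactor(dq):
    """A_i^j = (1/2) eps_ikl eps^jmn (dq^k/da^m)(dq^l/da^n), as in [14]."""
    eps = levi_civita()
    return 0.5 * np.einsum("ikl,jmn,km,ln->ij", eps, eps, dq, dq)

# Arbitrary invertible Jacobian matrix dq[k, m] = dq^k/da^m (illustrative data).
dq = np.array([[1.2, 0.1, 0.0],
               [0.3, 0.9, 0.2],
               [0.0, 0.4, 1.1]])

A = cofactor(dq)
jac = np.linalg.det(dq)

# [13]: da^j/dq^i = A_i^j / J, i.e. the inverse matrix is A transposed over J.
inverse_from_cofactor = A.T / jac
```

Comparing `inverse_from_cofactor` with `np.linalg.inv(dq)` confirms that the index placement in [13] and [14] is mutually consistent.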


Eulerian Variables 


In the Lagrangian variable description, one picks out
a particular particle, labeled by $a$, and keeps track in
time $t$ of where it goes. However, in the Eulerian
variable description, one stays at a spatial observation
point $r = (x^1, x^2, x^3) \in D$ and monitors the
nature of the fluid at $r$ at time $t$.

The most important Eulerian variable is the Eulerian
velocity field $v(r, t)$. This quantity is the velocity of the
particular fluid element that is located at the spatial
point $r$ at time $t$. The label of that particular fluid
element is given by $a = q^{-1}(r, t)$, and so

$$v(r, t) = \dot{q}(a, t)\big|_{a = q^{-1}(r, t)} = \dot{q} \circ q^{-1}(r, t) \qquad [15]$$

where the overdot denotes differentiation with respect to time
at fixed label $a$. Attached to a fluid element is a
certain amount of mass described by a density
function $\rho_0(a)$. As the fluid moves so that $a \mapsto q$, the
volume of an infinitesimal region will change, but its
mass must remain fixed. The statement of local mass
conservation is $\rho\, d^3q = \rho_0\, d^3a$, where $d^3a$ is an initial
infinitesimal volume element that maps to $d^3q$ at
time $t$, and $d^3q = \mathcal{J}\, d^3a$. (When integrating over $D$ we
will replace $d^3q$ by $d^3r$.) Thus, we obtain

$$\rho(r, t) = \frac{\rho_0(a)}{\mathcal{J}(a, t)}\bigg|_{a = q^{-1}(r, t)} = \frac{\rho_0}{\mathcal{J}} \circ q^{-1}(r, t) \qquad [16]$$

where recall the Jacobian $\mathcal{J} = \det(q^i_{,j})$. Besides the
density, for the ideal fluid, one attaches an entropy
per unit mass, $s = s_0(a)$, to a fluid element, and this
quantity remains fixed in time. In the Eulerian
description this gives rise to the entropy field

$$s(r, t) = s_0(a)\big|_{a = q^{-1}(r, t)} = s_0 \circ q^{-1}(r, t) \qquad [17]$$


One could attach other scalar, vector, etc., quan- 
tities to the fluid element, but we will not pursue 
this. In the usual ideal fluid closure only the above 
variables are considered. 

Equations [15]-[17] express the Euler-Lagrange
map. There is a natural representation of this map in
terms of the Eulerian density variables, $M := \rho v$, $\rho$,
and $\sigma := \rho s$, the momentum, mass, and entropy
densities, respectively, which, as will be seen, are
variables in which the noncanonical Poisson bracket
has Lie-Poisson form.


Other Variables 


Fluid mechanics is rife with variables that have been 
used for its description. For example, Euler, Monge, 
Clebsch, and others introduced potential representa- 
tions, of varying generality, for the Eulerian velocity 
field, an example being 


$$v(r, t) = \alpha \nabla\beta + \nabla\varphi \qquad [18]$$


where the three components of $v$ are replaced by the
functions $\alpha$, $\beta$, and $\varphi$, all of which depend on $(r, t)$.

Often reduced variables that are tailored to
specific ideal flows with less generality than those
described by $\rho$, $s$, and $v$ are considered. Examples
include incompressible flow with $\nabla \cdot v = 0$, vortex
dynamics, including contour dynamics and point
vortex dynamics, flow governed by the
shallow-water equations, quasigeostrophy, etc. The
Hamiltonian structure in terms of these reduced 
variables derives from that of the parent model in 
terms of Lagrangian variables. Specific variables 
may embody constraints, and understanding these 
constraints, although tractable, can be a cause of 
confusion. Pursuing this further is beyond the scope 
here. 


Hamilton's Principle for Fluid 


Lagrange, in his famous work of 1788, Mécanique 
Analytique, produced in essence a variational 
principle for incompressible fluid flow in terms of 
Lagrangian variables. The generalization to com- 
pressible flow awaited the discovery of thermody- 
namics, and that is what we describe here. In 
traditional mechanics nomenclature, this variational 
principle is an infinite-dimensional generalization of 
what is known variously as the action principle, the 
principle of least action, or Hamilton's principle, 
whereby one constructs, on physical grounds, a 
Lagrangian function on TQ used in the action 
principle, where Q is the function space of the 
q(a, t). 

Construction of the Lagrangian requires identifi- 
cation of the potential energy, and this requires 
thermodynamics, because potential energy is stored 
in terms of pressure and temperature. A basic 
assumption of the fluid approximation is that of 
local thermodynamic equilibrium. In the energy 
representation of thermodynamics, the extensive 
energy is treated as a function of the entropy and 
the volume. For a fluid, it is convenient to consider
the energy per unit mass, denoted by $U$, to be a
function of the entropy per unit mass, $s$, and the
mass density, $\rho$, a measure of the volume. The
intensive quantities, pressure and temperature, are
given by $T = \partial U/\partial s$ and $p = \rho^2\, \partial U/\partial\rho$. Choices for
$U$ produce equations of state. For barotropic or
isentropic flow, $U$ depends only on $\rho$. For an ideal
monatomic gas $U(\rho, s) = c\,\rho^{\gamma - 1} \exp(\lambda s)$, where $c$, $\gamma$,
and $\lambda$ are constants. The function $U$ could also
depend on additional scalar quantities, such as a 
quantity known as spice that has been considered in 
oceanography. 
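
To make the relations $T = \partial U/\partial s$ and $p = \rho^2\, \partial U/\partial\rho$ concrete, the sketch below evaluates them by central differences for an ideal-gas energy of the form $U = c\,\rho^{\gamma-1}\exp(\lambda s)$. The constant names `c`, `gamma`, `lam`, their default values, the step size, and the function names are all assumptions chosen for illustration.

```python
import math

def internal_energy(rho, s, c=1.0, gamma=5.0 / 3.0, lam=1.0):
    """Energy per unit mass, U = c * rho**(gamma - 1) * exp(lam * s) (assumed form)."""
    return c * rho ** (gamma - 1.0) * math.exp(lam * s)

def pressure(U, rho, s, h=1e-6):
    """p = rho^2 dU/drho, with the derivative taken by central difference."""
    dU_drho = (U(rho + h, s) - U(rho - h, s)) / (2.0 * h)
    return rho ** 2 * dU_drho

def temperature(U, rho, s, h=1e-6):
    """T = dU/ds, with the derivative taken by central difference."""
    return (U(rho, s + h) - U(rho, s - h)) / (2.0 * h)

rho, s = 1.3, 0.2
p_numeric = pressure(internal_energy, rho, s)
T_numeric = temperature(internal_energy, rho, s)

# Closed forms for this particular U, used only as a cross-check.
p_exact = (5.0 / 3.0 - 1.0) * rho ** (5.0 / 3.0) * math.exp(s)
T_exact = internal_energy(rho, s)
```

For this choice of $U$ the pressure is proportional to $\rho^{\gamma}$ at fixed entropy, the usual adiabatic law.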

Conventional thermodynamic variables can be 
viewed as Eulerian variables with a static velocity 
field. Thus, we write $U(\rho, s)$, where $\rho$ and $s$ are
spatially independent or, if the system has only locally
relaxed, these variables can be functions of $r$. For the
ideal fluid, each fluid element can be viewed as a
self-contained isentropic thermodynamic system that
moves with the fluid. Thus, the total fluid potential
energy functional is given by $V[q] = \int_D d^3a\, \rho_0\, U(\rho_0/\mathcal{J}, s_0)$,
which is a functional of $q$ that depends
only upon $\mathcal{J}$ and hence only upon $\partial q/\partial a$.

The next item required for constructing Hamilton's
principle is the kinetic energy functional, which is
given by $T[q, \dot{q}] = \int_D d^3a\, \rho_0\, \dot{q}^2/2$, where $\dot{q}^2 := \eta_{ij}\, \dot{q}^i \dot{q}^j$,
with the Cartesian metric $\eta_{ij} := \delta_{ij}$. This metric and its
inverse can be used to raise and lower indices.

The Lagrangian functional is $L[q, \dot{q}] := T - V$,
where $L[q, \dot{q}] = \int_D d^3a\, \mathcal{L}(q, \dot{q}, \partial q/\partial a)$ and $\mathcal{L}$ is the
Lagrangian density, in terms of which the action
functional of Hamilton’s principle is given by

$$S[q] = \int_{t_0}^{t_1} dt\, L[q, \dot{q}] = \int_{t_0}^{t_1} dt \int_D d^3a \left[ \frac{1}{2}\rho_0\, \dot{q}^2 - \rho_0\, U(\rho_0/\mathcal{J}, s_0) \right] \qquad [19]$$


The end conditions for Hamilton’s principle for the
fluid are the same as those of mechanics, that is,
$\delta q(a, t_0) = \delta q(a, t_1) = 0$. The nonpenetration condition,
$\delta q \cdot \hat{n} = 0$ on $\partial D$, where $\hat{n}$ is a unit normal vector,
is also assumed. Other boundary conditions, such as
periodic and free boundary conditions, are also
possibilities. Hamilton's principle amounts to
$\delta S/\delta q(a, t) = 0$, which, with the end and boundary
conditions, implies the following equations of motion:

$$\rho_0\, \ddot{q}_i + A_i^j \frac{\partial}{\partial a^j}\left( \frac{\rho_0^2}{\mathcal{J}^2} \frac{\partial U}{\partial \rho} \right) = 0 \qquad [20]$$


Here we have used $\partial A_i^j/\partial a^j = 0$, which can be seen
using [14]. Equation [20] amounts to Newton's
second law for the ideal fluid, which is made clearer
by using the following useful identity:

$$\frac{\partial}{\partial q^k} = \frac{1}{\mathcal{J}} A_k^i \frac{\partial}{\partial a^i} \qquad [21]$$
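Alternatively, upon using [13], [20] is sometimes
written in the form

$$\rho_0\, \ddot{q}_j \frac{\partial q^j}{\partial a^i} + \mathcal{J} \frac{\partial}{\partial a^i}\left( \frac{\rho_0^2}{\mathcal{J}^2} \frac{\partial U}{\partial \rho} \right) = 0 \qquad [22]$$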


The Eulerian variable force law follows from [20]
upon using [21]:

$$\rho \left( \frac{\partial v}{\partial t} + v \cdot \nabla v \right) = -\nabla p \qquad [23]$$


where $v = v(r, t)$. The remaining Eulerian equations
of mass conservation and entropy advection follow
from the constraints that $s_0$ and $\rho_0$ are constant on
fluid elements. Time differentiation and the
transformations of [16] and [17] yield

$$\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho v) = 0 \qquad [24]$$

$$\frac{\partial s}{\partial t} + v \cdot \nabla s = 0 \qquad [25]$$


Equations [23]-[25] together with a given function
$U(\rho, s)$ and the relation $p = \rho^2\, \partial U/\partial\rho$ constitute the
Eulerian description.

Variational principles similar to that described 
above exist for essentially all ideal fluid models, 
including incompressible flow, magnetohydrody- 
namics, the two-fluid equations of plasma physics, etc. 
Eulerian Action Principles


Some early researchers sought variational principles 
that directly produce the ideal fluid equations in 
Eulerian form. Because the Eulerian form of the 
equations does not treat the fluid as a collection of 
particles, the resulting action principles possess a 
certain awkwardness. Below, we describe three 
approaches to such action principles. 


Clebsch action The action principle for electromag- 
netism proceeds by introducing the 4-vector potential. 
In a similar way, the Clebsch action principle 
anticipates this idea by using a potential representation 
of the velocity field, an example being that of [18]. 

Although compressible flow with an arbitrary
equation of state can be treated in full generality, for
simplicity and variety we will restrict to incompressible
flow and set $\nabla \cdot v = 0$. This constraint is
enforced by requiring $\varphi$ to be dependent on $\alpha$ and
$\beta$ according to $\varphi[\alpha, \beta] := -\Delta^{-1}\, \nabla \cdot (\alpha \nabla\beta)$, where $\Delta^{-1}$ is
the inverse Laplacian. The Clebsch action is then
written as follows:


$$S_C[\alpha, \beta] = \int_{t_0}^{t_1} dt \int_D d^3r \left[ \beta\, \alpha_t - \frac{1}{2}|v|^2 \right] \qquad [26]$$


where the subscript $t$ denotes differentiation at fixed $r$,
we have set $\rho = 1$, and $v$ is a shorthand for the
expression of [18] with $\varphi = \varphi[\alpha, \beta]$. The form of $S_C$ is
that of the phase-space action that produces Hamilton's
equations upon independent variation of the configuration
space coordinate and its conjugate momentum,
which are here $\alpha$ and $\beta$, respectively. Thus, we require
$\delta\alpha(r, t_0) = \delta\alpha(r, t_1) = 0$, but no condition is needed for
$\delta\beta$ at $t_0, t_1$. We also require $\hat{n} \cdot v = 0$ on $\partial D$. The
variations $\delta S_C/\delta\beta = 0$ and $\delta S_C/\delta\alpha = 0$ imply


$$\alpha_t + v \cdot \nabla\alpha = 0, \qquad \beta_t + v \cdot \nabla\beta = 0 \qquad [27]$$


an infinite-dimensional version of [1] with
$H = \int_D d^3r\, |v|^2/2$. Evidently, both $\alpha$ and $\beta$ are
advected by the flow.

Because the vorticity is $\zeta := \nabla \times v = \nabla\alpha \times \nabla\beta$, knowledge
of $\alpha$ and $\beta$ determines $\zeta$ and one can invert the curl
operator to obtain $v$ in the usual way. The intersections of
level sets of $\alpha$ and $\beta$ define vortex lines, and, evidently,
these quantities, like the entropy for compressible
dynamics, are constant on fluid elements. It is not
difficult to show that the advection of $\alpha$ and $\beta$ implies
the correct dynamical equation for incompressible $v$.


Herivel-Lin action The Herivel-Lin action incorporates
[24] and [25] as constraints with Lagrange
multipliers, $\varphi$ and $\rho\beta$. (Here $\beta$ is not the Clebsch $\beta$
and the factor of $\rho$ is included for convenience.) It
was discovered early on that these constraints were
not enough to achieve complete generality and so a
new one, known as the Lin constraint, was added.
The Lin constraint corresponds to constancy of the
fluid particle label. One defines an Eulerian label
field by setting $q(a, t) = r$ and solving for the label
$a = q^{-1}(r, t) =: a(r, t)$. Conservation of particle identity
is thus given by $a_t + v \cdot \nabla a = 0$, and this constraint is
associated with a Lagrange multiplier $\gamma = (\gamma_1, \gamma_2, \gamma_3)$.
The Herivel-Lin action is thus given by


$$S_{HL}[v, \rho, s, a, \varphi, \beta, \gamma] = \int_{t_0}^{t_1} dt \int_D d^3r \left( \frac{1}{2}\rho |v|^2 - \rho U(\rho, s) + \varphi\,[\rho_t + \nabla \cdot (\rho v)] - \rho\beta\,[s_t + v \cdot \nabla s] - \rho\gamma \cdot [a_t + v \cdot \nabla a] \right) \qquad [28]$$


Variation of [28] with respect to the Lagrange
multipliers just reproduces the constraints; however,
variation with respect to $v$, $\rho$, $s$, and $a$ produces
equations that imply [23]. Moreover, every flow can
be shown to be an extremal of $S_{HL}$.


Euler-Poincaré-Hamel action Another approach is
to use directly constrained variations. The essential
idea is to only consider Eulerian variable variations
that are induced by underlying Lagrangian variable
variations $\delta q$, the so-called dynamically accessible
variations. Explicitly, a basic Eulerian variation
$\eta = (\eta_1, \eta_2, \eta_3)$ is given by $\eta(r, t) = \delta q(a, t)\big|_{a = q^{-1}(r, t)}$.
In terms of this quantity, the dynamically accessible
variations of the Eulerian velocity field, density,
and entropy are given, respectively, by $\delta v = \eta_t + v \cdot \nabla\eta - \eta \cdot \nabla v$,
$\delta\rho = -\nabla \cdot (\rho\eta)$, and $\delta s = -\eta \cdot \nabla s$. Inserting
them into the variation of


$$S_{EPH}[v, \rho, s] = \int_{t_0}^{t_1} dt \int_D d^3r \left[ \frac{1}{2}\rho |v|^2 - \rho U(\rho, s) \right] \qquad [29]$$


and integrating by parts gives


$$\delta S_{EPH} = \int_{t_0}^{t_1} dt \int_D d^3r\, [\,\ldots\,] \cdot \eta = 0$$


where $[\,\ldots\,]$ is equivalent to [23]. Thus, assuming $\eta$
is arbitrary, we obtain directly the equation of
motion.

There is a version of this kind of constrained 
variational principle for all ideal fluid and plasma 
equations. Also, it possesses a geometric interpreta- 
tion. In a more practical vein, constrained variations 
can be used to derive reduced models, and dynami- 
cally accessible variations can also be used for 
stability calculations. Exploring these ideas is out- 
side the present scope. 


Fluid Hamiltonian Description 


Having described variational principles, we turn 
to the associated canonical and noncanonical 
Hamiltonian descriptions. 


Canonical Description 


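Because the action of [19] is of standard form, it is
convex in $\dot{q}$ and the Legendre transform follows
easily: the canonical momentum density is
$\pi_i(a, t) := \delta L/\delta \dot{q}^i(a) = \rho_0\, \dot{q}_i$ and
$H[q, \pi] = \int_D d^3a\, [\pi \cdot \dot{q} - \mathcal{L}] = \int_D d^3a\, [|\pi|^2/(2\rho_0) + \rho_0 U]$.
Hamilton's equations are then


$$\dot{q}^i = \frac{\delta H}{\delta \pi_i} = \{q^i, H\}, \qquad \dot{\pi}_i = -\frac{\delta H}{\delta q^i} = \{\pi_i, H\} \qquad [30]$$


an infinite-dimensional version of [1], with the
canonical Poisson bracket


$$\{F, G\} = \int_D d^3a \left[ \frac{\delta F}{\delta q^i}\frac{\delta G}{\delta \pi_i} - \frac{\delta G}{\delta q^i}\frac{\delta F}{\delta \pi_i} \right] \qquad [31]$$


(Note, $\delta q^i(a)/\delta q^j(a') = \delta^i_j\, \delta(a - a')$, a relation analogous
to $\partial q^i/\partial q^j = \delta^i_j$ for finite systems.)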


Reduction to Noncanonical Poisson Brackets 


Reduction is a procedure for reducing the size of a 
Hamiltonian system. Given constants of motion in 
involution, that is, with pairwise vanishing Poisson 
brackets, the dimension of a Hamiltonian system 
can be reduced by 2 for each such constant of 
motion. However, when constants do not commute, 
the situation is more complicated and one must 
invoke a theory due to Lie, Poincaré, Cartan, and 
others. Associated with invariants are symmetries, 
and so a complete discussion of this theory requires 
examination of symmetry groups and associated 
geometry. For the ideal fluid, the map from the 
Lagrangian to the Eulerian descriptions is an 
example of reduction, whereby the Poisson bracket 
of [31] is mapped into a noncanonical Poisson 
bracket. En route to describing this example, a brief 
discussion of reduction of finite systems is consid- 
ered first. 


Reduction of Finite-Dimensional Systems 


Consider a canonical system with the phase space
$\mathcal{M}$, a $2N$-dimensional symplectic manifold. In a
coordinate patch with coordinates $z = (q, p)$ the
system has the canonical description of [2]-[4].
Suppose we have a map $P : \mathcal{M} \to \mathfrak{m}^*$, where $\mathfrak{m}^*$ is
some $M < 2N$-dimensional space described by coordinates
$w = (w_1, w_2, \dots, w_M)$. In coordinates, this
map is represented in terms of functions $w_a = w_a(z)$,
with $a = 1, 2, \dots, M$, which, because $M < 2N$, is
always noninvertible. Suppose $f, g : \mathcal{M} \to \mathbb{R}$ obtain
their $z$-dependence through the functions $w$, that is,
$f(z) = \bar{f}(w(z)) = \bar{f} \circ w$. Making use of the chain rule
yields


$$\{f, g\} = \frac{\partial \bar{f}}{\partial w_a}\, J_{ab}\, \frac{\partial \bar{g}}{\partial w_b} \qquad [32]$$

where the quantity

$$J_{ab} := \frac{\partial w_a}{\partial z^\alpha}\, J_c^{\alpha\beta}\, \frac{\partial w_b}{\partial z^\beta} \qquad [33]$$


is in general a function of $z$. However, it is possible
that $J_{ab}$ may only depend on $w$. When this happens,
we have a reduction of the phase space $\mathcal{M}$.
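
A standard finite-dimensional instance of this closure condition is the map from a single particle's canonical variables $z = (q, p) \in \mathbb{R}^6$ to its angular momentum $w = q \times p$. The sketch below evaluates [33] numerically and compares it with $\epsilon_{abc} w_c$. The choice of map, the ordering $z = (q, p)$, the canonical matrix with blocks $[[0, I], [-I, 0]]$, and the sample point are assumptions made for this example.

```python
import numpy as np

def levi_civita():
    """Skew-symmetric symbol eps[a, b, c] in three dimensions."""
    eps = np.zeros((3, 3, 3))
    eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1.0
    eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1.0
    return eps

def canonical_cosymplectic(n):
    """Canonical J_c for z = (q, p), giving {f, g} = f_q . g_p - f_p . g_q."""
    identity, zero = np.eye(n), np.zeros((n, n))
    return np.block([[zero, identity], [-identity, zero]])

def angular_momentum(z):
    """Reduced variables w = q x p."""
    return np.cross(z[:3], z[3:])

def jacobian_of_w(z):
    """Matrix dw_a/dz^alpha (3 x 6) for w_a = eps_abc q_b p_c."""
    q, p = z[:3], z[3:]
    eps = levi_civita()
    dw_dq = np.einsum("abc,c->ab", eps, p)   # dw_a/dq_b
    dw_dp = np.einsum("abc,b->ac", eps, q)   # dw_a/dp_c
    return np.hstack([dw_dq, dw_dp])

z = np.array([1.0, 2.0, 3.0, 0.5, -1.0, 2.0])   # arbitrary phase-space point
Dw = jacobian_of_w(z)

# Equation [33]: J_ab = (dw_a/dz^alpha) J_c^{alpha beta} (dw_b/dz^beta)
J_reduced = Dw @ canonical_cosymplectic(3) @ Dw.T

# Closure: the result depends on z only through w, as eps_abc w_c.
J_expected = np.einsum("abc,c->ab", levi_civita(), angular_momentum(z))
```

Because `J_reduced` agrees with `J_expected`, the transformed bracket closes on the $w$'s and has Lie-Poisson form with the structure constants of so(3).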

If the original dynamics of interest has the
Hamiltonian vector field generated by $H(z)$, and if
it is possible that $H(z)$ can be expressed solely in
terms of the $w$’s, that is, $H(z) = \bar{H}(w)$, then the
system has been reduced. Clearly, this is a statement
of symmetry, since the function $H(z)$ in reality
depends on fewer variables, the $w$’s.

A beautiful form of reduction occurs when the
map $P$ has a special form $w_a = L_a^{\,i}(q)\, p_i$, where the
quantity $L$ is associated with a symmetry group. An
identity for what is required of $L_a^{\,i}$ in order for the
transformed bracket to be expressible in terms of the
$w$'s can be worked out, but this is explained in terms
of Lie groups. If the space $\mathfrak{m}$ is a Lie algebra $\mathfrak{g}$, then
the functions $\bar{f}, \bar{g}$ are real-valued functions on $\mathfrak{g}^*$ that
can be extended by left or right translation to
functions $f, g$ on $T^*\mathfrak{G}$. Thus, $f$ restricted to $T^*\mathfrak{G}$ at
the identity, $T_e^*\mathfrak{G} = \mathfrak{g}^*$, is $\bar{f}$. Because $T^*\mathfrak{G}$ is a
cotangent bundle, it carries the canonical Poisson
bracket and we get a natural map $P$, called a
momentum map, into the dual of a Lie algebra. This
geometrical description of obtaining brackets on $\mathfrak{g}^*$
from brackets on $T^*\mathfrak{G}$ is a case of Marsden-Weinstein
reduction. In the early 1980s, these
authors and others developed the geometrical
interpretation of the noncanonical Poisson brackets for
the ideal fluid.


Ideal Fluid Noncanonical Poisson Brackets 


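The Euler-Lagrange map of the fluid is of the form
of the map $P$ above. It maps the canonical bracket of
[31] into a noncanonical Poisson bracket. If we use
the Eulerian variables $M := \rho v$, $\rho$, and $\sigma := \rho s$, then
the resulting noncanonical bracket is of Lie-Poisson
form. To effect this map, one must vary [15]-[17] to
relate functional derivatives with respect to $q$ and $\pi$
to those with respect to $M$, $\rho$, and $\sigma$. This amounts
to working out the chain rule for functionals. Upon
doing this, one obtains the following noncanonical
bracket:


$$\{F, G\} = -\int_D d^3r \left[ M_i \left( \frac{\delta F}{\delta M_j} \frac{\partial}{\partial x^j} \frac{\delta G}{\delta M_i} - \frac{\delta G}{\delta M_j} \frac{\partial}{\partial x^j} \frac{\delta F}{\delta M_i} \right) + \rho \left( \frac{\delta F}{\delta M} \cdot \nabla \frac{\delta G}{\delta \rho} - \frac{\delta G}{\delta M} \cdot \nabla \frac{\delta F}{\delta \rho} \right) + \sigma \left( \frac{\delta F}{\delta M} \cdot \nabla \frac{\delta G}{\delta \sigma} - \frac{\delta G}{\delta M} \cdot \nabla \frac{\delta F}{\delta \sigma} \right) \right] \qquad [34]$$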

This bracket, together with the Hamiltonian
$H[M, \rho, \sigma] = \int_D d^3r\, [|M|^2/(2\rho) + \rho U(\rho, \sigma/\rho)]$ generates
the ideal fluid equations. This Hamiltonian follows
from $H[M, \rho, \sigma] := H[q, \pi]$ with $H[q, \pi] = \int_D d^3a\,
[|\pi|^2/(2\rho_0) + \rho_0 U]$. The bracket of [34] is clearly seen
to be linear in the variables $M$, $\rho$, and $\sigma$, and the form
of the cosymplectic operator and structure operators
$\mathcal{C}$ can be obtained by integration by parts. The Lie
group in this case can be seen to be an extension by
semidirect product of the diffeomorphism group.

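An alternative form of the noncanonical Poisson
bracket is given in terms of the variables $v$, $\rho$, and $s$.
Upon changing to these coordinates, the noncanonical
Poisson bracket transforms into


$$\{F, G\} = -\int_D d^3r \left[ \left( \frac{\delta F}{\delta \rho} \nabla \cdot \frac{\delta G}{\delta v} - \frac{\delta G}{\delta \rho} \nabla \cdot \frac{\delta F}{\delta v} \right) + \frac{\nabla \times v}{\rho} \cdot \left( \frac{\delta G}{\delta v} \times \frac{\delta F}{\delta v} \right) + \frac{\nabla s}{\rho} \cdot \left( \frac{\delta F}{\delta s} \frac{\delta G}{\delta v} - \frac{\delta G}{\delta s} \frac{\delta F}{\delta v} \right) \right] \qquad [35]$$

which, with the Hamiltonian $H[v, \rho, s] = \int_D d^3r\,
[\rho |v|^2/2 + \rho U(\rho, s)]$, produces the Eulerian fluid equations
of [23]-[25] directly as $v_t = \{v, H\}$, $\rho_t = \{\rho, H\}$,
and $s_t = \{s, H\}$, respectively. Observe that in these
variables, the bracket is no longer of Lie-Poisson form.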


Conclusion 


In a general sense, Hamiltonian dynamics is about 
coordinate changes, and it is clear from the above 
that there is no shortage of coordinates for describ- 
ing the ideal fluid. The most intuitive form of fluid 
equations (at present) is the Eulerian form, and this 
possesses a noncanonical Hamiltonian description. 
Other noncanonical variables are also used for both 
less and more general fluid systems than those 
described above. Vortex dynamics, shallow-water 
theory, and other equations of geophysical fluid 
dynamics are possibilities, as well as equations from 
plasma physics and other disciplines. The general 
story for these systems is much the same as above, 
although in some descriptions constraints are 
involved and they can complicate matters. 
There are various motivations for pursuing an
understanding of the Hamiltonian structure of 
fluids, but ultimately these motivations are the 
same as those for investigating the Hamiltonian 
dynamics of particle and other finite degree-of- 
freedom systems. Hamiltonian theory serves as an 
organizing framework, one that can be used for the 
derivation and approximation of systems. If one 
understands something about a particular Hamilto- 
nian system, then often it can be said to be true of a 
general class of Hamiltonian systems. By now, many 
applications have been worked out, some of which 
can be accessed from the literature cited below. 
Introduction


The idea of a Hamiltonian flow on a symplectic 
manifold has its roots in Hamilton's equations, 
which govern the trajectory of a particle in phase 
space (the space parametrizing coordinates and 
momenta of a classical particle). A fundamental 
idea in theoretical physics (Noether's theorem) is
that to every symmetry in a physical system (such as
a group action), there is an associated conserved
quantity: invariance under translation corresponds
to conservation of linear momentum, invariance
under rotation corresponds to conservation of
angular momentum and so on, and these momenta
are functions on the phase space. The mathematical 
formulation of this idea is the idea of the moment 
map associated to a group action on a symplectic 
manifold; the group action is obtained from the 
Hamiltonian flow of the moment map. 

This article will describe some basic features of 
moment maps associated to Hamiltonian group 
actions, and some recent results about the geometry
and topology of symplectic manifolds which have such 
group actions. We first define Hamiltonian group 
actions and list some of their properties. Next we give 
the definition of the symplectic quotient, which is a 
means of dividing out the symmetry to form a new 
symplectic manifold. We also explain some properties 
of the quotient construction. The convexity theorem 
and the moment polytope are outlined and toric 
manifolds (a particular type of symplectic manifold 
with a Hamiltonian torus action of maximal dimen- 
sion) are defined. Finally, we list some properties of 
cohomology rings of symplectic quotients. 

Two standard references on this material are the
books of Cannas da Silva (2001) and McDuff and
Salamon (1995). An authoritative and comprehensive
reference is the monograph by Guillemin,
Ginzburg and Karshon (2002).


Hamiltonian Group Actions 


Let $(M, \omega)$ be a symplectic manifold. The Hamiltonian
vector field $\xi_H$ generated by a function $H$ is defined by


$$\omega_m\big((\xi_H)_m, Y\big) = (dH)_m(Y)$$


for any $Y \in T_m M$. If, for $X \in \mathfrak{g}$, $X^\#$ are the vector
fields on $M$ generated by the symplectic action of a
compact Lie group $G$ with Lie algebra $\mathfrak{g}$, then the
moment map $\mu : M \to \mathfrak{g}^*$ is defined by two
properties:


1. $(d\mu)_m(Y)(X) = \omega_m(X^\#_m, Y)$ for any $Y \in T_m M$: in
other words the function $\mu_X : M \to \mathbb{R}$ defined by


$$\mu_X(m) = \mu(m)(X)$$


is the Hamiltonian function generating the vector
field $X^\#$.

2. $\mu : M \to \mathfrak{g}^*$ is equivariant (where $G$ acts on $\mathfrak{g}^*$ by
the coadjoint action).


Remark 1 In this article, we shall only consider
actions of compact connected Lie groups, although
the definition of Hamiltonian group action may be
extended to noncompact groups. In particular,
unless otherwise specified the term "torus" refers
to the compact torus $T \cong U(1)^n$.


Remark 2 (Existence and uniqueness of moment
maps). One sees that $\mathcal{L}_{X^\#}\omega = d(\iota_{X^\#}\omega)$, so that $\iota_{X^\#}\omega$
is closed. The moment map $\mu_X$ exists if and only if
$\iota_{X^\#}\omega$ is also “exact.” The moment map need not
always exist: for example, if $S^1$ acts on $T^2$ by


$$e^{i\theta} : (e^{i\phi_1}, e^{i\phi_2}) \mapsto (e^{i(\phi_1 + \theta)}, e^{i\phi_2})$$


we see that for the standard symplectic form
$\omega = d\phi_1 \wedge d\phi_2$ we have $\iota_{X^\#}\omega = d\phi_2$. Since $\phi_2$ is only
defined mod $2\pi$ we see that the moment map does
not exist as a map into $\mathbb{R}$. Conditions guaranteeing
the existence of a moment map (other than $M$ being
simply connected) include the hypothesis that $G$
is semisimple (Guillemin and Sternberg (1990,
theorem 26.1)); conditions on the existence and
uniqueness of the moment map can be formulated in
terms of Lie algebra cohomology (see Guillemin and
Sternberg (1990)). The obstruction to the existence
of the moment map for a symplectic action of $G$ is
an element of $H^1(\mathfrak{g})$; the obstruction to uniqueness
of the moment map is an element of $H^2(\mathfrak{g})$, where $\mathfrak{g}$
is the Lie algebra of $G$. See Guillemin and Sternberg
(1990, proposition 24.1).


Basic Properties of Moment Maps 


Proposition 1 (Guillemin-Sternberg (1982, 1984))


$$\mathrm{Im}(d\mu_m)^\perp = \mathrm{Lie}(\mathrm{Stab}(m))$$


where $\perp$ denotes the annihilator under the canonical
pairing $\mathfrak{g}^* \otimes \mathfrak{g} \to \mathbb{R}$.
Proof We have

$$\omega(Y^\#_m, Z) = d\mu_Y(Z) = \langle Y, d\mu_m(Z) \rangle$$

for all $Z \in T_m M$. Thus, $Y$ annihilates all $\xi \in \mathrm{Im}(d\mu_m)$
if and only if $Y \in \mathrm{Lie}(\mathrm{Stab}(m))$.


Corollary 1 Zero is a regular value of $\mu$ if and only
if $\mathrm{Stab}(m)$ is finite for all $m \in \mu^{-1}(0)$. In this
situation, $\mu^{-1}(0)$ is a manifold and the stabilizer of
the action at any point in $\mu^{-1}(0)$ is finite.


Example 1 Let $T$ be a torus acting on $M$ and let
$F \subset M^T$ be a component of the fixed-point set. Then
for any $f \in F$, we have $d\mu_f = 0$, so $\mu(F)$ is a point.


Proposition 2 


(i) If $H \subset G$ are two groups acting in a Hamiltonian
fashion on a symplectic manifold $M$, then $\mu_H = \pi \circ \mu_G$,
where $\pi : \mathfrak{g}^* \to \mathfrak{h}^*$ is the projection map. In
other words, if $X \in \mathfrak{h}$, then $\mu_H(m)(X) = \mu_G(m)(X)$
for any $m \in M$. One example that frequently
arises is the case when $H = T$ is a maximal torus of
a compact Lie group $G$.

(ii) More generally if $f : H \to G$ is a Lie group
homomorphism, and the two groups $G$ and $H$
act in a Hamiltonian fashion on a symplectic
manifold $M$, in such a way that the action is
compatible with the homomorphism $f$, then
$\mu_H = f^* \circ \mu_G$, where $f^* : \mathfrak{g}^* \to \mathfrak{h}^*$ is induced from
the homomorphism $f$. (The case (i) is the special
case where $f$ is the inclusion map.)

(iii) If two symplectic manifolds $M_1$ and $M_2$ are acted
on in a Hamiltonian fashion by a group $G$ with
moment maps $\mu_1$ and $\mu_2$, then the moment map
for the diagonal action of $G$ on $M_1 \times M_2$ with the
product symplectic structure is $\mu_1 + \mu_2$.


Example 2 The standard symplectic form on $S^2$ is
$\omega = -d\cos\theta \wedge d\phi = -dz \wedge d\phi$ (where $\theta$ is the polar
angle, $\phi$ is the azimuthal angle, and $z$ is the height
function). The associated moment map for the action
of $U(1)$ on $S^2$ by rotation about the $z$ axis is $\mu(z, \phi) = z$.


Example 3 If $\mathbb{R}^2 \cong \mathbb{C}$ has the symplectic structure
$\omega = dx \wedge dy$, the moment map for the standard
action of $U(1)$ on $\mathbb{R}^2$ with multiplicity $m \in \mathbb{Z}$, in
other words the action


$$u \in U(1) : z \in \mathbb{C} \mapsto u^m z$$

is $\mu(x, y) = -m(x^2 + y^2)/2$.
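
The formula of Example 3 can be checked against the defining condition $\omega(\xi_H, Y) = dH(Y)$: for $\omega = dx \wedge dy$ this gives $\xi_H = (\partial H/\partial y, -\partial H/\partial x)$, which should coincide with the infinitesimal generator of the rotation action. The sketch below does this by finite differences. It assumes that the generator is obtained by differentiating $e^{im\theta} z$ with respect to $\theta$ at $\theta = 0$; the function names, step size, and sample point are illustrative.

```python
import numpy as np

def moment_map(x, y, m):
    """mu(x, y) = -m (x^2 + y^2) / 2, as in Example 3."""
    return -m * (x ** 2 + y ** 2) / 2.0

def hamiltonian_vector_field(H, x, y, h=1e-6):
    """Solve omega(xi, Y) = dH(Y) for omega = dx ^ dy: xi = (dH/dy, -dH/dx)."""
    dH_dx = (H(x + h, y) - H(x - h, y)) / (2.0 * h)
    dH_dy = (H(x, y + h) - H(x, y - h)) / (2.0 * h)
    return np.array([dH_dy, -dH_dx])

def action_generator(x, y, m, h=1e-6):
    """Central-difference derivative of exp(i m theta) z at theta = 0."""
    z = complex(x, y)
    dz = (np.exp(1j * m * h) * z - np.exp(-1j * m * h) * z) / (2.0 * h)
    return np.array([dz.real, dz.imag])

x0, y0, m = 0.7, -1.2, 3
field_from_mu = hamiltonian_vector_field(lambda x, y: moment_map(x, y, m), x0, y0)
field_from_action = action_generator(x0, y0, m)
```

Both vectors equal $(-m y, m x)$ at the sample point, so under the stated convention $\mu$ is the Hamiltonian function generating the action.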


Example 4 Suppose a torus $T$ acts on $\mathbb{C}$ preserving
the standard symplectic structure, and suppose
the action factors through a homomorphism
$\beta : T \to U(1)$ which can be written as


$$\beta(\exp_T X) = \exp_{U(1)}(\beta(X))$$

in terms of a linear map $\beta \in \mathfrak{t}^*$ that maps the integer
lattice of $\mathfrak{t}$ into $\mathbb{Z}$ (in other words, a weight) and the
exponential maps


$$\exp_T : \mathfrak{t} \to T$$

and

$$\exp_{U(1)} : \mathbb{R} \to U(1)$$


(the latter being normalized as $\exp_{U(1)}(t) = e^{2\pi i t}$).
Then, by Proposition 2(ii) and Example 3 we
see that the moment map for the action of $T$ on
$\mathbb{C}$ is


$$\mu(z) = -\frac{1}{2}\, \beta\, |z|^2$$


It follows that if $T$ acts on $\mathbb{C}^n$ via a collection of
weights $\beta_1, \dots, \beta_n \in \mathfrak{t}^*$, then the moment map is


$$\mu(z_1, \dots, z_n) = -\frac{1}{2} \sum_{j=1}^n \beta_j\, |z_j|^2$$


and the image of the moment map is the cone in $\mathfrak{t}^*$
spanned by $\{\beta_1, \dots, \beta_n\}$.


The Symplectic Quotient 


Since the moment map $\mu$ is equivariant, we may
form the symplectic quotient (or Marsden-Weinstein
reduction)


$$M_{\mathrm{red}} = M_0 = \mu^{-1}(0)/G$$


The symplectic structure on $M$ descends to give a
symplectic structure on $M_0$. Corollary 1 implies that
if 0 is a regular value of $\mu$, then $M_0$ is an orbifold.


Remark 3 Another way to formulate Corollary 1 is
that if $G$ acts freely, 0 is a regular value of $\mu$ so
$\mu^{-1}(0)$ is a manifold with a free $G$ action, and hence
$\mu^{-1}(0)/G$ is also a manifold. If the $G$ action is only
locally free, then $\mu^{-1}(0)$ is still a manifold, but the
quotient $\mu^{-1}(0)/G$ is only an orbifold.


Remark 4 The definition of orbifold is due to 
Satake (1957); an alternate formulation is given in 
the paper by Henriques and Metzler (2004) and 
references cited there. 


If $T$ is a torus, then the equivariance condition on
the moment map reduces to invariance, so we may
form the reduced space $M_t = \mu^{-1}(t)/T$ for any
regular value $t \in \mathfrak{t}^*$ of the moment map $\mu$; the
space $M_t$ is a symplectic orbifold for any regular
value $t$ of $\mu$.


Example 5 Let $U(1)$ act diagonally on $\mathbb{C}^n$ equipped
with the standard symplectic structure


$$\omega = \frac{i}{2} \sum_{j=1}^n dz_j \wedge d\bar{z}_j = \sum_{j=1}^n dx_j \wedge dy_j \qquad [1]$$


where $z_j = x_j + i y_j$. The moment map for this action is

$$\mu(z_1, \dots, z_n) = -\frac{1}{2} \sum_{j=1}^n |z_j|^2$$


so the symplectic quotient $\mu^{-1}(-1/2)/U(1)$ is complex
projective space


$$S^{2n-1}/U(1) \cong \mathbb{CP}^{n-1}$$


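More generally we may consider the reduced
space $M_\lambda = \mu^{-1}(O_\lambda)/G$ where $O_\lambda$ is the orbit in $\mathfrak{g}^*$
through $\lambda \in \mathfrak{g}^*$ (coadjoint orbit). All such orbits may
be parametrized by $\lambda \in \mathfrak{t}^*_+$, where $\mathfrak{t}^*_+$ is a chosen
positive Weyl chamber in $\mathfrak{t}^*$.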


Example 6 Let $U(n)$ act on $\mathbb{C}^n$ in the standard way,
where $\mathbb{C}^n$ is equipped with the standard symplectic
structure [1]. The moment map for this action is


ulzi, n) = 52% [2] 
which is the $(j, k)$ element of a matrix in the Lie
algebra of $U(n)$. The standard symplectic form on
$\mathbb{C}^n$ descends under reduction to the standard
symplectic form on $\mathbb{CP}^{n-1}$ (which corresponds to
the Fubini-Study metric).


Example 7 (Coadjoint orbits). Let $\lambda \in \mathfrak{g}^*$. We
define a symplectic structure $\omega_\lambda$ on the coadjoint
orbit $O_\lambda$ (in terms of the vector fields $X^\#, Y^\#$
generated by the action of $X, Y \in \mathfrak{g}$) by
$\omega_\lambda(X^\#, Y^\#) = -\lambda([X, Y])$ at the point $\lambda \in O_\lambda$ (and
everywhere else on the orbit by equivariance). The
moment map for the action of $G$ on $O_\lambda$ with respect
to this symplectic structure is the inclusion of $O_\lambda$
in $\mathfrak{g}^*$. (The symplectic structure on the orbit was
found by Kirillov and Kostant; see, for instance,
Berline et al. (1992, section 7.5).)


Example 8 (The shifting trick). Define a symplectic
structure $\Omega$ on $M \times O_\lambda$ by


$$\Omega = \omega_M - \omega_\lambda$$


Then for the moment map with respect to the
induced action of $G$ on $M \times O_\lambda$ we have


Corollary 2 Combining Example 6 with Proposition
2(ii) we see that for any linear action of a group $G$ on
$\mathbb{CP}^{n-1}$ (i.e., an action factoring through a representation
$G \to U(n)$, or in other words an action descending
from a linear action on $\mathbb{C}^n$) the moment map factors as

$$\mu = \pi \circ \tilde{\mu}$$

where $\tilde{\mu}: \mathbb{CP}^{n-1} \to \mathfrak{u}(n)^*$ is given in [3] below, and
$\pi: \mathfrak{u}(n)^* \to \mathfrak{g}^*$ is the projection map.


In particular, one often requires for a projective
manifold $M$ (i.e., a compact complex manifold with
an embedding into $\mathbb{CP}^{n-1}$) that the action of $G$
extends to a linear action on $\mathbb{CP}^{n-1}$. Thus, moment
maps for such linear actions are given by [3]
composed with $\pi$ and with the embedding of $M$
into $\mathbb{CP}^{n-1}$ (see also Cotangent Bundle Reduction,
Poisson Reduction, Symmetry and Symplectic
Reduction).


Reduction in Stages 


Suppose a compact Lie group $G$ acts in a Hamiltonian
fashion on a symplectic manifold $M$, and $H$ is a
normal subgroup of $G$. (For example, this hypothesis
is satisfied if both $H$ and $G$ are tori.) Suppose
also that 0 is a regular value for $\mu_H$ and $\mu_G$. Then
the symplectic quotient $\mu_H^{-1}(0)/H$ is acted on
naturally by the quotient group $G/H$, and this
action is Hamiltonian; furthermore, the symplectic
quotient of $\mu_H^{-1}(0)/H$ by $G/H$ is naturally
isomorphic to $\mu_G^{-1}(0)/G$. (This result is known as
"reduction in stages.")

Let $M$ be a symplectic manifold equipped with the
Hamiltonian action of a torus $T$. Let $H \subset T$ be a Lie
subgroup of $T$ (so $H$ is a torus whose dimension is
smaller than the dimension of $T$). Let $\mu_T: M \to \mathrm{Lie}(T)^*$
and $\mu_H: M \to \mathrm{Lie}(H)^*$ be the moment maps:
recall that $\mu_H = \pi_H \circ \mu_T$, where $\pi_H: \mathrm{Lie}(T)^* \to \mathrm{Lie}(H)^*$
is the standard projection.

For any $\eta \in \mathrm{Lie}(H)^*$ we may form the reduced
space $M_\eta = \mu_H^{-1}(\eta)/H$. This is equipped with a
Hamiltonian action of $T/H$.


Example 9 Let $U(n)$ act on $\mathbb{C}^n$ in the standard way.
This action descends to an action on $\mathbb{CP}^{n-1}$, which
is the symplectic quotient of $\mathbb{C}^n$ under the action of
the diagonal $U(1)$ subgroup of $U(n)$. Hence, the
moment map $\tilde{\mu}$ for the action of $U(n)$ on $\mathbb{CP}^{n-1}$ is
given by the formula

$$\tilde{\mu}([z_1 : \dots : z_n])_{jk} = \frac{i}{2}\, \frac{z_j \bar{z}_k}{\sum_{l} |z_l|^2} \qquad [3]$$

which comes from the moment map [2] for the
action of $U(n)$ on $\mathbb{C}^n$.
The Normal Form Theorem


There is a neighborhood of $\mu^{-1}(0)$ on which the
symplectic form is given in a standard way related to
the symplectic form $\omega_0$ on $M_0$ (see, e.g., Guillemin
and Sternberg (1990, sections 39-41)).


Proposition 3 (Normal form theorem). Assume 0 is
a regular value of $\mu$ (so that $\mu^{-1}(0)$ is a smooth
manifold and $G$ acts on $\mu^{-1}(0)$ with finite stabilizers).
Then there is a neighborhood $U \cong \mu^{-1}(0) \times \{z \in \mathfrak{g}^*, |z| \le h\}
\subset \mu^{-1}(0) \times \mathfrak{g}^*$ of $\mu^{-1}(0)$ on which the
symplectic form is given as follows. Let
$P = \mu^{-1}(0) \xrightarrow{q} M_0$ be the orbifold principal $G$-bundle
given by the projection map $q: \mu^{-1}(0) \to \mu^{-1}(0)/G$,
and let $\theta \in \Omega^1(P) \otimes \mathfrak{g}$ be a connection for it. Let $\omega_0$
denote the induced symplectic form on $M_0$, in other
words $q^*\omega_0 = i_0^*\omega$. Then if we define a 1-form $\tau$ on
$U \subset P \times \mathfrak{g}^*$ by $\tau_{p,z} = z(\theta)$ (for $p \in P$ and $z \in \mathfrak{g}^*$), the
symplectic form on $U$ is given by

$$\omega = q^*\omega_0 + d\tau \qquad [4]$$

Further, the moment map on $U$ is given by
$\mu(p, z) = z$.


Corollary 3 Let $t_0$ be a regular value for the
moment map for the Hamiltonian action of a torus
$T$ on a symplectic manifold $M$. Then in a neighborhood
of $t_0$, all symplectic quotients $M_t$ are diffeomorphic
to $M_{t_0}$ by a diffeomorphism under which
$\omega_t = \omega_{t_0} + (t - t_0, d\theta)$, where $\theta \in \Omega^1(\mu^{-1}(t_0)) \otimes \mathfrak{t}$ is a
connection for the action of $T$ on $\mu^{-1}(t_0)$.
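
As a brief check of Corollary 3, assume the normal form $\omega = q^*\omega_0 + d\tau$ with $\tau_{p,z} = z(\theta)$ from Proposition 3, applied at the regular value $t_0$ in place of 0 (for a torus the moment map may be shifted by a constant). The variation formula then follows by restricting to a level set of $z$. The pairing notation $\langle \cdot, \cdot \rangle$ is introduced here only for this sketch:

```latex
\begin{align*}
d\tau &= \langle dz \wedge \theta \rangle + \langle z, d\theta \rangle, \\
\omega\big|_{\{z = t - t_0\}} &= q^*\omega_{t_0} + \langle t - t_0, d\theta \rangle
   \qquad (\text{since } dz = 0 \text{ on the level set}), \\
\omega_t &= \omega_{t_0} + \langle t - t_0, d\theta \rangle
   \qquad \text{on } \mu^{-1}(t)/T \cong \mu^{-1}(t_0)/T .
\end{align*}
```

Because $T$ is abelian, $d\theta$ is the curvature of the connection and is a basic form, which is why the last line makes sense on the quotient.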


Corollary 4 Suppose $G$ acts in a Hamiltonian
fashion on a symplectic manifold $M$, and suppose
0 is a regular value for the moment map $\mu$. Then the
reduced space $M_\lambda = \mu^{-1}(\mathcal{O}_\lambda)/G$ at the orbit $\mathcal{O}_\lambda$
fibers over $M_0 = \mu^{-1}(0)/G$ with fiber the orbit $\mathcal{O}_\lambda$;
furthermore, if $\pi: M_\lambda \to M_0$ is the projection map,
then the symplectic form $\omega_\lambda$ on $\mu^{-1}(\mathcal{O}_\lambda)/G$ is given
as $\omega_\lambda = \pi^*\omega_0 + \Omega_\lambda$, where $\omega_0$ is the symplectic form
on $M_0$ and $\Omega_\lambda$ restricts to the standard Kirillov-Kostant
symplectic form on the fiber.


Convexity Theorems 


Theorem 1 (Atiyah (1982); Guillemin-Sternberg
(1982 and 1984)). Suppose $M$ is a connected
compact symplectic manifold equipped with a Hamiltonian
action of a torus $T$. Then the image $\mu(M)$ is a
convex polytope, the convex hull of $\{\mu(F)\}$, where $F$ are
the components of the fixed-point set of $T$ in $M$.


Example 10 Consider the orbits $\mathcal{O}_t$ of $SU(2)$ in
$\mathfrak{su}(2)^* \cong \mathbb{R}^3$ through $t \in \mathbb{R}^+$. The image of the
moment map for the action of the maximal torus
$T = U(1)$ is the interval $[-t, t]$.


604 Hamiltonian Group Actions 


Example 11 When $\mathcal{O}_\lambda$ is the coadjoint orbit
(through $\lambda \in \mathfrak{t}^*$) for a compact Lie group $G$ with
maximal torus $T$, the image $\mu_T(\mathcal{O}_\lambda)$ of the moment
map $\mu_T$ for the action of the maximal torus $T$ is the
convex hull $\mathrm{Conv}\{w\lambda : w \in W\}$, where $W$ is the Weyl
group.


The convexity theorem above can be generalized
to actions of nonabelian groups. If $M$ is a connected
compact symplectic manifold equipped with a
Hamiltonian action of a compact Lie group $G$ with
maximal torus $T$ and positive Weyl chamber $\mathfrak{t}^*_+$,
then the intersection of the image $\mu(M)$ of the
moment map with the positive Weyl chamber $\mathfrak{t}^*_+$ (in
other words, a fundamental domain for the action
of the Weyl group on $\mathfrak{t}^*$) is a convex polytope.
This result is due to Kirwan (1984b) and for Kähler
manifolds to Guillemin and Sternberg (1982 and
1984).

The proofs of Atiyah and Guillemin-Sternberg are 
based on Morse theory applied to the moment map. 
A key ingredient in the proofs is to establish that the 
fibers of the moment map are connected. 


The Moment Polytope 


Given a compact symplectic manifold $M$ equipped
with the Hamiltonian action of a torus $T$, we see
that there is an associated polytope $P$, the "moment
polytope." The fibers of the moment map $\mu$ are
preserved by the action of $T$, so the value of $\mu$
parametrizes a family $\{M_t\}$ of symplectic quotients.
By Theorem 1 the moment polytope is the convex
hull of the images of the fixed-point set under the
moment map.

By Proposition 1, we see that the moment
polytope is decomposed according to the stabilizers
of points in the preimage, and the critical values of
the moment map are the images $\mu_T(W_i)$ of the
fixed-point sets $W_i$ of one-parameter subgroups $S_i$
of $T$. These critical values form hyperplanes
("walls") which subdivide the moment polytope:
the complement of the walls is a collection of open
regions consisting of regular values of the moment
map.


Example 12 The group $SU(3)$ has maximal torus
$T = U(1)^2$. We identify $\mathfrak{g}^*$ with $\mathfrak{g}$ via the bi-invariant
inner product (i.e., the Killing form) on $\mathfrak{g}$, and thus
identify $\mathfrak{t}^*$ with $\mathfrak{t}$. For $\lambda \in \mathfrak{t}$, the Weyl group images
of $\lambda$ are the six vertices of a hexagon: the "walls" in
the moment polytope for the action of $T$ on the
coadjoint orbit $\mathcal{O}_\lambda$ arising from the action of $G$ on
$\mathfrak{g}^*$ through $\lambda \in \mathfrak{t}^*$ are the edges of the hexagon
(exterior walls) and the three lines connecting
opposite vertices (interior walls).
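
The six vertices in Example 12 can be generated explicitly. The Python sketch below assumes the usual model for $SU(3)$, which the text does not spell out: $\mathfrak{t}$ is identified with triples $(a, b, c)$ satisfying $a + b + c = 0$, and the Weyl group $S_3$ acts by permuting the entries. The sample value of `lam` and the function names are illustrative.

```python
from itertools import permutations


def weyl_orbit(lam):
    """Images of lam = (a, b, c), a + b + c = 0, under S_3 permuting entries."""
    return sorted(set(permutations(lam)))


def project(v):
    """Coordinates of v in an orthonormal basis of the plane a + b + c = 0."""
    a, b, c = v
    return ((a - b) / 2 ** 0.5, (a + b - 2 * c) / 6 ** 0.5)


lam = (2.0, 0.5, -2.5)  # a generic element: all entries distinct
vertices = [project(v) for v in weyl_orbit(lam)]
```

For a generic `lam` the orbit has six points, all at the same distance from the origin because the Weyl group acts by isometries; their convex hull is the hexagon of Example 12.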


Toric Manifolds 


Definition 1 A toric manifold is a compact
symplectic manifold $M$ of dimension $2n$ equipped
with the effective Hamiltonian action of a torus $T$ of
dimension $n$.


Example 13 Complex projective space $\mathbb{CP}^n$ with
the obvious Hamiltonian action of $U(1)^n \subset U(1)^{n+1}$
is a toric manifold.


Example 14 A special case of Example 13 is the
2-sphere $S^2 \cong \mathbb{CP}^1$ (with the action of $U(1)$ given by
rotation around one axis). The 2-sphere is a toric
manifold.


Elementary Properties of Toric Manifolds 


If $M$ is a toric manifold, the fiber of the moment map
for the action of $T$ is an orbit of the action. Hence,
the symplectic quotient $M_t$ at any value $t \in \mathfrak{t}^*$ is a
point (if it is nonempty).

The regular values of $\mu$ are the interior points of
the moment polytope $P$. All points in the preimage
$\mu^{-1}(\partial P)$ are fixed points of some one-parameter
subgroup of $T$. Points in the interior of a face $P_j$ of
dimension $j$ are fixed by a subtorus of $T$ of
dimension $n - j$. Hence, each fiber of $\mu$ over a
point in $P_j$ is a quotient torus of dimension $j$. In
particular, the vertices of the polytope are the
images of the components of the fixed-point set of
the whole torus $T$, and the inverse image of a vertex
is contained in the fixed-point set of $T$.

The push-forward function $\mu_*(\omega^n/n!)$ under the
moment map is just the characteristic function of the
moment polytope.


Delzant's Theorem 


In fact, toric manifolds are characterized by their
moment polytopes. A theorem of Delzant (1988)
says that any polytope $P$ satisfying appropriate
hypotheses (a simple polytope) is the moment
polytope for some toric manifold; furthermore, if
two toric manifolds acted on effectively by a torus $T$
have the same moment polytope, then they are
$T$-equivariantly symplectomorphic. The first statement
is proved by constructing a toric manifold
which has the polytope $P$ as its moment polytope; if
$P$ has $d$ faces of codimension 1, one constructs the
toric manifold $M$ as a symplectic quotient of a
vector space $V \cong \mathbb{C}^d$ by the linear action of a torus
$T' \cong U(1)^{d-n}$. The torus $T \cong U(1)^n$ acting on $M$ is
then obtained by reduction in stages, as the quotient
of $U(1)^d$ by $T'$.

The construction of a toric manifold whose
moment polytope is a given simple polytope is
given in Guillemin (1994, chapter 1). The second
statement (namely that toric manifolds are classified
by their moment polytopes) is proved in Delzant
(1988).
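
The "appropriate hypotheses" are not spelled out in the text. In the usual formulation, taken here as an assumption, the polytope must be simple and rational, and at each vertex the primitive integral normals of the adjacent facets must form a basis of the lattice. For a polygon this reduces to a determinant test on consecutive normals, sketched below in Python; the function name and the sample polygons are illustrative.

```python
def is_delzant_polygon(normals):
    """Check the lattice-basis condition at every vertex of a polygon.

    `normals` lists the primitive inward integer normals of the edges in
    cyclic order; consecutive edges meet at a vertex, and the condition
    is that their normals have determinant +1 or -1.
    """
    k = len(normals)
    for i in range(k):
        (a, b), (c, d) = normals[i], normals[(i + 1) % k]
        if abs(a * d - b * c) != 1:
            return False
    return True


# Standard triangle (moment polygon of CP^2 with its 2-torus action).
triangle = [(1, 0), (0, 1), (-1, -1)]
# A triangle failing the condition at one vertex (an orbifold point).
skew_triangle = [(1, 0), (0, 1), (-1, -2)]
```

The first polygon passes the test and the second does not, which is consistent with the fact that only the first arises from a smooth toric manifold.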


Example 15 The moment polytope for the action
of $U(1)^n$ on $\mathbb{CP}^n$ is the $n$-simplex. This action
descends from the action of $U(1)^{n+1}$ on $\mathbb{C}^{n+1}$, using
reduction in stages: recall from Example 5 that we
constructed $\mathbb{CP}^n$ as the symplectic quotient of $\mathbb{C}^{n+1}$
by the standard action of $U(1)$.
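
A quick numerical illustration of Example 15, assuming the torus moment map on $\mathbb{CP}^n$ is normalized (up to the sign and the factor $1/2$ used in Example 5) so that its components are $|z_j|^2 / \sum_l |z_l|^2$. The Python names below are illustrative only.

```python
import random


def simplex_coordinates(z):
    """Normalized torus moment map of [z_0 : ... : z_n], up to sign and scale."""
    total = sum(abs(zj) ** 2 for zj in z)
    return [abs(zj) ** 2 / total for zj in z]


random.seed(1)
z = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(4)]
x = simplex_coordinates(z)                   # image of a generic point of CP^3
vertex = simplex_coordinates([0, 0, 1, 0])   # image of a torus-fixed point
```

The coordinates are nonnegative and sum to 1, so the image lies in the standard simplex, and the torus-fixed coordinate points map to its vertices.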


Cohomology Rings of Symplectic 
Quotients 


For material on the equivariant cohomology of
symplectic manifolds equipped with Hamiltonian
group actions and the relation to the fixed-point set,
we refer to Equivariant Cohomology and the Cartan
Model. As in that reference we shall describe
the equivariant cohomology of a Hamiltonian
$G$-manifold using the Cartan model.

Two fundamental results of Kirwan give comple- 
mentary descriptions of the equivariant cohomology 
of a symplectic manifold. 


Kirwan Injectivity


Kirwan's first theorem is the injectivity theorem:


Theorem 2 (Injectivity theorem). If $T$ is a compact
torus and $M$ is a Hamiltonian $T$-space, then the
direct sum of restriction maps to all components of
the fixed-point set

$$\bigoplus_{F} i_F^* : H_T^*(M) \to \bigoplus_{F} H^*(F) \otimes S(\mathfrak{t}^*)$$

is injective.

The proof appears in Kirwan (1984a); this
material is treated in Equivariant Cohomology and
the Cartan Model (theorem 6.6).


Kirwan Surjectivity


Let G be a complex torus, and let 0 be a regular
value of the moment map $\mu$. Suppose $M$ is a
compact symplectic manifold equipped with a
Hamiltonian action of a compact Lie group $G$.
There is a natural map $\kappa: H_G^*(M) \to H^*(M_{\mathrm{red}})$
defined by

$$\kappa: H_G^*(M) \to H_G^*(\mu^{-1}(0)) \cong H^*(M_{\mathrm{red}})$$

(where the first map is the restriction map and the
second is the identification of $H_G^*(Z)$ with $H^*(Z/G)$
when $G$ acts locally freely on $Z$ and the cohomology
is taken with rational coefficients). The map $\kappa$ is
obviously a ring homomorphism. Kirwan's second
theorem treats the image of $\kappa$.


Theorem 3 (Surjectivity theorem). Under the
above hypotheses, the map $\kappa$ is surjective.


The proof of this theorem (Kirwan (1984a, 5.4 and
8.10); see also Kirwan (1992, section 6)) uses the
Morse theory of the "Yang-Mills function" $|\mu|^2: M \to \mathbb{R}$
to define an equivariant stratification of $M$
by strata $S_\beta$ which flow under the gradient flow of
$|\mu|^2$ to a critical set $C_\beta$ of $|\mu|^2$. One shows that the
function $|\mu|^2$ is equivariantly perfect (i.e., that the
Thom-Gysin (long) exact sequence in equivariant
cohomology decomposes into short exact sequences),
so that one may build up the cohomology as

$$H_G^*(M) \cong H_G^*(\mu^{-1}(0)) \oplus \bigoplus_{\beta \ne 0} H_G^*(S_\beta)$$

Here, the stratification by $S_\beta$ has a partial order $>$;
thus, one may define an open dense set $U_\beta = M - \bigcup_{\gamma > \beta} S_\gamma$,
which includes the open dense stratum $S_0$ of
points that flow into $\mu^{-1}(0)$ (note $S_0$ retracts onto
$\mu^{-1}(0)$). The equivariant Thom-Gysin sequence is

$$\cdots \to H_G^{n-2d(\beta)}(S_\beta) \xrightarrow{(i_\beta)_*} H_G^{n}(U_\beta) \to H_G^{n}(U_\beta - S_\beta) \to \cdots$$

To show that the Thom-Gysin sequence splits into
short exact sequences, it suffices to know that the
maps $(i_\beta)_*$ are injective. Since $i_\beta^*(i_\beta)_*$ is multiplication
by the equivariant Euler class $e_\beta$ of the normal
bundle to $S_\beta$, injectivity follows because this
equivariant Euler class is not a zero divisor (see
Kirwan (1984a, 5.4) for the proof).

Because $\kappa$ is a surjective ring homomorphism, it
follows that

$$H^*(M_{\mathrm{red}}) \cong H_G^*(M)/\mathrm{Ker}(\kappa)$$


The above theorem is also valid when G is the
complexification of a compact semisimple Lie
group. In this case, one must reduce at 0 (because
of the condition that the moment map is equivariant,
since 0 is the only value which is invariant
under the coadjoint action). The case of reducing at
coadjoint orbits can be treated using the proof for
the case of reducing at 0 via the shifting trick
(Example 8).

Several recent articles (Jeffrey and Kirwan (1995,
1997), Tolman and Weitsman 2003) compute
$\mathrm{Ker}(\kappa)$. Some articles compute $\mathrm{Ker}(\kappa)$ in specific
examples, notably the action of $S^1$ on products of
two-dimensional spheres of general radii.


The Residue Formula 


One approach to identifying $\mathrm{Ker}(\kappa)$ is the "residue
formula," Jeffrey and Kirwan (1995), theorem 8.1:
Theorem 4 (Jeffrey and Kirwan (1995), corrected
as in Jeffrey and Kirwan (1997)).

Let $\eta \in H_G^*(M)$ induce $\eta_0 \in H^*(M_{\mathrm{red}})$. Then we
have


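$$\int_{M_{\mathrm{red}}} \kappa(\eta)\, e^{i\omega_0} = n_0\, C^G\, \mathrm{Res}\Big( D^2(X) \sum_{F \in \mathcal{F}} h_F^\eta(X)\, [dX] \Big) \qquad [5]$$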


where $n_0$ is the order of the stabilizer in $G$ of a
generic element of $\mu^{-1}(0)$, and the constant $C^G$ is
defined by

$$C^G = \frac{(-1)^{s + n_+}}{|W|\, \mathrm{vol}(T)} \qquad [6]$$


We have introduced $s = \dim G$ and $l = \dim T$; here
$n_+ = (s - l)/2$ is the number of positive roots. Also,
$\mathcal{F}$ denotes the set of components of the fixed-point
set of $T$, and if $F$ is one of these components, then
the meromorphic function $h_F^\eta$ on $\mathfrak{t} \otimes \mathbb{C}$ is defined by

$$h_F^\eta(X) = e^{i\mu(F)(X)} \int_F \frac{i_F^*\big(\eta(X)\, e^{i\omega}\big)}{e_F(X)} \qquad [7]$$

and the polynomial $D: \mathfrak{t} \to \mathbb{R}$ is defined by $D(X) = \prod_{\gamma > 0} \gamma(X)$,
where $\gamma$ runs over the positive roots of $G$.




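The residue map Res is defined on (a subspace of)
the meromorphic differential forms on $\mathfrak{t} \otimes \mathbb{C}$: its
definition depends on some choices, but the sum of
the residues over all $F \in \mathcal{F}$ is independent of these
choices. When $T = U(1)$, we define the residue on
meromorphic functions of the form $e^{i\lambda X}/X^N$ when
$\lambda \ne 0$ (for $N \in \mathbb{Z}$) by

$$\mathrm{Res}\left(\frac{e^{i\lambda X}}{X^N}\right) =
\begin{cases}
\mathrm{Res}_{X=0}\, \dfrac{e^{i\lambda X}}{X^N}, & \text{if } \lambda > 0 \\
0, & \text{if } \lambda < 0
\end{cases}$$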
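
For $T = U(1)$ the residue can be evaluated in closed form, since the residue at $X = 0$ of $e^{i\lambda X}/X^N$ is the coefficient of $X^{N-1}$ in the exponential series, namely $(i\lambda)^{N-1}/(N-1)!$ for $N \ge 1$ and 0 otherwise. The Python sketch below assumes the sign convention in which the residue is kept for $\lambda > 0$ and set to zero for $\lambda < 0$; the function names are illustrative.

```python
from math import factorial


def residue_at_zero(lam, N):
    """Ordinary residue at X = 0 of exp(i*lam*X) / X**N."""
    if N < 1:
        return 0j  # the function is holomorphic at 0
    return (1j * lam) ** (N - 1) / factorial(N - 1)


def res(lam, N):
    """One-variable residue with the assumed sign convention on lam."""
    if lam == 0:
        raise ValueError("the definition requires lam != 0")
    return residue_at_zero(lam, N) if lam > 0 else 0j
```

For instance, `res(2.0, 2)` is the coefficient of $X$ in $e^{2iX}$, which is $2i$, while any call with negative `lam` gives zero.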


More generally, the residue is specified by certain
axioms (see Jeffrey and Kirwan (1995, proposition
8.11)), and may be defined as a sum of iterated
multivariable residues $\mathrm{Res}_{X_1 = \lambda_1} \cdots \mathrm{Res}_{X_l = \lambda_l}$ for a
suitably chosen basis of $\mathfrak{t}$ yielding coordinates
$X_1, \dots, X_l$ (see Jeffrey and Kirwan (1997)).


The Tolman-Weitsman Theorem 


The Tolman and Weitsman (2002) theorem is as
follows:


Theorem 5 We have

$$\mathrm{Ker}(\kappa) = \sum_{S} \big( K_+^S \oplus K_-^S \big) \qquad [8]$$

Here, $S$ is a generic circle subgroup of $T$ and $K_+^S$
(resp. $K_-^S$) denotes the set of all equivariant cohomology
classes $\eta$ whose restriction to $\mathcal{F}_+^S$ (resp. $\mathcal{F}_-^S$) is
zero. Here,

$$\mathcal{F}_\pm^S = \{F \in \mathcal{F} : \pm\mu_S(F) > 0\}$$

where $\mu_S$ is the component of the moment map in
the direction of the Lie algebra of $S$.


For more information, see Intersection Theory, 
Moduli Spaces: An Introduction, and Equivariant 
Cohomology and the Cartan Model. 
Introduction


In general relativity there are several levels within 
the framework of symplectic reduction of Einstein’s 
equations at which one could attempt to define a 
Hamiltonian for the gravitational dynamics of a 
spatially closed universe. At the most basic unre- 
duced level, this Hamiltonian is simply a linear 
function of the Einstein constraints and thus 
vanishes for any solution of the field equations. At 
the other extreme, at the deepest fully reduced level, 
one effects a transformation to a complete set of new
canonical variables, the so-called “observables,” 
which Poisson-commute with all of the constraints. 
At this level, the relevant Hamiltonian vanishes 
identically since each of the new canonical variables 
is a constant of the motion. 

There is, however, an intermediate level wherein, 
after making a suitable choice of coordinate gauge 
and imposing the constraint equations, one can 
define a nonvanishing Hamiltonian that generates 
the gauge-fixed and constrained evolution equations 
and whose global infimum as a function on the 
relevant reduced phase space has direct topological 
significance. For the large class of manifolds on 
which this Hamiltonian can be defined, it has the 
attractive feature of globally monotonically decaying 
in the direction of cosmological expansion and thus 
evolves in such a way so as to seek and, in certain 
cases at least, to asymptotically attain its infimum 
value in the limit of this expansion. This Hamilto- 
nian provides in these cases a weak Lyapunov 
function for the dynamics that can be used to 
partially control its global behavior. Since under- 
standing the global behavior of solutions to 
Einstein's equations and its dependence upon the 
spatial topology is one of the central open problems 
in classical general relativity, the mathematical 
properties of this quantity are worthy of study. 


Further information and details regarding the 
authors’ work discussed in this article can be found 
in Fischer and Moncrief (2000, 2002a, b) and in the 
references therein. 


Topological Background 


Einstein's field equations are nonvacuous and 
compatible with the introduction of material sources 
in (n + 1) dimensions for all $n \ge 2$, the case of most
physical interest being of course n= 3. For the field 
equations to be deterministic in a classical sense, 
that is, for the Cauchy problem to be well-posed, it 
is essential that they be formulated on a manifold 
that is globally hyperbolic and, in particular, has a 
product topology $M \times \mathbb{R}$ (roughly, space $\times$ time =
spacetime) where $M$ is a smooth ($C^\infty$) connected
manifold of dimension n and R is the real line. For 
the case of spatially closed universes of interest here, 
M should be closed, that is, compact and without 
boundary. To simplify the analysis further, we also 
assume that M is oriented, that is, orientable and 
an orientation has been chosen. Thus, unless stated 
otherwise, throughout this article M will denote 
a smooth closed connected oriented n-manifold, 
$n \ge 2$, and all maps will be smooth.

Let "$\approx$" denote the diffeomorphic equivalence
relation between smooth manifolds. Let $S^n$ denote
the unit $n$-sphere in Euclidean $(n+1)$-space
$\mathbb{R}^{n+1}$, $n \ge 1$. An $n$-manifold $M$ is trivial if $M \approx S^n$
and nontrivial if $M \not\approx S^n$.

The connected sum $M \# N$ of two closed connected
oriented $n$-manifolds $M$ and $N$ is constructed by
removing the interior of an embedded closed $n$-ball in
$M$ and $N$, respectively, and then identifying the
resulting $S^{n-1}$-boundary components by an
orientation-reversing diffeomorphism of the $(n-1)$-spheres.
The resulting manifold is smooth, connected, closed,
and orientable, and is naturally oriented by the
orientations on $M$ and $N$. Up to orientation-preserving
diffeomorphism, this construction is independent of
the choice of the embeddings of the $n$-balls and of the
choice of the orientation-reversing diffeomorphism
used to join the manifolds together.
Let $M$ be a nontrivial closed connected oriented
$n$-manifold. Then $M$ is prime if $M \approx M_1 \# M_2$
implies that either $M_1 \approx S^n$ or $M_2 \approx S^n$ (but not
both since we are assuming that $M$ is nontrivial).
$M$ is a composite if $M$ can be written as a nontrivial
connected sum, that is, if $M \approx M_1 \# M_2$ where both
$M_1 \not\approx S^n$ and $M_2 \not\approx S^n$.

Note that with this definition, $S^n$ itself is not
prime. This is analogous to the fact that for the
positive integers, the unit 1 is not prime.

Now let $M$ be a connected $n$-manifold without
boundary (not necessarily compact or orientable) and
let $\pi$ be a group. Then $M$ is a $K(\pi, 1)$-manifold if $M$ is
an Eilenberg-MacLane space, that is, if its first
homotopy group (or fundamental group) $\pi_1(M) = \pi$
and if all of its higher homotopy groups are trivial, that
is, $\pi_i(M) = 0$ for $i > 1$ (equivalently, the universal
covering space $\tilde{M}$ of $M$ is contractible). Since the
higher homotopy groups $\pi_i(M)$, $i > 1$, can be interpreted
as the homotopy classes of continuous maps
$S^i \to M$, each such map must be homotopic to a
constant map. Thus a $K(\pi, 1)$-manifold is said to be
aspherical. Moreover, at the level of homotopy, all of
the information about the topology of $M$ is contained
in $\pi_1(M) = \pi$. Thus, in particular, if $f$ is a map between
connected aspherical manifolds that induces an isomorphism
on their fundamental groups, then $f$ is a
homotopy equivalence. Consequently, any two connected
aspherical manifolds are homotopy equivalent
if and only if their fundamental groups are isomorphic.

It is useful to define a connected $n$-manifold $M$ to
be hyperbolizable if there exists a complete Riemannian
metric $g$ on $M$ with constant negative sectional
curvature, $K(g) = \text{constant} < 0$. We introduce this
terminology to emphasize the underlying topology
of manifolds that can support hyperbolic metrics
rather than the geometry of such metrics. Similarly,
$M$ is of flat type if $M$ admits a complete flat
Riemannian metric $g$, $K(g) = 0$, and $M$ is of spherical
type if $M$ admits a complete Riemannian metric $g$
with constant positive sectional curvature,
$K(g) = \text{constant} > 0$. In this latter case, by the
Bonnet-Myers theorem, $M$ is necessarily compact and if $n$
is odd, then by Synge's theorem, $M$ is necessarily
orientable. In fact, all such manifolds have been
classified. As an important example, we note that a
connected 3-manifold $M$ is of spherical type if and only
if it is diffeomorphic to a spherical space form $S^3/\Gamma$,
where $\Gamma$ is a finite subgroup of $SO(4)$ acting freely and
orthogonally, that is, isometrically, on $S^3$.

Within the class of $K(\pi, 1)$-manifolds are all flat-type
and hyperbolizable $n$-manifolds, since any such
manifold is isometrically covered by $\mathbb{R}^n$ in the flat case
and homothetically covered by $\mathbb{H}^n$ in the hyperbolic
case, where $\mathbb{H}^n$ is the standard single-sheeted spacelike
hyperboloid with constant sectional curvature $K = -1$
embedded in $(n+1)$-Minkowski space $\mathbb{R}^{n,1}$.

We now return to our standard assumptions on $M$,
so that $M$ is connected, closed, and oriented. For
$n = 2$, these assumptions restrict the possibilities to
$S^2$, $T^2$, and the orientable higher genus surfaces
$\Sigma_p = T^2 \# \cdots \# T^2$ ($p$ factors) consisting of the
connected sum of $p$ copies of $T^2$, $p \ge 2$. However,
from the point of view of (2 + 1) gravity, unless one
includes material sources or a cosmological constant,
the spherical case is vacuous in that there are no
vacuum solutions of the field equations on $S^2 \times \mathbb{R}$.
The torus case is nonvacuous but the solutions, the
so-called flat Kasner spacetimes, can all be found by
elementary means. Thus only the case of genus $p \ge 2$
surfaces presents problems of interest.
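
The surfaces $\Sigma_p$ built as connected sums of tori can be tracked with the Euler characteristic. For closed surfaces it satisfies $\chi(M \# N) = \chi(M) + \chi(N) - 2$, a standard fact that the text does not state, which gives $\chi(\Sigma_p) = 2 - 2p$. A minimal Python sketch, with illustrative function names:

```python
CHI_SPHERE = 2  # Euler characteristic of S^2
CHI_TORUS = 0   # Euler characteristic of T^2


def euler_char_connected_sum(chi_m, chi_n):
    """Euler characteristic of M # N for closed surfaces M and N."""
    return chi_m + chi_n - 2


def euler_char_genus(p):
    """Euler characteristic of the connected sum of p >= 1 copies of T^2."""
    chi = CHI_TORUS
    for _ in range(p - 1):
        chi = euler_char_connected_sum(chi, CHI_TORUS)
    return chi
```

Taking the connected sum with $S^2$ leaves the Euler characteristic unchanged, which matches the role of the sphere as the trivial factor in the definition of a prime manifold.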

For $n = 3$, although not essential for the program of
reduction, it is convenient to assume the elliptization
conjecture of 3-manifold topology. This conjecture
asserts that a closed connected 3-manifold $M$ with
finite fundamental group $\pi_1(M)$ must be diffeomorphic
to a spherical space form $S^3/\Gamma$, where, in
such a quotient, $\Gamma$ will always be a finite subgroup of
$SO(4)$ acting freely and orthogonally on $S^3$ and thus $\Gamma$
is isomorphic to $\pi_1(M)$.

The simply connected case is the Poincaré con- 
jecture. The full elliptization conjecture is equivalent 
to the Poincaré conjecture and a conjecture asserting 
that the only free actions of finite groups on $S^3$
are equivalent to the standard orthogonal ones. 
The elliptization conjecture is part of Thurston's 
geometrization program (Thurston 1997). For back- 
ground information regarding 3-manifold topology, 
see Hempel (1976) and Jaco (1980). 

Under the assumption of the elliptization conjecture,
the Kneser-Milnor prime decomposition
theorem asserts that if $M$ is nontrivial, then up to
order, $M$ is uniquely diffeomorphic to a finite
connected sum of the following form:

$$M \approx \underbrace{S^3/\Gamma_1 \# \cdots \# S^3/\Gamma_k}_{k \text{ spherical factors}}
\;\#\; \underbrace{(S^1 \times S^2) \# \cdots \# (S^1 \times S^2)}_{l \text{ wormholes (or handles)}}
\;\#\; \underbrace{K(\pi_1, 1) \# \cdots \# K(\pi_m, 1)}_{m \text{ aspherical factors}} \qquad [1]$$

where $k$, $l$, and $m$ are integers $\ge 0$, $k + l + m \ge 1$,
and if either $k$, $l$, or $m$ is 0, terms of that type do not
appear. Moreover, if $k \ge 1$, then each $\Gamma_i$, $1 \le i \le k$,
is a finite nontrivial ($\Gamma_i \ne \{1\}$) subgroup of $SO(4)$
acting freely and orthogonally on $S^3$, and if $m \ge 1$,
then each aspherical factor is a $K(\pi_j, 1)$-manifold,
$1 \le j \le m$, and thus is universally covered by a
contractible manifold.

We remark that although in general a contractible
3-manifold need not be $\mathbb{R}^3$, conjecturally the
universal covering manifold of a $K(\pi, 1)$ 3-manifold
is diffeomorphic to $\mathbb{R}^3$.

In 3-manifold topology, a concept closely related 
to that of a prime manifold is that of an irreducible 
manifold. A closed 3-manifold M is irreducible if 
every embedded 2-sphere in M is the boundary of an 
embedded closed 3-ball. 

An embedded 2-sphere that does not bound such a 
3-ball is essential. Thus in the prime decomposition [1] 
above, M is decomposed along essential 2-spheres. For 
this reason, the prime decomposition is sometimes 
referred to as the sphere decomposition. 

With the exception of $S^3$ which is irreducible but
not prime (by definition of prime) and $S^1 \times S^2$
which is prime but not irreducible, a closed oriented
3-manifold is prime if and only if it is irreducible.
We also remark that the Poincaré conjecture, when
taken in the form that there do not exist any fake
3-cells, is equivalent to every $K(\pi, 1)$ 3-manifold
being irreducible. Thus in this article, since we are
assuming the elliptization conjecture and hence the
Poincaré conjecture, every $K(\pi, 1)$ 3-manifold will
automatically be irreducible.

Examples of the kinds of $K(\pi, 1)$-factors that can
occur in the decomposition [1] are as follows (we will
explain the Seifert and graph designations below):


1. Non-Seifert manifolds. Closed oriented hyperbolizable
manifolds diffeomorphic to $\mathbb{H}^3/\Gamma$, where $\Gamma$ is a
discrete torsion-free (i.e., no nontrivial element has
finite order) co-compact subgroup of the Lie group
$\mathrm{Isom}^+(\mathbb{H}^3)$ of orientation-preserving isometries of
$\mathbb{H}^3$ which is Lie-group isomorphic to the proper
orthochronous Lorentz group $SO^+(1, 3)$.

2. Seifert manifolds. $T^3$ and five other 3-manifolds of
flat type which are finitely covered by $T^3$. Noting
that $\Sigma_1 = T^2$, we remark that the product manifold
$S^1 \times \Sigma_1 = S^1 \times T^2 = T^3$ is included in this class.

3. Seifert manifolds. Product manifolds $S^1 \times \Sigma_p$, $p \ge 2$.

4. Seifert manifolds. Nontrivial circle bundles over
$\Sigma_p$, $p \ge 1$.

5. Graph manifolds. Any 3-manifold which fibers
nontrivially over a circle with fiber $\Sigma_p$, $p \ge 1$. Any
such manifold is obtained by identifying the
boundary components of $[0, 1] \times \Sigma_p$ with an
orientation-reversing diffeomorphism $\Sigma_p \to \Sigma_p$.


Since the handle $S^1 \times S^2$ and spherical manifolds
$S^3/\Gamma$ are well understood, under the assumption of
the elliptization conjecture the task of 3-manifold
topology now reduces to understanding the topology
of the (automatically irreducible) $K(\pi, 1)$-factors that
can occur in the prime decomposition [1]. Since
essential 2-spheres have already been used to
decompose $M$ into its prime components, the idea
now is to use the next simplest 2-manifold, the
2-torus, to probe the irreducible $K(\pi, 1)$-factors.

Let $i: T^2 \to M$ be an embedding of $T^2$ into a
closed oriented 3-manifold $M$. Then the embedded
torus $i(T^2)$, identified with $T^2$, is incompressible if
the induced mapping of fundamental groups
$i_*: \pi_1(T^2) \to \pi_1(M)$ is injective. Thus noncontractible
loops in $T^2$ remain noncontractible when $T^2$ is
embedded in $M$, or, in other words, the ambient
manifold $M$ does not fill in any homotopy hole that
exists in $T^2$ when standing alone.

A closed oriented 3-manifold $M$ is a Seifert-fibered
space, or a Seifert manifold, if $M$ admits a
foliation by circles. For example, if $S^1$ acts freely on
$M$, then $M$ is the total space of an $S^1$-bundle over a
surface $M/S^1$ and $M$ is a Seifert-fibered space (see
examples 2, 3, and 4 above). More generally, if $S^1$
acts without fixed points (locally free), then $M$ is a
Seifert-fibered space, and in either case the fibers of
$M$ are the orbits of the $S^1$-action.

All spherical 3-manifolds are Seifert fibered with
base $S^2$. Also, the product manifold $S^1 \times S^2$ is Seifert
fibered, as are all manifolds finitely covered by $T^3$, and
thus all 3-manifolds of flat type are Seifert fibered.
The only nontrivial connected sum that is a
Seifert-fibered space is $P^3 \# P^3$. No hyperbolizable manifold
is Seifert fibered. Thus the remaining Seifert
manifolds are among the nonhyperbolizable nonflat
type $K(\pi, 1)$-manifolds (i.e., those for which $M$ does
not admit either a hyperbolic or a flat Riemannian
metric).

Graph manifolds generalize Seifert-fibered spaces.
A closed oriented 3-manifold $M$ is a
graph manifold if there exists a finite collection $\{T_i^2\}$ of
disjoint embedded incompressible tori $T_i^2 \subset M$ such
that each component $M_j$ of $M \setminus \bigcup_i T_i^2$ is a Seifert-fibered
space. Thus a graph manifold is a union of
Seifert-fibered spaces glued together by toral automorphisms
along toral boundary components. The collection of
tori may be empty so that, in particular, a
Seifert-fibered manifold is a graph manifold.

We remark that the manifolds described by example
5 above are graph manifolds. We also remark that
graph manifolds are closed under connected sums so
that a graph manifold may be a composite. This
contrasts with the situation for Seifert spaces which,
with the exception of $P^3 \# P^3$, are not composites.

Conjecturally, the most general $K(\pi, 1)$-manifold,
not included in the list above, consists of "gluing
together" across disjoint embedded incompressible
tori a finite collection of finite-volume-type hyperbolizable
manifolds, that is, noncompact manifolds that
admit a finite-volume complete hyperbolic metric,
together with a possibly empty finite collection of
irreducible graph manifolds with toral boundaries.
Thus, overall, in this picture, to decompose an
arbitrary closed oriented 3-manifold $M$ into its
elementary constituents, one first cuts along essential
2-spheres to break $M$ down into its prime factors, that
is, the nontrivial spherical $S^3/\Gamma$-factors, the wormhole
$(S^1 \times S^2)$-factors, and the aspherical $K(\pi, 1)$-factors, as
given by [1]. Then one cuts each nonelementary
$K(\pi, 1)$-factor along incompressible tori to separate
these factors into their final finite-volume-type
hyperbolizable and irreducible graph manifold components.
The graph manifold components can then be further
broken down along incompressible tori into
Seifert-fibered pieces, finally yielding the toral decomposition
of Jaco, Shalen, and Johannson (see Anderson (1997),
Jaco (1980), and the end of the section "The Reduced
Hamiltonian" for further details).

The Thurston (1997) geometrization program,
which implies that every closed oriented 3-manifold
has the structure described by the above prime (or
spherical) and toroidal decomposition, has been the
subject of recent work by G Perelman (see Anderson
(2003) and the references therein) who has argued
that it can be proved by an enhancement of the Ricci
flow program of R Hamilton (see the collected
papers edited by Cao et al. (2003)). Without
entering into the technical issues surrounding the
completeness of Perelman's proof, one can simply
limit one's attention to 3-manifolds of the above type.
If geometrization is correct, then no 3-manifolds of
interest have been excluded.

Returning to the general case of $n$-manifolds, in the program of
Hamiltonian reduction of Einstein's equations, an important consideration
is under what topological conditions on $M$ the conformal classes of
metrics on $M$ can be uniquely represented by a given metric in each class.
To analyze that question, we introduce the concept of the Yamabe type of a
manifold.

Let $M$ be a connected closed oriented $n$-manifold, $n \ge 3$. There is no
topological obstruction to the existence of Riemannian metrics with
constant negative scalar curvature, so all such manifolds admit a
Riemannian metric $g$ such that $R(g) = -1$. However, there are topological
obstructions for zero scalar curvature and positive constant scalar
curvature metrics on $M$. To help categorize these topological
obstructions, we introduce the following terminology:


1. $M$ is of positive Yamabe type if $M$ admits a Riemannian metric $g_1$
with scalar curvature $R(g_1) = 1$;

2. $M$ is of zero Yamabe type if $M$ admits a Riemannian metric $g_0$ with
$R(g_0) = 0$, but no Riemannian metric $g$ with $R(g) = 1$; and

3. $M$ is of negative Yamabe type if $M$ admits no Riemannian metric $g$
with $R(g) = 0$.


The definition of Yamabe type partitions the class of connected closed
oriented $n$-manifolds, $n \ge 3$, into three classes that are mutually
exclusive and exhaustive. The following rather complete topological
information regarding 3-manifolds of negative Yamabe type is known.

Let M be a connected closed oriented 3-manifold. 
Assume that the Poincaré conjecture is true. Then M 
is of negative Yamabe type if and only if M satisfies 
one of the following three mutually exclusive 
conditions: 


1. $M$ is hyperbolizable (and thus is a $K(\pi,1)$-manifold; see example 1
of $K(\pi,1)$-manifolds);

2. $M$ is a nonhyperbolizable nonflat type $K(\pi,1)$-manifold (see
examples 3, 4, and 5 of $K(\pi,1)$-manifolds);

3. $M$ has a nontrivial connected sum decomposition (i.e., $M$ is a
composite) in which at least one factor is a $K(\pi,1)$-manifold; that is,
$M \approx M' \# K(\pi,1)$, where $M' \not\approx S^3$. In this case the
$K(\pi,1)$-factor may be either of flat type or hyperbolizable.


We remark that (1) is the vast class of closed 
oriented hyperbolizable 3-manifolds. We also 
remark that the six closed orientable 3-manifolds 
of flat type, although $K(\pi,1)$-manifolds, are 
excluded from (2) as they are not of negative 
Yamabe type (they are of zero Yamabe type). Lastly 
we remark that if M is of negative Yamabe type and 
Seifert fibered, then M must be of type (2) (see 
remarks on Seifert-fibered spaces above). 

In any dimension $n \ge 3$, a manifold $M$ of negative Yamabe type has the
property that it admits no Riemannian metric $g$ having scalar curvature
$R(g) \ge 0$ everywhere on $M$, or, in other words, every Riemannian metric
on $M$ has scalar curvature which is negative somewhere. For such a
manifold $M$, Yamabe's theorem asserts that each Riemannian metric $g$ on
$M$ is uniquely globally conformal to a metric $\gamma$ with scalar
curvature $R(\gamma) = -1$ (see also [21]). Thus one can represent the
conformal classes of Riemannian metrics on $M$ in a suitable function space
setting by an infinite-dimensional submanifold

$$\mathcal{M}_{-1} = \mathcal{M}_{-1}(M) = \{\gamma \in \mathcal{M} \mid R(\gamma) = -1\} \qquad [2]$$

of the space $\mathcal{M} = \mathcal{M}(M) = \mathrm{Riem}(M)$ of
Riemannian metrics on $M$ (see Fischer and Marsden (1975) for details). For
this reason, we refer to metrics $\gamma \in \mathcal{M}_{-1}$ as conformal
metrics.


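The quotient of $\mathcal{M}_{-1}$ by the natural action of
$\mathcal{D}_0 = \mathcal{D}_0(M) = \mathrm{Diff}_0(M)$, the connected
component of the identity of the diffeomorphism group
$\mathcal{D} = \mathcal{D}(M) = \mathrm{Diff}(M)$ of $M$, defines an orbit
space (not necessarily a manifold) $\mathcal{T} = \mathcal{T}(M)$,

$$\mathcal{T} = \frac{\mathcal{M}_{-1}}{\mathcal{D}_0} \qquad [3]$$

which, when $M$ is of negative Yamabe type, we define as the Teichmüller
space of conformal structures on $M$.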

In two dimensions, in the case of a higher-genus manifold $\Sigma_p$,
$p \ge 2$, this construction leads precisely to the conventional
Teichmüller space, as discussed by Fischer and Tromba (1984). In this case
the resulting Teichmüller space

$$\mathcal{T}_p = \frac{\mathcal{M}_{-1}(\Sigma_p)}{\mathcal{D}_0(\Sigma_p)} \approx \mathbb{R}^{6p-6} \qquad [4]$$

is then a manifold diffeomorphic to $\mathbb{R}^{6p-6}$, which then plays
the role of the natural reduced configuration space for the Einstein
equations in $(2+1)$ dimensions. Moreover, these constructions can be
carried out globally using known global cross sections for the
$\mathcal{D}_0(\Sigma_p)$ action on $\mathcal{M}_{-1}(\Sigma_p)$. These
global cross sections can then be used to provide an explicit model for the
Teichmüller space $\mathcal{T}_p$ as a finite-dimensional subspace of
$\mathcal{M}_{-1}(\Sigma_p)$.

For $n = 3$, $\mathcal{T} = \mathcal{T}(M)$ plays the analogous role for
the reduced field equations in $(3+1)$ dimensions. Moreover, for many
3-manifolds it is possible to show that $\mathcal{T}$ is itself an
infinite-dimensional contractible manifold, rather than something more
general such as an orbifold or a stratified union of manifolds. For
technical simplicity, we shall assume throughout this article that
$\mathcal{T}$ is a manifold. Our results remain valid in the more general
case but in that case one must work on stratified spaces (see Fischer
(1970) for results on the structure of orbit spaces when they are not
manifolds).

For higher-dimensional manifolds there is no analog of the Thurston
geometrization program. Indeed, it is known that the set of closed
$n$-manifolds for $n \ge 4$ is so rich that no purely algebraic
classification is possible. Nevertheless, for manifolds of negative Yamabe
type, every Riemannian metric $g$ is still uniquely conformal to a metric
$\gamma \in \mathcal{M}_{-1}$ so that the orbit space
$\mathcal{T} = \mathcal{M}_{-1}/\mathcal{D}_0$ still represents the
Teichmüller space of conformal equivalence classes on $M$. However, in
these higher-dimensional cases, very little is known about the structure of
$\mathcal{T}$.

The Field Equations 


Relative to a global time coordinate $t = x^0$ and local spatial
coordinates $(x^1, \ldots, x^n)$ on a connected closed oriented
$n$-manifold $M$, one can express the line element of an arbitrary
$(n+1)$-Lorentzian metric with signature $(-+\cdots+)$ ($n$ positive
signs) in the form

$$ds^2 = {}^{(n+1)}g_{\mu\nu}\,dx^\mu dx^\nu = -N^2 dt^2 + g_{ij}\,(dx^i + X^i dt)(dx^j + X^j dt) \qquad [5]$$

where ${}^{(n+1)}g_{\mu\nu}$ denotes the components of the spacetime
metric, $0 \le \mu, \nu \le n$, where the Riemannian metric $g$ with
components $g_{ij}$ is the first fundamental form induced on each
$t = \text{constant}$ hypersurface, where the time-dependent positive
function $N = N(x,t) > 0$ is referred to as the lapse function, and where
the time-dependent spatial vector field $X = X(x,t)$ with components
$X^i = {}^{(n+1)}g_{0j}\,g^{ij}$, where $g^{ij}$ denotes the inverse of the
spatial metric $g_{ij}$, is referred to as the shift vector field.

Let $\ell$ denote the dimension length. In this article we use the
convention that the spatial coordinates $(x^1, \ldots, x^n)$ are always
dimensionless, but the time coordinate $t$ may have a dimension (see [19]
and [36]). Since the line element $ds^2$ [5] has dimension $\ell^2$ and the
spatial coordinates are dimensionless, the physical spatial metric
coefficients $g_{ij}$ also have dimension $\ell^2$. If the time coordinate
$t$ has a dimension, then the dimension of the lapse function $N$ is such
that the quantity $N\,dt$ has dimension $\ell$ and the dimension of the
shift vector field $X$ is such that the quantity $X\,dt$ is dimensionless.

We now briefly consider the canonical formula- 
tion of Einstein's equations. For more information 
regarding this formulation, see Arnowitt, Deser, and 
Misner (1962) (ADM) or Fischer and Marsden 
(1972) for a global perspective. We remark that 
the canonical formulation of gravity itself is local 
and is valid for any spatial topology of M. However, 
as we shall see, Hamiltonian reduction of gravity 
along the lines described in this article requires the 
topological restriction that M be of negative 
Yamabe type. 

The standard definition of the second fundamental form $k$, or extrinsic
curvature, induced on a $t = \text{constant}$ hypersurface leads to the
coordinate formula

$$k_{ij} = -\frac{1}{2N}\left(\frac{\partial g_{ij}}{\partial t} - X_{i|j} - X_{j|i}\right) \qquad [6]$$

where the vertical bar signifies covariant differentiation with respect to
the spatial metric $g$ and spatial indices are raised and lowered using
this metric. The natural momentum variable conjugate to $g$ turns out to be
the 2-contravariant symmetric tensor density $\pi$ (that is, $\pi$ is a
relative tensor of weight 1) whose components in a positively oriented
local coordinate chart $(x^1, \ldots, x^n)$, that is, in a chart in the
orientation atlas of $M$, are given by

$$\pi^{ij} = -\sqrt{\det g_{kl}}\,\left(k^{ij} - (\mathrm{tr}_g k)\,g^{ij}\right) \qquad [7]$$

where $k^{ij} = g^{ik}g^{jl}k_{kl}$ is the contravariant form of $k$, and
where

$$\tau = \mathrm{tr}_g k = g^{ij}k_{ij} \qquad [8]$$

is the trace of the second fundamental form, or the mean (extrinsic)
curvature. From the coordinate formula [6] for the extrinsic curvature, we
see that the components $k_{ij}$ have dimension $\ell^2/\ell = \ell$ and
thus the mean curvature $\tau = \mathrm{tr}_g k = g^{ij}k_{ij}$ has the
dimension $\ell^{-2}\,\ell = \ell^{-1}$.
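
The dimension counting used up to this point can be collected in one place. The block below is only a bookkeeping sketch; the bracket notation $[Q]$ for "the length dimension of $Q$" is an assumption introduced here for illustration and does not appear in the article.

```latex
% [Q] = length dimension of Q (illustrative notation, not from the article)
\begin{align*}
[ds^2] &= \ell^2, \qquad [x^i] = 1
  \;\Longrightarrow\; [g_{ij}] = \ell^2, \quad [g^{ij}] = \ell^{-2}, \\
[N\,dt] &= \ell, \qquad [X^i\,dt] = 1, \\
[k_{ij}] &= \frac{[g_{ij}]}{[N\,dt]} = \frac{\ell^2}{\ell} = \ell, \\
[\tau] &= [g^{ij}]\,[k_{ij}] = \ell^{-2}\,\ell = \ell^{-1}.
\end{align*}
```

Each line follows from the line element [5] and the coordinate formula [6] together with the convention that the spatial coordinates are dimensionless.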

Let $\sqrt{\det g}$ denote the (global) scalar density and $d\mu_g$ denote
the (global) Riemannian measure on $M$ determined by the Riemannian metric
$g$ (note that here $d$ is not the exterior derivative). Similarly, let
$\mu_g$ denote the volume element, a nonvanishing $n$-form on $M$,
determined by $g$ and the orientation on $M$. In a positively oriented
local coordinate chart $(x^i) = (x^1, \ldots, x^n)$ on $M$,
$(\sqrt{\det g})_{(x^i)} = \sqrt{\det g_{ij}}$,
$(d\mu_g)_{(x^i)} = \sqrt{\det g_{ij}}\,dx^1 dx^2 \cdots dx^n = \sqrt{\det g_{ij}}\,d^n x$,
where $d^n x = dx^1 dx^2 \cdots dx^n$ is the Lebesgue measure on
$\mathbb{R}^n$, and
$(\mu_g)_{(x^i)} = \sqrt{\det g_{ij}}\,dx^1 \wedge dx^2 \wedge \cdots \wedge dx^n$.
We adopt the convention of suppressing the coordinate-chart designation
$(x^i)$ so that one can, for example, write with some ambiguity
$\sqrt{\det g} = (\sqrt{\det g})_{(x^i)} = \sqrt{\det g_{ij}}$.


We let

$$\mathrm{vol}(M,g) = \int_M \mu_g = \int_M d\mu_g = \int_M \sqrt{\det g}\,d^n x \qquad [9]$$

denote the volume of the Riemannian manifold $(M,g)$, given by either the
integral of the volume $n$-form $\mu_g$ or the Riemannian measure $d\mu_g$
over $M$, which is given in the last integral in its coordinate form using
the suppressed coordinate-chart convention adopted above. As expected, the
spatial physical volume has dimension $(\ell^2)^{n/2} = \ell^n$.

We shall refer to the canonical variables $(g_{ij}, \pi^{ij})$ as the
physical variables, in contrast to the reduced or conformal variables
$(\gamma_{ij}, p^{TTij})$ to be introduced later.

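Note that the mean curvature $\tau = \mathrm{tr}_g k$ is a scalar function
on $M$ whereas $\mathrm{tr}_g\pi = g_{ij}\pi^{ij}$ is a scalar density on
$M$. Taking the trace of [7] expresses the mean curvature in terms of the
canonical variables $(g, \pi)$,

$$\tau = \frac{1}{(n-1)\sqrt{\det g}}\,\mathrm{tr}_g\pi \qquad [10]$$

Using [10], eqn [7] can be inverted to give $k$ in terms of $g$ and $\pi$,

$$k_{ij} = -\frac{1}{\sqrt{\det g}}\left(\pi_{ij} - \frac{1}{n-1}\,g_{ij}\,\mathrm{tr}_g\pi\right) \qquad [11]$$

and then combined with [6] to give the kinematical equation

$$\frac{\partial g_{ij}}{\partial t} = \frac{2N}{\sqrt{\det g}}\left(\pi_{ij} - \frac{1}{n-1}\,g_{ij}\,\mathrm{tr}_g\pi\right) + X_{i|j} + X_{j|i} \qquad [12]$$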
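
The passage from [7] to [10], [11], and [12] is a short algebraic computation. The following is a sketch of the intermediate steps, using only the definitions [6], [7], and [8] and the identity $g_{ij}g^{ij} = n$; it is offered as a worked check rather than as the authors' own derivation.

```latex
\begin{align*}
% trace of [7] with g_{ij}, using g_{ij} g^{ij} = n
\mathrm{tr}_g\pi = g_{ij}\pi^{ij}
  &= -\sqrt{\det g}\,\bigl(g_{ij}k^{ij} - (\mathrm{tr}_g k)\,g_{ij}g^{ij}\bigr)
   = -\sqrt{\det g}\,(\tau - n\tau)
   = (n-1)\sqrt{\det g}\;\tau, \\
% solve [7] for k^{ij} and substitute the expression for tau
k^{ij} &= -\frac{1}{\sqrt{\det g}}\,\pi^{ij} + \tau\,g^{ij}
   = -\frac{1}{\sqrt{\det g}}\Bigl(\pi^{ij} - \frac{1}{n-1}\,(\mathrm{tr}_g\pi)\,g^{ij}\Bigr), \\
% solve [6] for the time derivative of the metric
\frac{\partial g_{ij}}{\partial t} &= -2N\,k_{ij} + X_{i|j} + X_{j|i}.
\end{align*}
```

Lowering the indices in the second line with $g$ and inserting the result into the third line reproduces the kinematical equation.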


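In terms of the canonical variables $(g, \pi)$, a Hamiltonian form for the
action for Einstein's vacuum field equations can be expressed as

$$I_{\mathrm{ADM}}(g,\pi) = \int_I dt \int_M \left(\pi^{ij}\frac{\partial g_{ij}}{\partial t} - N\,\mathcal{H}(g,\pi) - X^i\mathcal{J}_i(g,\pi)\right) d^n x \qquad [13]$$

where $I = [t_0, t_1] \subset \mathbb{R}$ is a closed interval and where
the Hamiltonian (scalar) density $\mathcal{H}(g,\pi)$ and the momentum
(1-form) density $\mathcal{J}(g,\pi)$ are given by

$$\mathcal{H}(g,\pi) = \frac{1}{\sqrt{\det g}}\left(\pi\cdot\pi - \frac{1}{n-1}(\mathrm{tr}_g\pi)^2\right) - \sqrt{\det g}\,R(g) \qquad [14]$$

$$\phantom{\mathcal{H}(g,\pi)} = \frac{1}{\sqrt{\det g}}\left(g_{ik}g_{jl}\,\pi^{ij}\pi^{kl} - \frac{1}{n-1}(g_{ij}\pi^{ij})^2\right) - \sqrt{\det g}\,R(g) \qquad [15]$$

$$\mathcal{J}_i(g,\pi) = 2(\delta_g\pi)_i = -2\,g_{ij}\,\pi^{jk}{}_{|k} \qquad [16]$$

where $\pi\cdot\pi$ is the $g$-metric contraction of $\pi$ with itself, and
where, as above, $R(g)$ is the scalar curvature of the spatial metric. We
also note that each of the three terms in the integrand of [13] is a global
scalar density and thus can be integrated over $M$ without any further
involvement of the metric $g$.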

Variation of $I_{\mathrm{ADM}}$ with respect to the lapse function and
shift vector field yields the constraint equations

$$\mathcal{H}(g,\pi) = 0 \qquad [17]$$

$$\mathcal{J}_i(g,\pi) = 0 \qquad [18]$$

which comprise that subset of the empty space $(n+1)$-Einstein field
equations corresponding to the normal-normal and normal-tangential
projections of the Einstein tensor relative to a $t = \text{constant}$
initial hypersurface. Variation of $I_{\mathrm{ADM}}$ with respect to
$\pi^{ij}$ reproduces the kinematical equation [12], whereas variation of
$I_{\mathrm{ADM}}$ with respect to $g_{ij}$ generates the complementary
tangential-tangential projections of Einstein's equations.

There are no evolution or constraint equations for either the lapse
function $N$ or the shift vector field $X$ and therefore these quantities
must be fixed by either externally imposed or implicitly defined gauge
conditions. A convenient choice, for which a local existence and
well-posedness theorem for the corresponding field equations can be
established in any dimension $n \ge 2$, is given indirectly by imposing
constancy of the mean curvature and a spatial harmonic gauge condition on
each $t = \text{constant}$ slice (see Andersson and Moncrief (2003, 2004)).
These constant mean curvature spatial harmonic (CMCSH) gauge conditions are
given, respectively, by the equations

$$t = \tau \qquad [19]$$

$$g^{ij}\left(\Gamma^k_{ij}(g) - \Gamma^k_{ij}(\hat{g})\right) = 0 \qquad [20]$$

where from [10], $\tau$ is a function of the canonical variables
$(g, \pi)$ and where $\hat{g}$ is some convenient fixed spatial reference
metric (or background metric) on $M$. The latter condition corresponds to
the requirement that the identity map between the Riemannian manifolds
$(M, g)$ and $(M, \hat{g})$ be harmonic. Neither of these conditions
involves the lapse function or shift vector field directly, but their
preservation in time, implemented by demanding that the time derivatives of
the given conditions also hold, leads immediately to a linear elliptic
system for $(N, X^i)$ which determines these variables. The foregoing
formalism is easily extended to the nonvacuum field equations in the
presence of suitable material sources whose field equations are amenable to
a constrained Hamiltonian treatment. To simplify the analysis, such sources
will be ignored in the present discussion.

For the special case of Einstein gravity in $(2+1)$ dimensions, there is an
elegant, alternative, triad-based formulation of the action functional as
an $\mathrm{Isom}(\mathbb{R}^3_1)$-invariant gauge-theoretic Chern-Simons
action, where $\mathrm{Isom}(\mathbb{R}^3_1)$ denotes the full isometry
group, or the Poincaré group (= the inhomogeneous Lorentz group), of
$(2+1)$-Minkowski space $\mathbb{R}^3_1$. For nondegenerate triads the
resulting field equations for this alternative formulation can easily be
shown to be equivalent to those of the conventional formalism when the
latter is re-expressed in terms of triads, but the new formulation allows
for meaningful field equations in the case of degenerate triads as well and
thus suggests a potentially interesting generalization of the theory (see
Carlip (1998) for details).

In any dimension $n \ge 2$, there is a well-known technique, pioneered by
Lichnerowicz (1955), for solving the constraint equations on a constant
mean curvature (CMC) hypersurface (see Choquet-Bruhat and York (1980) and
Isenberg (1995)). Of major importance for the treatment of Hamiltonian
reduction is that if $n = 2$ and $M = \Sigma_p$, $p \ge 2$, or if
$n \ge 3$ and $M$ is of negative Yamabe type, then every Riemannian metric
$g$ on $M$ is uniquely globally pointwise conformal to a metric $\gamma$
which satisfies $R(\gamma) = -1$ (see remark above [2]). Thus, from now on,
we assume this topological condition on $M$. In this case, every Riemannian
metric $g$ on $M$ can be uniquely expressed as

$$g = \begin{cases} e^{2\varphi}\gamma & \text{if } n = 2 \text{ and } M = \Sigma_p,\ p \ge 2 \\ \varphi^{4/(n-2)}\gamma & \text{if } n \ge 3 \text{ and } M \text{ is of negative Yamabe type} \end{cases} \qquad [21]$$

with the conformal metric $\gamma$ normalized so that $R(\gamma) = -1$ and
with the specific form of the conformal factor being chosen to simplify
calculations involving the curvature tensors. In the case $n \ge 3$,
$\varphi$ is positive and thus the space of all Riemannian metrics on $M$
is parametrized by $\mathcal{M}_{-1}$ and the space of scalar functions
$\varphi > 0$ on $M$. The function $\varphi$ is then determined by solving
the Hamiltonian constraint [17] (see also the remark before [33]).
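
To see why these particular conformal factors simplify curvature calculations, it may help to recall the standard transformation rules for scalar curvature under a conformal change of metric. The formulas below are the usual ones from Riemannian geometry; the symbol $\Delta_\gamma$ and its sign convention ($\Delta_\gamma = \gamma^{ij}\nabla_i\nabla_j$) are assumptions made here and are not fixed by the article.

```latex
% Assumed convention: \Delta_\gamma = \gamma^{ij}\nabla_i\nabla_j (nonpositive spectrum)
\begin{align*}
n = 2: \quad & R\bigl(e^{2\varphi}\gamma\bigr)
   = e^{-2\varphi}\bigl(R(\gamma) - 2\,\Delta_\gamma\varphi\bigr), \\
n \ge 3: \quad & R\bigl(\varphi^{4/(n-2)}\gamma\bigr)
   = \varphi^{-(n+2)/(n-2)}\Bigl(-\frac{4(n-1)}{n-2}\,\Delta_\gamma\varphi + R(\gamma)\,\varphi\Bigr).
\end{align*}
```

With the exponent $4/(n-2)$ no term quadratic in the gradient of $\varphi$ appears, so that after setting $R(\gamma) = -1$ the Hamiltonian constraint becomes a semilinear elliptic equation for $\varphi$. For constant $\varphi$ the second formula reduces to $R(g) = \varphi^{-4/(n-2)}R(\gamma)$.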

In the given CMC slicing and imposing the vacuum field equations, since by
the momentum constraint $\pi$ must have zero divergence (see [16] and
[18]), one finds that $\pi^{ij}$ must be expressible in the form

$$\pi^{ij} = (\pi^{TT})^{ij} + \frac{1}{n}(\mathrm{tr}_g\pi)\,g^{ij} \qquad [22]$$

where $\pi^{TT}$ is transverse (i.e., divergence-free) and traceless with
respect to $g$. In the nonvacuum case, $\pi^{ij}$ picks up an additional
summand determined by the sources in the modified momentum constraint [18].

Substitution of the foregoing decompositions of $(g_{ij}, \pi^{ij})$ into
the Hamiltonian constraint leads to a nonlinear elliptic equation for
$\varphi$ which, under the conditions assumed here, determines this
function uniquely, provided $\tau \ne 0$. No solutions exist for
$\tau = 0$ (equivalently, $\mathrm{tr}_g\pi = 0$) since from [14], [17],
and [22], the Hamiltonian constraint would then immediately imply that

$$R(g) = \frac{1}{\det g}\,(\pi\cdot\pi) = \frac{1}{\det g}\,(\pi^{TT}\cdot\pi^{TT}) \ge 0 \qquad [23]$$

everywhere on $M$, which is not possible for a manifold $M$ of negative
Yamabe type. Instantaneous vanishing of the mean curvature, the defining
property of a maximal hypersurface, would correspond to a moment at which
an expanding universe ceases to expand or a collapsing universe ceases to
collapse. From [23], such behavior is topologically excluded here by the
requirement that $M$ be of negative Yamabe type (see also the discussion
after [36]).
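
The step from the Hamiltonian constraint to [23] can be made explicit for general $\tau$. The computation below is a sketch that uses only the decomposition [22], the trace relation [10], and the definition [14]; the general-$\tau$ form of the last line is written out here for illustration and is not displayed in the article.

```latex
\begin{align*}
% cross terms vanish because \pi^{TT} is traceless with respect to g
\pi\cdot\pi &= \pi^{TT}\cdot\pi^{TT} + \frac{1}{n}(\mathrm{tr}_g\pi)^2, \\
\pi\cdot\pi - \frac{1}{n-1}(\mathrm{tr}_g\pi)^2
  &= \pi^{TT}\cdot\pi^{TT} - \frac{1}{n(n-1)}(\mathrm{tr}_g\pi)^2
   = \pi^{TT}\cdot\pi^{TT} - \frac{n-1}{n}\,(\det g)\,\tau^2, \\
% insert into H(g,\pi) = 0 and divide by \sqrt{\det g}
\mathcal{H}(g,\pi) = 0 \;\Longrightarrow\;
R(g) &= \frac{1}{\det g}\,\pi^{TT}\cdot\pi^{TT} - \frac{n-1}{n}\,\tau^2 .
\end{align*}
```

Setting $\tau = 0$ in the last line gives $R(g) \ge 0$ everywhere, which is the contradiction with negative Yamabe type used in the text.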

In the unreduced formalism of $I_{\mathrm{ADM}}$, the role of a
super-Hamiltonian is played by the functional

$$H(g,\pi) = \int_M \left(N\,\mathcal{H}(g,\pi) + X^i\mathcal{J}_i(g,\pi)\right) d^n x \qquad [24]$$

which evidently vanishes whenever the constraints are satisfied. To achieve
a fully reduced formulation wherein again the effective Hamiltonian would
vanish, one could endeavor to solve the associated Hamilton-Jacobi
equations

$$\mathcal{H}(g_{ij}, \delta S/\delta g_{ij}) = 0 \qquad [25]$$

$$\mathcal{J}_i(g_{ij}, \delta S/\delta g_{ij}) = 0 \qquad [26]$$

for a real-valued functional $S = S(g, \alpha^A)$ of the metric $g$ and a
set of additional independent parameters $\alpha^A$. A complete solution
$S(g_{ij}, \alpha^A)$ would be one for which an arbitrary solution
$(g_{ij}, \pi^{ij})$ of the constraints could be realized as
$(g_{ij}, \delta S/\delta g_{ij})$ for a suitable (unique) choice of the
$\alpha^A$. A complementary set of reduced canonical variables $\beta_A$
(the momenta conjugate to the $\alpha^A$'s) could then be defined by
$\beta_A = \delta S/\delta\alpha^A$ and one could in principle solve the
equations

$$\pi^{ij} = \frac{\delta S}{\delta g_{ij}} \qquad [27]$$

$$\beta_A = \frac{\delta S}{\delta\alpha^A} \qquad [28]$$

for $(\alpha^A, \beta_A)$ as functionals of the canonical variables
$(g_{ij}, \pi^{ij})$. This procedure, if it could be carried out, would
ensure that these functionals $(\alpha^A(g,\pi), \beta_A(g,\pi))$
Poisson-commute with all of the constraints and hence are conserved for an
arbitrary slicing of spacetime. Conversely, if a suitable set of gauge
conditions such as the CMCSH conditions were imposed, one could in
principle solve for the remaining independent canonical variables as
functionals of the $(\alpha^A, \beta_A)$ and an internal variable, such as
the mean curvature $\tau$, which plays the role of time, and hence solve
the field equations for $(g_{ij}, \pi^{ij})$ in the chosen gauge.

This proposal is purely heuristic in $(3+1)$ and higher dimensions in that
there is no known procedure for finding the needed complete solution of the
Hamilton-Jacobi equations in these cases. However, by exploiting the
Chern-Simons analogy discussed earlier in this section, a complete solution
can be found in $(2+1)$ dimensions and the corresponding complete set of
"observables" $(\alpha^A, \beta_A)$ identified. The latter are equivalent,
up to a diffeomorphism of the associated reduced phase space, to a complete
set of traces of holonomies of the flat
$\mathrm{Isom}(\mathbb{R}^3_1)$-connections defined in this Chern-Simons
formulation (see Carlip (1998) for more details).


The Reduced Hamiltonian 


We continue with the assumption that $M$ is a connected closed oriented
$n$-manifold, with either $n = 2$ and $M = \Sigma_p$, $p \ge 2$, or
$n \ge 3$ and $M$ of negative Yamabe type. We now define the reduced phase
space as the set of conformal variables given by

$$P_{\mathrm{reduced}} = \{(\gamma, p^{TT}) \mid \gamma \in \mathcal{M}_{-1} \text{ and } p^{TT} \text{ is a 2-contravariant symmetric tensor density that is transverse and traceless with respect to } \gamma\} \qquad [29]$$

We remark that the fully reduced phase space is given by
$P_{\mathrm{reduced}}/\mathcal{D}_0$, where $\mathcal{D}_0$ is the group of
diffeomorphisms of $M$ isotopic to the identity. However, here, for clarity
of exposition, we work on $P_{\mathrm{reduced}}$ rather than the fully
reduced phase space.

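Given a scalar function $\varphi$, with $\varphi > 0$ if $n \ge 3$, the
physical variables $(g, \pi^{TT})$ are related to the conformal variables
$(\gamma, p^{TT})$ by

$$(g, \pi^{TT}) = \begin{cases} (e^{2\varphi}\gamma,\ e^{-2\varphi}p^{TT}) & \text{if } n = 2 \\ (\varphi^{4/(n-2)}\gamma,\ \varphi^{-4/(n-2)}p^{TT}) & \text{if } n \ge 3 \end{cases} \qquad [30]$$

We adopt the convention that raising and lowering of indices on either
momentum variable $\pi^{TT}$ or $p^{TT}$ will be with respect to its own
conjugate metric, either $g$ or $\gamma$, respectively. With this
convention, the mixed forms of $\pi^{TT}$ and $p^{TT}$ are equal, since for
$n \ge 3$,

$$(\pi^{TT})^i{}_j = g_{jk}\,\pi^{TTik} = \varphi^{4/(n-2)}\gamma_{jk}\,\varphi^{-4/(n-2)}p^{TTik} = \gamma_{jk}\,p^{TTik} = (p^{TT})^i{}_j \qquad [31]$$

(and similarly for the $n = 2$ case). Thus the squared norms of $p^{TT}$
and $\pi^{TT}$ are equal,

$$p^{TT}\cdot p^{TT} = \gamma_{ik}\gamma_{jl}\,p^{TTij}p^{TTkl} = g_{ik}g_{jl}\,\pi^{TTij}\pi^{TTkl} = \pi^{TT}\cdot\pi^{TT} \qquad [32]$$

where in the first term the center dot is $\gamma$-metric contraction and
in the last term the center dot is $g$-metric contraction.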

The uniquely determined scalar factor $\varphi$ relating the physical
metric $g$ to the conformal metric $\gamma$ is obtained by solving the
Hamiltonian constraint equation [17]. In the special case that
$p^{TT} = 0$ (or equivalently, from [30], that $\pi^{TT} = 0$), $\varphi$
is constant and is given in the $n \ge 3$ case by

$$\varphi = \left(\frac{n}{(n-1)\tau^2}\right)^{(n-2)/4} \qquad [33]$$

Thus in this case

$$\gamma = \varphi^{-4/(n-2)}\,g = \frac{(n-1)}{n}\,\tau^2 g \qquad [34]$$

In particular, since $\tau$ has the dimension $\ell^{-1}$ (see the remark
after [8]) and the components $g_{ij}$ have the dimension $\ell^2$, we see
from this formula that the conformal metric $\gamma_{ij}$ is dimensionless.
Although $\varphi$ is not constant in the general case when
$p^{TT} \ne 0$, its dimension, as in [33], is still $\ell^{(n-2)/2}$ and
thus the components $\gamma_{ij}$ are still dimensionless in the general
case. Since in the conventions used in this article, the spatial
coordinates are dimensionless, the volume $\mathrm{vol}(M, \gamma)$ of the
Riemannian manifold $(M, \gamma)$, as well as all curvature tensors of
$\gamma$, are also dimensionless. Having a dimensionless conformal metric
$\gamma$ with a dimensionless volume has its advantages over the physical
metric $g$ with dimension $\ell^2$ inasmuch as, as we shall see below, an
infimum of the volume of the conformal metric is related to a dimensionless
topological invariant of $M$ (see [48] and the remark thereafter).

If one now uses the conformal variables given by [30] and the decomposition
[22] in the ADM action given by [13], one finds the reduced action to be

$$I_{\mathrm{reduced}} = \int_I dt \int_M \left(p^{TTij}\frac{\partial\gamma_{ij}}{\partial t} - \frac{2(n-1)}{n}\frac{\partial\tau}{\partial t}\sqrt{\det g} + \frac{2}{n}\frac{\partial(\mathrm{tr}_g\pi)}{\partial t}\right) d^n x \qquad [35]$$

In this expression one can discard the final time derivative which
contributes only a boundary integral and so does not contribute to the
equations of motion. Moreover, the conformal metric $\gamma_{ij}$ is
constrained to lie in the intersection of $\mathcal{M}_{-1}$ and a slice
for the action of $\mathcal{D}_0$ on $\mathcal{M}_{-1}$. This space can be
regarded as a local chart for the reduced configuration space
$\mathcal{T} = \mathcal{M}_{-1}/\mathcal{D}_0$, under the technical
assumption that $\mathcal{T}$ is a manifold. Thus, taken together, the
conformal variables $(\gamma_{ij}, p^{TTij})$ can be viewed as local
canonical coordinates for the cotangent bundle $T^*\mathcal{T}$ of
Teichmüller space $\mathcal{T}$, where $T^*\mathcal{T}$ now plays the role
of the reduced phase space.
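
The structure of the reduced action can be checked term by term. The sketch below treats the case $n \ge 3$ with the constraints [17] and [18] imposed, so that only the kinetic term of [13] survives, and assumes that $\tau$ is spatially constant on each slice (the CMC condition); it uses [10], [21], [22], and [30] and is meant as a worked check, not as the authors' derivation.

```latex
\begin{align*}
% TT part: the conformal factors cancel and \gamma_{ij} p^{TTij} = 0
(\pi^{TT})^{ij}\,\frac{\partial g_{ij}}{\partial t}
  &= \varphi^{-4/(n-2)}p^{TTij}\Bigl(\varphi^{4/(n-2)}\frac{\partial\gamma_{ij}}{\partial t}
     + \gamma_{ij}\,\frac{\partial\varphi^{4/(n-2)}}{\partial t}\Bigr)
   = p^{TTij}\,\frac{\partial\gamma_{ij}}{\partial t}, \\
% trace part: g^{ij}\partial_t g_{ij} = 2\,\partial_t\sqrt{\det g}/\sqrt{\det g},
% and tr_g\pi = (n-1)\sqrt{\det g}\,\tau
\frac{1}{n}(\mathrm{tr}_g\pi)\,g^{ij}\frac{\partial g_{ij}}{\partial t}
  &= \frac{2(n-1)}{n}\,\tau\,\frac{\partial\sqrt{\det g}}{\partial t}
   = \frac{\partial}{\partial t}\Bigl(\frac{2}{n}\,\mathrm{tr}_g\pi\Bigr)
     - \frac{2(n-1)}{n}\,\frac{\partial\tau}{\partial t}\,\sqrt{\det g}.
\end{align*}
```

Adding the two lines gives the three terms of the integrand, with the total time derivative appearing last.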

For $n = 2$, these constructions can be carried out globally for the
Teichmüller space $\mathcal{T}_p$ of an arbitrary closed oriented surface
$\Sigma_p$, $p \ge 2$ (see the remarks after [4]). Using these global
constructions, the reduced phase space $T^*\mathcal{T}_p$ for the
$(2+1)$-reduced Einstein equations can be modeled explicitly.

Having restricted the slices to be CMC, one need only choose the
relationship between the time coordinate and the CMC $\tau$ in order to fix
a corresponding reduced Hamiltonian. The most natural choice of time
coordinate from the present point of view is to take

$$t = t(\tau) = \frac{2}{n(-\tau)^{n-1}} \qquad [36]$$

Note that this choice of time coordinate, although also denoted $t$, is no
longer dimensionless but has dimension $\ell^{n-1}$.

This choice of time coordinate is motivated by three considerations.
Firstly, we remark that since $\tau = 0$ is excluded in the setting used in
this article (see [23] and the discussion after), $\tau$ can range in
either the domain $\mathbb{R}^- = (-\infty, 0)$ or
$\mathbb{R}^+ = (0, \infty)$. The usual convention on the sign of $k$, as
adopted here, is that the sign of $k$ is negative when the tips of the
normals on a spacelike hypersurface are further apart than their bases, as
for example in the expansion of a model universe, in which case
$\tau = \mathrm{tr}_g k < 0$. Thus, with this convention, $\tau$ in the
range $\mathbb{R}^-$ corresponds to an expanding universe and $\tau$ in the
range $\mathbb{R}^+$ corresponds to a collapsing one in the future
direction of increasing $t$. Thus for manifolds of negative Yamabe type
that we consider here, the expected maximal range of the CMC $\tau$ is
$\mathbb{R}^-$, for which $\tau \to -\infty$ corresponds to a "crushing
singular" big bang of vanishing spatial volume and $\tau \to 0^-$
corresponds to the limit of infinite volume expansion. Then, with the time
function given by [36], the coordinate time $t$ ranges in the interval
$\mathbb{R}^+$, vanishes at the big bang, and tends to positive infinity in
the limit of infinite cosmological expansion.

We remark that to prove that a solution deter- 
mined by Cauchy data prescribed at some initial 
coordinate time $t_0 \in \mathbb{R}^+$ actually exhausts the range 
$\mathbb{R}^+$ is a difficult global existence problem that is not 
dealt with here. Nevertheless, one of the main 
motivations for this work is the hope that Hamil- 
tonian reduction will lead to advances in the study 
of the global existence question for Einstein's 
equations. 

We also remark that with the choice of temporal gauge function given by
[36] and with $\tau$ in its natural range $\mathbb{R}^-$,

$$\frac{d\tau}{dt} = \frac{n}{2(n-1)}\,(-\tau)^n > 0 \qquad [37]$$

so that this temporal coordinate choice preserves the time orientation of
the flow for all $n \ge 2$.
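
The positivity statement can be verified directly. The sketch below assumes the time function $t(\tau) = 2/\bigl(n(-\tau)^{n-1}\bigr)$ with $\tau < 0$, which is the form consistent with the dimension $\ell^{n-1}$ quoted for $t$ and with the reduced Hamiltonian $(-\tau)^n\,\mathrm{vol}(M,g)$; the case $n = 3$ is included as an illustration.

```latex
% Assumption: t(\tau) = 2 / ( n (-\tau)^{n-1} ), with \tau < 0
\begin{align*}
\frac{dt}{d\tau}
  &= \frac{2}{n}\,\frac{d}{d\tau}(-\tau)^{-(n-1)}
   = \frac{2(n-1)}{n}\,(-\tau)^{-n} > 0,
\qquad
\frac{d\tau}{dt} = \frac{n}{2(n-1)}\,(-\tau)^{n} > 0, \\
% effect on the volume term of the reduced action
-\frac{2(n-1)}{n}\,\frac{\partial\tau}{\partial t}\,\sqrt{\det g}
  &= -(-\tau)^n\sqrt{\det g}, \\
n = 3: \quad t &= \frac{2}{3\tau^2}, \qquad \frac{d\tau}{dt} = \frac{3}{4}\,(-\tau)^3 .
\end{align*}
```

As $\tau \to -\infty$ one has $t \to 0^+$, and as $\tau \to 0^-$ one has $t \to +\infty$, matching the ranges described in the text.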

Secondly, with this choice of temporal gauge, the reduced action given by
[35] simplifies to

$$I_{\mathrm{reduced}} = \int_I dt \int_M \left(p^{TTij}\frac{\partial\gamma_{ij}}{\partial t} - (-\tau)^n\sqrt{\det g}\right) d^n x \qquad [38]$$

from which one can read off an effective reduced Hamiltonian density,

$$\mathcal{H}_{\mathrm{reduced}}(\tau, \gamma, p^{TT}) = (-\tau)^n\sqrt{\det g} \qquad [39]$$

and an effective reduced Hamiltonian,

$$H_{\mathrm{reduced}}(\tau, \gamma, p^{TT}) = \int_M (-\tau)^n\sqrt{\det g}\,d^n x = (-\tau)^n\,\mathrm{vol}(M,g) \qquad [40]$$

where $\mathrm{vol}(M,g) = \int_M d\mu_g$ is the volume of the Riemannian
manifold $(M,g)$. Thus in terms of the physical variables
$(g_{ij}, \pi^{ij})$, the reduced Hamiltonian $H_{\mathrm{reduced}}$ at
"time" $\tau$ is simply the volume of the CMC slice with mean curvature
$\tau$ rescaled by the factor $(-\tau)^n$. With this reduced Hamiltonian
density, the reduced action [38] takes the canonical form

$$I_{\mathrm{reduced}} = \int_I dt \int_M \left(p^{TTij}\frac{\partial\gamma_{ij}}{\partial t} - \mathcal{H}_{\mathrm{reduced}}\right) d^n x \qquad [41]$$

As the third consideration for the given choice of the time function, we
note that rescaling the physical volume $\mathrm{vol}(M,g)$ by the factor
$(-\tau)^n$ yields a dimensionless quantity. Indeed, as we have seen, the
spatial physical volume has the dimension $\ell^n$ and the constant mean
curvature $\tau$ has the dimension $\ell^{-1}$, so that the reduced
Hamiltonian $(-\tau)^n\,\mathrm{vol}(M,g)$ is dimensionless.

The main advantage of having a dimensionless reduced Hamiltonian is that
only such a reduced Hamiltonian can have a topological significance, and
indeed, the infimum of $H_{\mathrm{reduced}}$ is closely related to a
dimensionless topological invariant of $M$ (see the remarks after [48]).

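In terms of the conformal variables $(\gamma, p^{TT})$, the reduced
Hamiltonian is found from [21] and [40] to be given for $n \ge 3$ by

$$\begin{aligned} H_{\mathrm{reduced}}(\tau, \gamma, p^{TT}) &= (-\tau)^n \int_M \sqrt{\det g}\,d^n x \\ &= (-\tau)^n \int_M \sqrt{\det(\varphi^{4/(n-2)}\gamma)}\,d^n x \\ &= (-\tau)^n \int_M \varphi^{2n/(n-2)}\sqrt{\det\gamma}\,d^n x \\ &= (-\tau)^n \int_M \varphi^{2n/(n-2)}\,d\mu_\gamma \end{aligned} \qquad [42]$$

where $d\mu_\gamma$ is the Riemannian measure on $M$ determined by $\gamma$
(locally, $d\mu_\gamma = \sqrt{\det\gamma}\,d^n x$) and
$\varphi = \varphi(\tau, \gamma, p^{TT})$ is the conformal factor which,
through the solution of the Hamiltonian constraint [17], is expressed as a
function of the "time" $\tau$ and the conformal (or canonical) variables
$(\gamma, p^{TT})$.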

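In the special case $n = 2$, $M = \Sigma_p$, $p \ge 2$, a simple formula
for $H_{\mathrm{reduced}}$ can be derived. In terms of the conformal
variables $(\gamma, p^{TT})$, we find from [40], [10], [14], [17], [21],
[22], and [32] that

$$\begin{aligned} H_{\mathrm{reduced}}(\tau, \gamma, p^{TT}) &= \tau^2\int_{\Sigma_p} d\mu_g = \int_{\Sigma_p}\left(\frac{2}{\det g}\,\pi^{TT}\cdot\pi^{TT} - 2R(g)\right) d\mu_g \\ &= 2\int_{\Sigma_p}\left(\det(e^{2\varphi}\gamma)\right)^{-1}(p^{TT}\cdot p^{TT})\,d\mu_g - 2\int_{\Sigma_p} R(g)\,d\mu_g \\ &= 2\int_{\Sigma_p} e^{-4\varphi}(\det\gamma)^{-1}(p^{TT}\cdot p^{TT})\,e^{2\varphi}\,d\mu_\gamma - 8\pi\chi(\Sigma_p) \\ &= 2\int_{\Sigma_p} e^{-2\varphi}(\det\gamma)^{-1}(p^{TT}\cdot p^{TT})\,d\mu_\gamma + 16\pi(p-1) \end{aligned} \qquad [43]$$

where $\varphi = \varphi(\tau, \gamma, p^{TT})$,
$\chi(\Sigma_p) = 2(1-p)$ is the Euler characteristic of the genus $p$
surface $\Sigma_p$, and where we have used the Gauss-Bonnet theorem

$$\int_{\Sigma_p} R(g)\,d\mu_g = 4\pi\chi(\Sigma_p) = 8\pi(1-p) \qquad [44]$$

Since

$$H_{\mathrm{reduced}}(\tau, \gamma, p^{TT}) = 2\int_{\Sigma_p} e^{-2\varphi}(\det\gamma)^{-1}(p^{TT}\cdot p^{TT})\,d\mu_\gamma + 16\pi(p-1) \ge 16\pi(p-1) \qquad [45]$$

the infimum of $H_{\mathrm{reduced}}$ is attained precisely when
$p^{TT} = 0$ and this infimum coincides with the topological invariant
$-8\pi\chi(\Sigma_p) = 16\pi(p-1)$, which characterizes the surface
$\Sigma_p$ (see also [51] below). As we shall see shortly, an analogous
result holds for $n \ge 3$.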

A straightforward but lengthy calculation, which is valid in arbitrary
dimensions, shows that the reduced Hamiltonian is strictly monotonically
decreasing in the direction of cosmological expansion except for a family
of continuously self-similar spacetimes for which this Hamiltonian is
constant (Fischer and Moncrief 2002b). The latter solutions exist if and
only if $M$ admits a Riemannian metric $\gamma \in \mathcal{M}_{-1}$ which
is an Einstein metric, that is, for which the Ricci tensor satisfies
$\mathrm{Ric}(\gamma) = -(1/n)\gamma$. Using the mean curvature as a
convenient time coordinate, that is, temporarily taking $t = \tau$, the
corresponding self-similar vacuum spacetime metrics then have the line
element

$$ds^2 = -\left(\frac{n}{\tau^2}\right)^2 d\tau^2 + \frac{n}{(n-1)\tau^2}\,\gamma_{ij}\,dx^i dx^j \qquad [46]$$

In the case that $n = 3$, the Einstein metric $\gamma$ is actually
hyperbolic with constant sectional curvature $K(\gamma) = -1/6$ and Ricci
curvature $\mathrm{Ric}(\gamma) = -(1/3)\gamma$. Although the conformal
variables $(\gamma, p^{TT}) = (\gamma, 0)$ are static in this model, the
physical variables $(g, \pi)$ are not. In this case, the resulting
spacetimes (which depend on the underlying topology of $M$) have expanding
closed hyperbolic spacelike hypersurfaces where the physical volume
$\mathrm{vol}(M,g)$ "starts" at zero at the big bang and expands to
infinity in the forward time direction, as befits a universe endlessly
expanding from the big bang. Such a universe is depicted in Figure 1, where
the genus-2 surface is used to represent a generic closed hyperbolic
3-manifold. The Bianchi and Thurston types of this model are discussed in
the next section.
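
As a consistency check on the self-similar line element, one can confirm that its $\tau = \text{constant}$ slices really have mean curvature $\tau$ and that the rescaled volume is constant along the family. The sketch below assumes the line element in the form $ds^2 = -(n/\tau^2)^2 d\tau^2 + \bigl(n/((n-1)\tau^2)\bigr)\gamma_{ij}\,dx^i dx^j$, that is, lapse $N = n/\tau^2$, vanishing shift, and $t = \tau < 0$, and applies the coordinate formula [6].

```latex
% Assumptions: N = n/\tau^2, X = 0, t = \tau < 0,
% g_{ij} = \frac{n}{(n-1)\tau^2}\gamma_{ij} with \gamma independent of \tau
\begin{align*}
k_{ij} &= -\frac{1}{2N}\,\frac{\partial g_{ij}}{\partial\tau}
        = -\frac{\tau^2}{2n}\Bigl(-\frac{2n}{(n-1)\tau^3}\Bigr)\gamma_{ij}
        = \frac{1}{(n-1)\tau}\,\gamma_{ij}, \\
\mathrm{tr}_g k &= g^{ij}k_{ij}
        = \frac{(n-1)\tau^2}{n}\,\gamma^{ij}\,\frac{1}{(n-1)\tau}\,\gamma_{ij}
        = \tau, \\
(-\tau)^n\,\mathrm{vol}(M,g)
       &= (-\tau)^n\Bigl(\frac{n}{(n-1)\tau^2}\Bigr)^{n/2}\mathrm{vol}(M,\gamma)
        = \Bigl(\frac{n}{n-1}\Bigr)^{n/2}\mathrm{vol}(M,\gamma).
\end{align*}
```

The last line is independent of $\tau$, in agreement with the statement that the reduced Hamiltonian is constant on these spacetimes, while $\mathrm{vol}(M,g)$ itself grows without bound as $\tau \to 0^-$.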

The line element [46] is locally isometric to the vacuum
Friedmann-Lemaitre-Robertson-Walker (FLRW) $k = -1$ spacetime, which is
well known to be flat. Although these spatially compactified models are
technically not classical FLRW spacetimes since the expanding compact
hypersurfaces are not homogeneous (and thus not isotropic), they are
Lorentz-covered by the FLRW $k = -1$ spacetime and thus are locally
isometric to this classical spacetime.

Figure 1 Expansion of the physical universe in the Bianchi V, Thurston type
$H^3$, spatially compactified FLRW flat spacetime cosmology.

The same result leading to [46] holds even if matter sources are allowed,
provided they satisfy a suitable energy condition, in which case the
corresponding reduced Hamiltonian will only be stationary in the vacuum
limit and then only when the metric is of the above type; otherwise it
monotonically decays. This result even has a quasilocal generalization
expressible in terms of the corresponding quasilocal reduced Hamiltonian
defined for an arbitrary domain $D_\tau$ within the CMC slice
$\tau = \text{constant}$ by restricting $H_{\mathrm{reduced}}$ in [42] to
the domain $D_\tau$, so that for $n \ge 3$,

$$H_{D_\tau}(\tau, \gamma, p^{TT}) = (-\tau)^n \int_{D_\tau} \varphi^{2n/(n-2)}\,d\mu_\gamma \qquad [47]$$


If $D_\tau$ is determined from its specification on some initial slice
$\tau = \tau_0$, by letting the domain flow along the normal trajectories
of the CMC foliation, one can then verify that $H_{D_\tau}$ is
monotonically decreasing except for the vacuum solutions of self-similar
type described above, in which case $H_{D_\tau}$ is constant. This result
is independent of the initial domain chosen.

We remark that one cannot use the quasilocal 
Hamiltonian to get equations of motion (even 
quasilocally) since the full true Hamiltonian is 
nonlocal and so one gets contributions from the 
whole manifold. 

Since the reduced Hamiltonian H_reduced as well as
its quasilocal variant H_{D_τ} is monotonically decreasing
for generic solutions of Einstein's equations, it is
natural to ask what its infimum is and whether this
infimum is ever attained, at least asymptotically, by
solutions of the field equations. The infimum of the
reduced Hamiltonian for n ≥ 3 and for a spatial
manifold M of negative Yamabe type can be characterized
in terms of a certain topological invariant of M
called the sigma constant σ(M) of M. For manifolds of
negative Yamabe type, this quantity can be defined in
terms of the infimum of the volume of all metrics which
range over the space of conformal metrics M_{-1}. The
precise definition leads to the formula

\[
\sigma(M) = -\Bigl( \inf_{\gamma \in \mathcal{M}_{-1}} \operatorname{vol}(M, \gamma) \Bigr)^{2/n} \qquad [48]
\]

Interestingly, this equation defines the topological
invariant σ(M) by a purely geometrical equation
involving the volume functional restricted to M_{-1}.
We also remark that [48] is a dimensionless 
equation, the left-hand side being dimensionless 
since it is a topological invariant of M and the 
right-hand side being dimensionless since the con- 
formal metric and its volume are dimensionless (see 
the remarks after [34]). 

Although the σ-constant can be defined for all
Yamabe types, [48] holds only for manifolds of
negative Yamabe type. From this equation, one can
conclude that for such manifolds

\[
\sigma(M) \le 0 \qquad [49]
\]

One can relate the foregoing to the reduced
Hamiltonian by showing that the infimum of
H_reduced defined for arbitrary τ < 0 as a functional
on the reduced phase space

\[
T^*(\mathcal{M}_{-1}/\mathcal{D}_0) \qquad [50]
\]

is given by

\[
\inf H_{\mathrm{reduced}}(\tau, \gamma, p^{TT}) =
\begin{cases}
-8\pi\chi(\Sigma_p) = 16\pi(p-1) & \text{if } n = 2 \text{ and } p \ge 2 \\[4pt]
\bigl( \tfrac{n}{n-1}\,(-\sigma(M)) \bigr)^{n/2} & \text{if } n \ge 3
\end{cases}
\qquad [51]
\]

where for n ≥ 3, M is of negative Yamabe type and
thus σ(M) ≤ 0 (see [49]).
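
As a quick arithmetic check of the n = 2 case of [51], one can use the standard Euler characteristic of a closed orientable surface of genus p (a textbook fact, assumed here rather than taken from the text):

```latex
\[
\chi(\Sigma_p) = 2 - 2p
\quad\Longrightarrow\quad
-8\pi\,\chi(\Sigma_p) = -8\pi(2 - 2p) = 16\pi(p - 1),
\]
\[
\text{so for } p = 2:\ \inf H_{\mathrm{reduced}} = 16\pi,
\qquad
\text{and the infimum is positive for every } p \ge 2 .
\]
```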

One proves this result by first showing that within
an arbitrary fiber of the cotangent bundle
T*(M_{-1}/D_0), one minimizes H_reduced by setting the
fiber variable p^TT to zero. In this case, the solution for
the conformal factor φ reduces to a spatial constant
which is a function of τ alone (see [33]), and thus the
formula for H_reduced given in [42] reduces to

\[
H_{\mathrm{reduced}}(\tau, \gamma, 0) = \Bigl( \frac{n}{n-1} \Bigr)^{n/2} \operatorname{vol}(M, \gamma) \qquad [52]
\]


The infimum over all conformal metrics γ ∈ M_{-1} of
this latter functional yields the σ-constant as outlined
above. If matter sources obeying a suitable energy 
condition are allowed, the argument goes through in 
much the same way with the additional implication 
that the infimum is achieved only for a vacuum 
solution so that in fact the matter must be “turned off.” 

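Thus, as a consequence of the above analysis,
one has

\[
\inf H_{\mathrm{reduced}}(\tau, \gamma, p^{TT})
= \inf_{\gamma \in \mathcal{M}_{-1}} H_{\mathrm{reduced}}(\tau, \gamma, 0)
= \Bigl( \frac{n}{n-1} \Bigr)^{n/2} \inf_{\gamma \in \mathcal{M}_{-1}} \operatorname{vol}(M, \gamma)
= \Bigl( \frac{n}{n-1}\,(-\sigma(M)) \Bigr)^{n/2} \qquad [53]
\]

where the last equality follows by inverting [48] to give

\[
\inf_{\gamma \in \mathcal{M}_{-1}} \operatorname{vol}(M, \gamma) = (-\sigma(M))^{n/2} \qquad [54]
\]

Moreover, if γ ∈ M_{-1} actually achieves the
σ-constant, that is, if vol(M, γ) = (-σ(M))^{n/2} (and
not just asymptotically approaches it as a curve or
sequence), then γ must be an Einstein metric with

\[
\operatorname{Ric}(\gamma) = -\frac{1}{n}\,\gamma \qquad [55]
\]

If, additionally, n = 3, then γ must be hyperbolic
(with constant sectional curvature K(γ) = -1/6).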
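
The value K = -1/6 can be checked in a few lines. The sketch below assumes that metrics in M_{-1} are normalized to scalar curvature R(γ) = -1 (as in the hyperbolic normalization used later in the text) and uses only the standard relations between Ricci, scalar, and sectional curvature; it is an illustration, not part of the original derivation.

```latex
\begin{align*}
\operatorname{Ric}(\gamma) = \lambda\,\gamma
  &\;\Longrightarrow\; R(\gamma) = n\lambda,
  &&\text{so } R(\gamma) = -1 \text{ gives } \lambda = -\tfrac{1}{n}, \\
\text{constant sectional curvature } K
  &\;\Longrightarrow\; \operatorname{Ric}(\gamma) = (n-1)K\,\gamma,
  &&\text{so } K = -\tfrac{1}{n(n-1)}, \\
n = 3
  &\;\Longrightarrow\; K(\gamma) = -\tfrac{1}{6}.
\end{align*}
```

In dimension 3 an Einstein metric automatically has constant sectional curvature, which is why the Einstein condition forces γ to be hyperbolic when n = 3.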

Although Thurston's conjectures do not refer to
the σ-constant, Anderson (1997) has been able to
reformulate and somewhat refine the Thurston
geometrization conjectures for 3-manifolds of arbitrary
Yamabe type in terms of conjectured properties
of the σ-constant. Additionally, if Perelman's
results are technically complete, they would provide
a proof of Anderson's conjectures as well as those of
Thurston (see Anderson (2003)).

The conjectured behavior for a sequence of
conformal metrics {γ_i}, γ_i ∈ M_{-1}, i = 1, 2, ..., which
seeks to minimize the volume of a stand-alone
K(π, 1) 3-manifold M of negative Yamabe type can
be described as follows:


1. If M is hyperbolizable, then σ(M) < 0 is attained
by a hyperbolic metric γ_p ∈ M_{-1}, unique up to
diffeomorphism, and the sequence of conformal
metrics {γ_i} converges to this metric in a suitable
function space topology.

2. If M is a pure graph manifold, then σ(M) = 0 and
the sequence {γ_i} of conformal metrics “volume
collapses” M with bounded curvature. Typically
this occurs through collapse of circular or toroidal
fibers in the associated circle or 2-torus bundle
structure (see examples 3, 4, and 5 in the section
“Topological Background” and see also the penultimate
section). The six manifolds of flat type are
not included here as they are of zero Yamabe type.

3. If M is a generic K(π, 1)-manifold (not of type 1
or 2 above), then M can be decomposed along
incompressible tori into its final finite-volume-type
hyperbolizable and (possibly empty set of)
graph-manifold pieces. In this case, σ(M) < 0 and
the sequence {γ_i} of conformal metrics collapses
the graph-manifold components and converges to
finite-volume complete hyperbolic metrics on the
hyperbolizable components (normalized to have
R(γ) = -1) yielding a σ-constant that is entirely
determined by the volumes of these final hyperbolic
components (see the final section).


We shall return to this conjectured characteriza- 
tion of sequences of conformal metrics in the next 
two sections. 


Reduction of Bianchi Models 
and Conformal Volume Collapse 


For manifolds of negative Yamabe type, the strict
monotonic decay of H_reduced in the direction of
cosmological expansion along nonconstant integral
curves of the reduced Einstein equations suggests
that the reduced Hamiltonian is seeking to achieve
its infimum inf H_reduced = ((n/(n - 1))(-σ(M)))^{n/2}.
But does this ever happen? Does the reduced
Einstein flow of the conformal geometry asymptotically
approach inf H_reduced in the limit of infinite
cosmological expansion?

To answer this question, one can consider for n = 3 
known locally homogeneous vacuum solutions of 
Einstein’s equations which spatially compactify to 
manifolds of negative Yamabe type. Applying the 
theory of Hamiltonian reduction to these classical 
models, one can show that the reduced Hamiltonian 
behaves as expected under the reduced Einstein flow 
defined by these models. Since these models existed 
long before this theory, it is somewhat satisfying to see 
that they can be interpreted in terms of Hamiltonian 
reduction and how, with this interpretation, new 
properties of these classical solutions can be found. 

Since H_reduced is a strictly monotonically decreasing
function along nonconstant integral curves of the
reduced Einstein flow, it is expected that under certain
conditions, the reduced Hamiltonian is monotonically
seeking to decay to its infimum. Thus, it is of interest to
look at the consequences for Hamiltonian reduction
of the following two assumptions:


1. The reduced Einstein field equations give rise
to the existence of a positive semiglobal nonconstant
solution (γ(t), p^TT(t)) defined for all
t ∈ (0, ∞) (or equivalently, for all τ ∈ (-∞, 0));

2. The reduced Hamiltonian strictly monotonically
decays to its infimum along nonconstant integral
curves,

\[
H_{\mathrm{reduced}}(\tau(t), \gamma(t), p^{TT}(t)) \longrightarrow \inf H_{\mathrm{reduced}}
\quad \text{as } t \to \infty \qquad [56]
\]


From [40] and [51], in terms of the physical
variables (g, π) (or (g, k)), [56] can be written
equivalently as

\[
-\tau^3 \operatorname{vol}(M, g) = -(\operatorname{tr}_g k)^3 \operatorname{vol}(M, g)
\longrightarrow \bigl( \tfrac{3}{2}\,(-\sigma(M)) \bigr)^{3/2}
\quad \text{as } t \to \infty \qquad [57]
\]

As a consequence of these assumptions, it follows
from [53] that the conformal volume vol(M, γ) must
also decay to its infimum [54] (although not
necessarily monotonically),

\[
\operatorname{vol}(M, \gamma(t)) \longrightarrow \inf_{\gamma \in \mathcal{M}_{-1}} \operatorname{vol}(M, \gamma)
= (-\sigma(M))^{3/2}
\quad \text{as } t \to \infty \qquad [58]
\]


Now suppose that σ(M) = 0. A large class of
manifolds for which this is true are the graph
manifolds (and thus also the Seifert manifolds) of
negative Yamabe type since σ(M) ≥ 0 for graph
manifolds in general and since σ(M) ≤ 0 for manifolds
of negative Yamabe type. In this case the curve
γ(t) ∈ M_{-1} of conformal metrics must necessarily
(conformally) volume collapse M in the direction of
cosmological expansion,

\[
\operatorname{vol}(M, \gamma(t)) \longrightarrow (-\sigma(M))^{3/2} = 0
\quad \text{as } t \to \infty \qquad [59]
\]


Consequently, the curve of conformal metrics γ(t)
must undergo some form of degeneration as its 
volume collapses. The details of this metric degen- 
eration are of importance and are discussed below. 

Not all locally homogeneous vacuum Bianchi 
models admit spatially compact quotients. Fortu- 
nately, the general theory of which Bianchi models 
admit spatially compact quotients has been worked 
out in detail by Tanimoto, Koike, and Hosoya (see 
Tanimoto et al. (1997) and the references therein). 
These Bianchi models together with their corresponding
Thurston classification and typical examples
of their closed quotient manifolds are listed in
Table 1, where “K-S” indicates “Kantowski-Sachs,”
“P,” “Z,” and “N” denote manifolds of Yamabe
type positive, zero, and negative, respectively (see
the section “Topological Background”), “Seifert”
means Seifert fibered, “Hyper” means hyperbolizable,
“?” indicates “unknown, but conjectured to be
so,” and “manifold collapse” denotes the type of
collapse that the conformal manifold (M, γ(t)) goes
through as the conformal volume vol(M, γ(t))
collapses. We also remark that all of the manifolds
listed in the “Typical examples” column are irreducible
with the exception of S¹ × S², which is prime
but not irreducible. Also, in this column, p ≥ 2.


Table 1 Bianchi, Thurston, and Yamabe type of a connected closed oriented irreducible 3-manifold

Bianchi type   Thurston type   Typical examples                  Yamabe type   σ-constant   Manifold structure   Manifold collapse

K-S            S² × R          S¹ × S²                           P             >0           Seifert
IX             S³              Nontrivial S¹-bundles over S²     P             >0           Seifert
I              R³              T³                                Z             0            Seifert
II             Nil             Nontrivial S¹-bundles over T²     N             0            Seifert              Total
III            H² × R          Σ_p × S¹, p ≥ 2                   N             0            Seifert              Pancake
VIII           SL(2, R)        Nontrivial S¹-bundles over Σ_p    N             0            Seifert              Pancake
VI_0           Sol             Nontrivial T²-bundles over S¹     N             0            Graph                Barrel
V, VII_h       H³              Closed hyperbolizable manifolds   N             <0?          Hyper                None

In this table, the eight Thurston types are grouped 
into three sets according to their Yamabe type. The 
first set of such Bianchi models are those that 
spatially compactify to yield 3-manifolds of positive 
Yamabe type which allow metrics with positive 
constant scalar curvature, for example, Bianchi IX 
models defined over spherical space forms. The 
second set (consisting of one type) yields manifolds 
of zero Yamabe type which allow zero scalar 
curvature metrics but not constant positive scalar 
curvature metrics, for example, Bianchi I models 
defined over T³ or one of the other five manifolds of
flat type finitely covered by T³. The third set (the
last five entries in Table 1 and the set of most 
interest in this article) yields manifolds of negative 
Yamabe type which do not allow metrics with zero 
scalar curvature. 

These latter models are the five Bianchi models of
types II, III, VIII, VI_0, and V (and in part VII_h),
which in turn correspond in Thurston's classification
to manifolds of type Nil, H² × R, SL(2, R), Sol, and
H³, respectively.

In the first three cases, the models of Bianchi type
II, III, and VIII compactify to a nontrivial S¹-bundle
over T² or to a trivial or nontrivial S¹-bundle over
Σ_p, p ≥ 2, respectively. Each of these spaces is Seifert
fibered. In the fourth case, the model of Bianchi type
VI_0 compactifies to a nontrivial T²-bundle over S¹
which is an irreducible graph manifold. Since each
of these manifolds is also of negative Yamabe type,
in each of these four cases, as discussed in the
beginning of this section, σ(M) = 0. In the fifth case,
we consider vacuum Bianchi V metrics as well
as a special case of Bianchi type VII_h, which
compactify to an arbitrary closed oriented hyperbolizable
manifold M.

For these latter five Bianchi models that spatially 
compactify to manifolds of negative Yamabe type, 
one can consider the classical solutions from
the point of view of Hamiltonian reduction. The 
starting point for this point of view is to use 
explicitly known vacuum metrics for the 
simplest “standard” metric forms, given, for exam- 
ple, in Wainwright and Ellis (1997). One need not 
consider all such possible spatially compact quoti- 
ents, even though that would appear to be quite 
feasible, but one need only consider some represen- 
tative examples for each of the Bianchi types listed. 

It can be shown by explicit calculation, using the
known solutions, that in the four nonhyperbolizable
cases where σ(M) = 0, each of the classical Bianchi
solutions gives rise to the existence of a positive
semiglobal nonconstant solution to the reduced Einstein
field equations and that along this solution, the reduced
Hamiltonian asymptotically approaches 0 under the
reduced Einstein flow, thereby confirming the expectation
that the reduced Hamiltonian asymptotically
approaches its infimum ((3/2)(-σ(M)))^{3/2} = 0. Thus
in these cases the reduced Einstein flow conformally
volume-collapses the 3-manifold.

The explicit calculations also show the details of
this collapse. In the second and third models of
Bianchi type III, Thurston type H² × R, and
Bianchi type VIII, Thurston type SL(2, R), respectively,
the conformal metric degenerates along
embedded circular fibers and this metric degeneration
causes M to collapse to its base manifold Σ_p,
p ≥ 2. Since the collapse is along one-dimensional
fibers and since the two-dimensional base manifold
Σ_p does not collapse, we refer to this type of
collapse as pancake collapse (see Figure 2).

In the fourth model of Bianchi type VI_0, Thurston
type Sol, the conformal metric degenerates along
embedded T²-fibers and this metric degeneration
causes M to collapse to its base manifold S¹. Since
the collapse is along two-dimensional fibers and
since the one-dimensional base manifold S¹ does not
collapse, we refer to this type of collapse as barrel
collapse (see Figure 3).

In the first model of Bianchi type II, Thurston type
Nil, as in the second and third models, the
conformal metric degenerates along embedded circular
fibers. Additionally, not only do the circular
fibers collapse but simultaneously the flat quotient
2-torus base manifold T² ≈ M/S¹ of M modulo its
circular fibers also collapses. Thus the metric
degeneration collapses M to a point, exhibiting a
case of total collapse. Thus these model universes
provide examples of nonflat almost-flat manifolds
that exhibit total collapse with bounded curvature.
Since the conformal geometries of these model
universes collapse to a point, they aptly deserve
their name Nil (see Figure 4).

Figure 2 Bianchi III, Thurston type H² × R, M = Σ_p × S¹,
pancake collapses to Σ_p, p = 2. The conformal geometry starts
with an infinite S¹ fiber at the big bang (t = 0⁺) and pancake
collapses with bounded curvature to Σ_p at infinite cosmological
expansion (t → ∞).

Figure 3 Bianchi VI_0, Thurston type Sol, nontrivial T²-bundle
over S¹, barrel collapses to S¹. The conformal geometry evolves
from a base manifold S¹ at the big bang (t = 0⁺). Instantaneously
after the big bang, flat T²-fibers bloom out of the
collapsed S¹ state. The conformal metric then expands to a
maximum volume and then barrel collapses with bounded
curvature back to the base manifold S¹ at infinite physical
cosmological expansion (t → ∞). The two facial 2-tori are flat
and are glued together by an orientation-reversing toral
automorphism so as to give a nontrivial T²-bundle over S¹.
The gray-scale density grading along the tube also indicates the
nontriviality of the bundle.

Remarkably, in each of these four cases of 
collapse, the collapse occurs with bounded curva- 
ture, precisely as occurs in the totally different 
setting of the Cheeger-Gromov theory of collapsing 
Riemannian manifolds, recognized many years ago 
to be of importance in the understanding of the 
behavior of sequences of metrics with uniform 
curvature bound (see Gromov (1999) for references 
and Anderson (2004) for other applications of 
Cheeger-Gromov theory to general relativity). 
What is somewhat remarkable is that the above 
cosmological models were constructed completely 
independently of that setting and thus provide 
naturally occurring cosmological models whose 
closed spatial hypersurfaces undergo conformal 
volume collapse and metric degeneration exactly as 
occurs in the theory of collapsing Riemannian 
manifolds. 

Of course, this volume collapse and metric
degeneration only occur as described in the conformal
variables. The physical variables behave
differently. Indeed, in contrast to the conformal
volume which collapses to zero in the first four cases
and is constant in the hyperbolizable case (see
below), the volume of the physical metric in all
five cases goes to infinity since the flow is
temporally oriented in the direction of infinite
cosmological expansion.

Figure 4 Bianchi II, Thurston type Nil, nontrivial S¹-bundle
over T², totally collapses to a point. The conformal geometry
evolves from a point at the big bang (t = 0⁺). Instantaneously
after the big bang, the full 3-manifold blooms out from that point.
The conformal geometry then evolves to a metric of maximum
volume and then totally collapses with bounded curvature back
to a point at infinite physical cosmological expansion (t → ∞).
The two 2-tori, represented here by doughnuts, are flat and
are glued together by an orientation-reversing toral automorphism
so as to give a nontrivial S¹-bundle over T².

In the fifth case where M is hyperbolizable, σ(M)
is conjectured to be negative and to be determined
by the hyperbolic volume, σ(M) = -(vol(M, γ_p))^{2/3},
of the hyperbolic conformal metric γ_p normalized so
that R(γ_p) = -1. In this case, γ_p together with
p^TT = 0 is a fixed point for the reduced Einstein
flow so that trivially the conformal volume does
not collapse. Moreover, if σ(M) is determined by
the volume of γ_p, then the constant reduced
Hamiltonian also trivially achieves its infimum
H_reduced(τ, γ_p, 0) = ((3/2)(-σ(M)))^{3/2} = (3/2)^{3/2} vol(M, γ_p),
again confirming the expectation for the
behavior of H_reduced on these Bianchi models.
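
The relations among the hyperbolic volume, the σ-constant, and the infimum of the reduced Hamiltonian for n = 3 can be sketched numerically. The function names and the sample volume below are hypothetical, chosen only for illustration; the sketch assumes the conjectured formula σ(M) = -(vol(M, γ_p))^{2/3} and the standard fact that a K = -1 hyperbolic 3-metric has R = -6, so that multiplying it by 6 produces the R = -1 normalization (K = -1/6) and scales the volume by 6^{3/2}.

```python
def normalized_volume(vol_k_minus_1: float) -> float:
    """Volume after rescaling a K = -1 hyperbolic 3-metric to R = -1.

    Multiplying the metric by 6 turns R = -6 into R = -1 and multiplies
    the 3-dimensional volume by 6**(3/2).
    """
    return 6.0 ** 1.5 * vol_k_minus_1


def sigma_constant(vol_normalized: float) -> float:
    """Conjectured sigma(M) = -(vol(M, gamma_p))**(2/3) for n = 3."""
    return -(vol_normalized ** (2.0 / 3.0))


def inf_reduced_hamiltonian(sigma: float) -> float:
    """Infimum ((3/2) * (-sigma))**(3/2) of the reduced Hamiltonian, n = 3."""
    return (1.5 * (-sigma)) ** 1.5


vol_unit = 1.0  # hypothetical K = -1 hyperbolic volume, illustration only
vol_norm = normalized_volume(vol_unit)
sigma = sigma_constant(vol_norm)
h_inf = inf_reduced_hamiltonian(sigma)
```

Under these assumptions the two expressions for the infimum agree, and the infimum works out to (3/2)^{3/2} · 6^{3/2} = 27 times the K = -1 hyperbolic volume.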

Note that for this static case, the physical 
variables behave as described after [46] and as 
shown in Figure 1. Also note that in contrast to 
Figures 2-4 where the conformal geometry is 
depicted, Figure 1 depicts the physical geometry. 

Overall, in all five cases, subject in the hyperbolizable
case to a hyperbolic metric realizing the
σ-constant, the reduced Hamiltonian asymptotically
approaches its σ-constant infimum along the flow
lines of the reduced Einstein system. In doing so, the
volumes of the conformal metrics either go to zero
(in the first four cases) or to the hyperbolic volume
(in the hyperbolic case). In all five cases, the
curvature of the conformal metrics is uniformly
bounded.

Because the reduced Einstein field equations behave
as expected for the Bianchi models that we have 
considered with spatially compactified manifolds 
being either Seifert fibered, graph, or hyperbolizable, 
it seems plausible that for a more complicated starting 
manifold M, the reduced Einstein flow may induce a 
decomposition of M into geometric pieces. Indeed, 
Anderson’s conjectures (Anderson 1997) predict how 
a sequence of geometries with bounded curvature 
approaching σ(M) degenerates. Assuming these conjectures,
the asymptotic behavior of large classes of
Einstein spacetimes may perhaps be characterized 
rather explicitly in terms of the geometrization 
program of 3-manifolds (see the next section). 

Conversely, it is conceivable that the damped 
hyperbolic system of equations defined by the reduced 
Einstein flow (with its strictly monotonically decreas- 
ing reduced Hamiltonian on nonconstant curves) 
could be used to try to establish some form of the 
geometrization conjectures for 3-manifolds, much like 
the parabolic system of equations defined by Ricci 
flow is currently being used. If such a program were to 
be successful, it would amount to a spectacular 
consequence of Einstein’s equations, implying as it 
does that geometrization may actually occur in nature. 


Possible Cosmological Applications 
of the Reduced Hamiltonian 


Astronomical observations strongly support the view 
that in a sufficiently coarse-grained sense, the universe 
is homogeneous and isotropic. Furthermore, it is 
expanding at such a rate, relative to its observable 
energy density, that it will continue to expand forever. 
The simplest cosmological model consistent with these 
properties and which has a vacuum limit is the k = -1
FLRW model. Spatially compactified variants of this 
model are still locally homogeneous and isotropic even 
though they are no longer globally so (see the 
discussion after [46]). Evidence for one or another of 
the infinitely many compactifications possible could be 
sought in patterns of fluctuations of the cosmic 
microwave background radiation and the detection 
of such patterns could be strong evidence for a 
spatially closed universe.

However, is one really justified in extrapolating 
local observations of that portion of the universe 
visible to astronomers to a conclusion about its 
global topology? Could it be instead that there is a 
dynamical reason, provided by Einstein’s equations, 
for the observed fact that the universe seems to be 
locally homogeneous and isotropic and in such a 
state as to continue expanding forever? 

Suppose for the sake of argument that the 
universe has a more complicated topology, such as
that of one of the generic K(π, 1)-manifolds which
does not admit a locally homogeneous and isotropic 
metric even though its hyperbolizable components 
would each individually do so. A plausible scenario 
suggested by the results in this article is that under 
the Einstein evolution, the reduced Hamiltonian given 
by [40] consisting of the rescaled spatial volume 
becomes asymptotically dominated in the future 
direction of cosmological expansion by the contribu- 
tion of the hyperbolizable components. On each of 
these components, the limiting conformal metric 
approaches local homogeneity and isotropy with the 
relative contribution of the graph-manifold constitu- 
ents, if any are present, collapsing asymptotically to a 
negligible fraction of the whole. The idea is that if 
structure formation develops sufficiently late in the 
evolution of such a universe, then it should occur, with 
overwhelmingly high probability, in those regions 
which dominate the conformal volume and admit an 
asymptotically locally homogeneous and isotropic 
metric of constant negative curvature, locally indistinguishable
from a k = -1 FLRW model.

One can speculate still further and imagine what 
happens if the spatial topology is not of prime type 
but rather consists of a connected sum of several 
K(π, 1)’s together perhaps with nontrivial spherical
manifolds S³/Γ and handles S¹ × S². Here it seems
conceivable, especially in view of the expected 
tendency of spherical manifolds to “recollapse,” that 
the evolving universe would develop pinch-off singu- 
larities along the essential 2-spheres that separate the 
individual prime factors. Such singularities might 
occur in finite time between connected sums of 
spherical recollapsing factors or in infinite time 
between connected sums of K(π, 1)-factors. Similar
patterns of singularity formation are seen to occur in 
Ricci flow and must be treated in the resolution of 
the 3-manifold geometrization program. 

Of course there is no proof of such behavior for the 
full (3 + 1)-dimensional Einstein gravity but for the 
model problem of Einstein's theory in (2 + 1) dimen- 
sions, something close to a proof of the analogous 
conjecture is already at hand. In the vacuum case, 
which can be described rather explicitly, one can 
construct the generic solution for a higher genus 
surface topology by cutting open the corresponding 
k = -1 FLRW model and gluing in the so-called
Kasner wedges. These wedges play the role of the
graph-manifold constituents of a generic K(π, 1)-manifold
in three dimensions and evolve anisotropically.
However, it is known rigorously in this case that
the rescaled spatial area H_reduced(τ, γ, p^TT) =
(-τ)² Area(Σ_p, g) is asymptotically exhausted by the
FLRW components with the contribution from the flat
Kasner anisotropic pieces shrinking to zero in this
limit. If certain types of matter sources are included,
for example, those analogous to terms which result 
from Kaluza—Klein reduction of vacuum gravity in 
(3 + 1)-dimensions, then a similar result can be proved 
at least for sufficiently small but fully nonlinear 
perturbations away from the vacuum backgrounds 
(see Choquet-Bruhat (2004)). 

In fully general (3 + 1)-dimensional gravity, there 
are few known topologically general results beyond 
those mentioned earlier and the problem is compli- 
cated by the presence of gravitational waves (which 
are absent in (2+ 1) dimensions) and the fact that 
on such more general manifolds, there are no known 
“background” solutions to perturb about. However, 
for the special case of (future) vacuum evolution on 
a pure closed hyperbolizable manifold, one can show 
that if the initial data is sufficiently close to that of an 
FLRW model, then the fully nonlinear gravitational 
perturbations eventually die out leaving a locally 
homogeneous and isotropic model in the asymptotic 
limit (see Andersson and Moncrief (2004)). It seems 
likely that this result can be generalized to allow for 
the inclusion of various types of matter sources as in 
the (2 + 1)-dimensional case. 


Hamiltonian Systems: Obstructions to Integrability


Introduction


In the study of differential systems, and particularly 
of Hamiltonian differential equations, a fundamen- 
tal problem is the question of their integrability. 
Because there are different definitions of this notion, 
a system which is integrable according to one 
definition can be nonintegrable according to another 
one. The notion of integrability is connected to the 
existence of a sufficiently large number of first 
integrals, which are linked to conservation laws. For 
a real analytic Hamiltonian system with n degrees of
freedom, “complete integrability” means the
existence of n first integrals, which are functionally
independent, and “in involution,” in the entire phase
space. These integrals can be functions of class C^r
(r finite), C^∞, or analytic.

For the classical problems of Hamiltonian 
mechanics which are integrable, their first integrals 
can be continued into the complex domain of the 
variables, as one-valued holomorphic, or mero- 
morphic, functions of complex time. This fact leads 
to the concept of “complex integrability” of a 
system. Note that a real Hamiltonian system which 
is integrable may be nonintegrable in the complex 
domain, if the real first integrals cannot be con- 
tinued as one-valued holomorphic functions of the 
complex time. 

Generally, the branching of solutions of a system, 
as functions of complex time, is an obstruction to 
the existence of one-valued first integrals. To study 
this problem, one can, following Poincaré, expand 
the solutions in convergent series of a small 
parameter: this is the basis of “perturbation methods,”
and the main fact is that a small perturbation
of an integrable Hamiltonian system generally 
destroys its integrability. Another method of proving 
nonintegrability consists of studying the linearized 
equations along a particular solution. This last 
direction has been exploited recently, in particular, 
through methods based on algebraic results inspired 
by differential Galois theory. 


Hamiltonian Systems and Mechanics 


Let us consider a conservative holonomic real
dynamical system with n degrees of freedom: the
positions of this system are points of an
n-dimensional real manifold N (the state space or
configuration space) with local coordinates
x_1, x_2, ..., x_n. If the velocities are denoted by
ẋ_i = dx_i/dt, we consider the Lagrangian function L
associated to this system:

\[
L(x_1, \dots, x_n, \dot{x}_1, \dots, \dot{x}_n) = T(x, \dot{x}) - V(x)
\]

where ẋ = (ẋ_1, ..., ẋ_n) is a tangent vector to the
manifold N at the point x = (x_1, ..., x_n). The kinetic
energy T(x, ẋ) is a positive-definite quadratic form in
ẋ_1, ..., ẋ_n, and V(x) is the potential energy, whose
gradient determines the forces acting on the system.

The motions (x_1(t), x_2(t), ..., x_n(t)) of the system
on the manifold N are the extremals of the action
integral ∫ L(x, ẋ) dt (“principle
of stationary action of Hamilton”) and they are
the solutions of the Euler-Lagrange system, which
consists of differential equations of second order
for the coordinates x_1, x_2, ..., x_n (Whittaker 1904):

\[
\frac{d}{dt}\Bigl( \frac{\partial L}{\partial \dot{x}_i} \Bigr) - \frac{\partial L}{\partial x_i} = 0,
\qquad 1 \le i \le n
\]

This system can be written in the Hamiltonian
form: the Lagrangian L is a function defined on the
tangent bundle TN of the state space N, with local
coordinates x_1, ..., x_n, ẋ_1, ..., ẋ_n (i.e., an element of
TN consists of a point x of N, together with a tangent
vector to N at x). Now, we consider the cotangent
bundle T*N: an element of T*N consists of a point x
of N together with a cotangent vector to N at x, that is, a
linear form defined in the tangent space to N at x. In
local coordinates, the components of this linear form
are y_1, ..., y_n, defined by y_i = ∂L/∂ẋ_i; y_1, ..., y_n are
called the generalized momenta, or impulsions. x_i
and y_i are called conjugate canonical variables.

The mapping from TN to T*N thus defined is the
Legendre transformation (Abraham and Marsden
1967). Through it, the Euler-Lagrange equations
become a system of 2n differential equations of first
order:

\[
\frac{dx_i}{dt} = \frac{\partial H}{\partial y_i}, \qquad
\frac{dy_i}{dt} = -\frac{\partial H}{\partial x_i}, \qquad 1 \le i \le n
\]

where

\[
H(x_1, \dots, x_n, y_1, \dots, y_n) = \sum_{i=1}^{n} y_i \dot{x}_i - L(x_1, \dots, x_n, \dot{x}_1, \dots, \dot{x}_n)
\]
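
As an illustration of the Legendre transformation, the following worked example treats a hypothetical one-degree-of-freedom system with mass m and potential V(x); the mass parameter is an assumption introduced only for this sketch.

```latex
\begin{align*}
L(x, \dot{x}) &= \tfrac{1}{2} m \dot{x}^2 - V(x), \\
y &= \frac{\partial L}{\partial \dot{x}} = m \dot{x}
   \quad\Longrightarrow\quad \dot{x} = \frac{y}{m}, \\
H(x, y) &= y\,\dot{x} - L = \frac{y^2}{m} - \frac{y^2}{2m} + V(x)
        = \frac{y^2}{2m} + V(x), \\
\frac{dx}{dt} &= \frac{\partial H}{\partial y} = \frac{y}{m}, \qquad
\frac{dy}{dt} = -\frac{\partial H}{\partial x} = -V'(x).
\end{align*}
```

Eliminating y from the last two equations gives m ẍ = -V'(x), which is the Euler-Lagrange equation of L, so the two formulations agree for this example.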


H is the Hamiltonian function of this system. The
solutions of these differential equations are curves
on the 2n-dimensional manifold T*N, whose projections
in the n-dimensional state manifold N coincide
with the solutions of the Lagrangian system. T*N is
called the phase space of the system. The right-hand
sides of the differential system define a vector
field in the phase space.

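Let M = T*N. On this 2n-dimensional manifold,
consider the standard symplectic form Ω =
Σ_{i=1}^{n} dy_i ∧ dx_i. If f and g are C^∞-functions on M,
we define their Poisson bracket {f, g} in local
coordinates by

\[
\{f, g\} = \sum_{i=1}^{n} \Bigl( \frac{\partial f}{\partial x_i} \frac{\partial g}{\partial y_i}
 - \frac{\partial f}{\partial y_i} \frac{\partial g}{\partial x_i} \Bigr)
\]

It defines the space C^∞(M) as a Lie algebra over R.

Then, if H ∈ C^∞(M) is the Hamiltonian function
associated to a system, the corresponding Hamiltonian
equations can be written as the following 2n
“canonical equations” (Arnol'd 1976):

\[
\frac{dx_i}{dt} = \{x_i, H\}, \qquad \frac{dy_i}{dt} = \{y_i, H\}, \qquad 1 \le i \le n \qquad [1]
\]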
A function F ∈ C^∞(M) is a (first) integral of eqns [1]
if it is constant along any solution of [1], that is, if it
verifies: {F, H} = 0. Thus, a first integral is a quantity
which is preserved along a solution (“conservation
law”). In particular, H itself is a first integral of the eqns
[1]. It represents the “total energy” of the system.
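
The criterion {F, H} = 0 can be tested numerically. The sketch below approximates the Poisson bracket by central differences; the sign convention in the docstring, the quadratic two-degree-of-freedom Hamiltonian, the coefficients `A`, and all function names are assumptions made for illustration only. The vanishing of {F, H} does not depend on the sign convention chosen.

```python
def poisson_bracket(f, g, x, y, eps=1e-6):
    """Approximate {f, g} at the phase-space point (x, y).

    Assumed convention:
    {f, g} = sum_i (df/dx_i * dg/dy_i - df/dy_i * dg/dx_i).
    """
    def partial(func, block, i):
        # Central difference in coordinate i of the x block or the y block.
        xp, xm, yp, ym = list(x), list(x), list(y), list(y)
        if block == "x":
            xp[i] += eps
            xm[i] -= eps
        else:
            yp[i] += eps
            ym[i] -= eps
        return (func(xp, yp) - func(xm, ym)) / (2.0 * eps)

    return sum(
        partial(f, "x", i) * partial(g, "y", i)
        - partial(f, "y", i) * partial(g, "x", i)
        for i in range(len(x))
    )


A = (1.0, 4.0)  # hypothetical positive coefficients a_1, a_2


def hamiltonian(x, y):
    # Two uncoupled oscillators: H = (1/2) * sum(y_i^2 + a_i * x_i^2).
    return 0.5 * sum(yi * yi + ai * xi * xi for xi, yi, ai in zip(x, y, A))


def first_oscillator_energy(x, y):
    # Energy of the first oscillator alone; expected to be a first integral.
    return 0.5 * (y[0] ** 2 + A[0] * x[0] ** 2)


def mixed_quantity(x, y):
    # An arbitrary function that is not conserved by this Hamiltonian.
    return x[0] * y[1]


point_x, point_y = (0.3, -0.5), (0.7, 0.2)
bracket_energy = poisson_bracket(first_oscillator_energy, hamiltonian, point_x, point_y)
bracket_mixed = poisson_bracket(mixed_quantity, hamiltonian, point_x, point_y)
bracket_self = poisson_bracket(hamiltonian, hamiltonian, point_x, point_y)
```

Here `bracket_energy` and `bracket_self` vanish up to rounding, while `bracket_mixed` equals y_1 y_2 + a_2 x_1 x_2 at the sample point and is nonzero, so `mixed_quantity` is not a first integral.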


1. The simplest example of a Hamiltonian system is
the harmonic oscillator defined by the one-degree-of-freedom
Hamiltonian:

\[
H(x, y) = \tfrac{1}{2} y^2 + \tfrac{1}{2} x^2
\]

It possesses the energy integral H. Thus, the trajectories
in the phase space R² (phase plane) are given by
x² + y² = 2h, which are concentric circles if the
constant energy verifies h > 0. The phase space R² is
foliated by these circles. The system is said to be
“integrable.” Obviously, it is also possible to construct
Hamiltonian systems with n degrees of freedom
(n > 1), by coupling n harmonic oscillators, with a
Hamiltonian defined by

\[
H(x_1, \dots, x_n, y_1, \dots, y_n) = \frac{1}{2} \sum_{i=1}^{n} y_i^2 + \frac{1}{2} \sum_{i=1}^{n} a_i x_i^2
\]

with n constant coefficients a_i > 0.

2. Another example of a Hamiltonian system with one
degree of freedom is the simple mathematical pendulum.
The state coordinate is the angle $\theta$ of the
pendulum with the vertical axis, defined modulo $2\pi$.
The phase space is $M = S^1 \times \mathbb{R}$
($x = \theta \pmod{2\pi} \in S^1$, $y \in \mathbb{R}$), that is, a cylinder.
The Hamiltonian function is $H(x, y) = (1/2) y^2 - \cos x$;
$H$ is a first integral of the differential equations, the
system is integrable, and the trajectories on the cylinder
$S^1 \times \mathbb{R}$ are defined by $(1/2) y^2 - \cos x = h$.
According to the constant value of $h$ on each phase curve,
and with $h_0$ denoting the energy of the unstable equilibrium,
the solutions are periodic oscillations of the pendulum
(if $h < h_0$), periodic solutions of rotation where the angle
varies monotonically with time (if $h > h_0$), two equilibria
(one stable, one unstable), and solutions which "begin"
when $t \to -\infty$ at the unstable equilibrium and "finish"
when $t \to +\infty$ at the same point (if $h = h_0$): the
corresponding phase curves are called "separatrices."
3. The system of Hénon-Heiles (Hénon and Heiles
1964) is a system with two degrees of freedom. The
phase space is $\mathbb{R}^2 \times \mathbb{R}^2$ and the Hamiltonian is
defined by

\[ H(x_1, x_2, y_1, y_2) = \tfrac{1}{2} (y_1^2 + y_2^2) + \tfrac{1}{2} (x_1^2 + x_2^2) + x_1^2 x_2 - \tfrac{\lambda}{3} x_2^3 \]

where $\lambda$ is a real constant. This system is "integrable"
for some isolated values of the parameter $\lambda$ (Ziglin
1983) and "nonintegrable" otherwise. Of course, it is
necessary to define the integrability of a Hamiltonian
system, although according to Poincaré: "A system of
differential equations is only more or less integrable."
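
Returning to the simple pendulum of example 2, the classification of its orbits by energy can be sketched in Python as follows. The sketch takes $h_0 = 1$, the value of $H(x, y) = (1/2) y^2 - \cos x$ at the unstable equilibrium $(x, y) = (\pi, 0)$; the function names and the tolerance `tol` are illustrative assumptions, and the floating-point comparison with $h_0$ is only approximate.

```python
import math

H0 = 1.0  # energy of the unstable equilibrium: H(pi, 0) = -cos(pi) = 1


def pendulum_energy(x, y):
    """H(x, y) = y^2 / 2 - cos(x) for the simple pendulum."""
    return 0.5 * y**2 - math.cos(x)


def classify(x, y, tol=1e-12):
    """Name the type of phase curve through the point (x, y)."""
    h = pendulum_energy(x, y)
    if h <= -1.0 + tol:
        return "stable equilibrium"      # minimum of H, at x = 0, y = 0
    if abs(h - H0) <= tol:
        # On the level h = h0: either the equilibrium itself or a separatrix.
        return "unstable equilibrium" if abs(y) <= tol else "separatrix"
    if h < H0:
        return "oscillation"             # closed curve around the stable point
    return "rotation"                    # angle varies monotonically
```

For instance, `classify(0.5, 0.0)` falls in the oscillation regime and `classify(0.0, 3.0)` in the rotation regime.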


Integrability of Hamiltonian Systems


Generally, if a differential system is of order $p$, it is
necessary to know $p$ first integrals to integrate it. But if
the system is Hamiltonian of order $2n$, only $n$ first
integrals are sufficient to integrate it "by quadratures,"
that is, by "algebraic" operations such as integrations
and inversions of functions. The reason is that the
existence of one first integral allows us to reduce the
order of the system by two: a system of order $2n$ with
one first integral can be reduced to order $2n - 2$.


Theorem of Liouville (see Arnol'd (1976)). Suppose
that $F_1, F_2, \ldots, F_n \in C^\infty(M)$ are $n$ first integrals
of the Hamiltonian system [1] which are "in
involution," that is, such that $\{F_i, F_j\} = 0$, $\forall i, j$, and
suppose that they are functionally independent, that
is, the $n$ differentials $dF_i$ are linearly independent at
each point of the level set $M_f$ defined by

\[ M_f = \{ z \in M : F_i(z) = f_i, \; i = 1, \ldots, n \} \]

Then:

(i) the set $M_f$ is a manifold which is invariant
along the solutions of the system [1];

(ii) if $M_f$ is compact and connected, it is diffeomorphic
to an $n$-dimensional torus

\[ T^n = S^1 \times S^1 \times \cdots \times S^1 = \{ (\varphi_1, \ldots, \varphi_n) : \varphi_i \in \mathbb{R} / 2\pi\mathbb{Z} \}; \]

(iii) the Hamiltonian flow on each torus $M_f$ is linear
and "quasiperiodic" with frequencies $\omega_i$ defined
by $d\varphi_i/dt = \omega_i(f_1, f_2, \ldots, f_n)$; and

(iv) the Hamiltonian equations are integrable by
quadratures.


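If a Hamiltonian satisfies the assumptions of the
theorem of Liouville, one can prove that there exist,
locally, canonical coordinates $\varphi = (\varphi_1, \ldots, \varphi_n)$ and
$I = (I_1, \ldots, I_n)$ such that the Hamiltonian function
depends only on the variables $I_i$. Then

\[ \frac{d\varphi_i}{dt} = \frac{\partial H}{\partial I_i}, \qquad \frac{dI_i}{dt} = -\frac{\partial H}{\partial \varphi_i} = 0 \]

These equations are immediately integrated as
follows:

\[ I_i = \text{constant}, \quad \text{and} \quad \varphi_i = \omega_i t + \varphi_i(0), \quad \text{with} \quad \omega_i(I_1, I_2, \ldots, I_n) = \left. \frac{\partial H}{\partial I_i} \right|_{I = \text{cst}} \]

Such local coordinates $(\varphi_i, I_i)$ are called "action-angle"
variables. They were defined for the first time
by Delaunay and they play an important part in the
theory of perturbations.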


Remark An invariant torus $T^n$ of the theorem of
Liouville is characterized by the constant values of the
actions $I_i$, which determine the frequencies $\omega_i$ on it.
Such a torus is said to be nonresonant if the relation
between the frequencies $\sum_i k_i \omega_i = 0$ (where
$k_1, \ldots, k_n$ are integers) implies that $k_i = 0$, $\forall i$. The
frequencies $\omega_i$ are then rationally independent. If a
torus is nonresonant, the phase trajectories are dense
on it and the motion is quasiperiodic.


A torus is said to be resonant if the frequencies $\omega_i$
are rationally dependent: they satisfy a relation
$\sum_i k_i \omega_i = 0$ with $(k_1, \ldots, k_n) \neq (0, \ldots, 0)$. Then
the phase trajectories are not dense on the torus;
they belong to tori of lower dimension.
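
Whether a given frequency vector is resonant can be probed numerically, though only up to a finite order and a tolerance: floating-point arithmetic cannot establish rational independence. The following Python sketch searches for a low-order resonance relation $\sum_i k_i \omega_i = 0$; the names `find_resonance`, `max_order`, and `tol` are hypothetical and introduced here for illustration.

```python
from itertools import product


def find_resonance(omega, max_order=5, tol=1e-12):
    """Search for a nonzero integer vector k with |k_i| <= max_order
    and |k . omega| < tol. Return k as a tuple, or None if none is found.

    A None result only means no relation exists within the searched box;
    it does not prove that the torus is nonresonant.
    """
    n = len(omega)
    for k in product(range(-max_order, max_order + 1), repeat=n):
        if not any(k):
            continue  # skip k = (0, ..., 0)
        if abs(sum(ki * wi for ki, wi in zip(k, omega))) < tol:
            return k
    return None


# Frequencies in ratio 1:2 are resonant; (1, sqrt(2)) shows no relation
# within the searched range.
k_resonant = find_resonance((1.0, 2.0))
k_none = find_resonance((1.0, 2 ** 0.5))
```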

A consequence of the theorem of Liouville is that, 
if a two-degree-of-freedom Hamiltonian system 
possesses one first integral F (in addition to H, and 
independent of H), it is integrable because F is 
necessarily in involution with H: {F,H}=0. 

An example of a system with three degrees of
freedom which is integrable is the Lagrangian
symmetric top with one fixed point (there exists a
cylindrical symmetry for the moments of inertia and the
center of mass is on the symmetry axis). This system
possesses three first integrals that are in involution
and independent: $H$, and the angular momenta $M_z$
and $M_3$, which correspond to the (constant)
frequencies of precession and nutation of the top.
The level sets $M_f$ are here tori of dimension 3, which
are indexed by the three frequencies (or by the
constant values of the three integrals).

There are other integrable cases for this problem
of a rigid body with a fixed point (see Kozlov
(1983)): Euler's case (when the fixed point is the
center of mass); Kowalevskaya's case (in which
the moments of inertia satisfy two relations and the
third coordinate of the center of mass vanishes; see
Kowalevski (1889)); and the Goryachev-Chaplygin
case, which is integrable only on a single integral
level.

A fundamental and classical example of an integrable
Hamiltonian system is Kepler's problem: the
motion of a point mass in the gravitational
(Newtonian) field of a center, for instance, a planet
in the field of attraction of the Sun.

Another example is the problem of two fixed centers:
an infinitesimal mass in the field of two centers, a problem
which was integrated by Lagrange (Lagrange 1810).


Isolated Periodic Orbits 
and Nonintegrability 


We consider a real Hamiltonian system with $n$
degrees of freedom and we suppose that there exists
a particular $T$-periodic solution $\Gamma_T$ (which is not an
equilibrium). Along $\Gamma_T$, we consider the linearized
equations deduced from the Hamiltonian system.
They can be decoupled into the tangential equation
(one degree of freedom), which possesses the first
integral $dH$, and the normal variational system, which
can be written as

\[ \frac{d\xi}{dt} = J \cdot K(\Gamma_T(t)) \cdot \xi \qquad [2] \]

where

\[ J = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix} \]

is the standard symplectic matrix of order $2(n - 1)$
and $K(\Gamma_T(t))$ is a $T$-periodic matrix depending on
the solution $\Gamma_T$.

The solutions of the linear system [2] form a
vector space. As a definition, the monodromy
matrix $M(T)$ expresses how fundamental solutions
of the linear system [2] are transformed after one
period $T$, that is, along the periodic closed orbit $\Gamma_T$:

\[ \xi(t + T) = M(T) \cdot \xi(t) \]

Poincaré showed that if one of the eigenvalues of
$M(T)$ is different from 1, then the periodic solution
$\Gamma_T$ is isolated. Furthermore, if the number of first
integrals of the Hamiltonian system, independent
along $\Gamma_T$, is equal to $k$, then at least $2k$ eigenvalues
of $M(T)$ are equal to 1.
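
A monodromy matrix can be approximated numerically by integrating the linear system over one period, starting from the basis vectors. The Python sketch below does this for a $2 \times 2$ normal variational system (the case $n = 2$) with a classical fourth-order Runge-Kutta scheme. The function name `monodromy_matrix`, the step count, and the constant test matrix $K = \mathrm{diag}(\omega^2, 1)$ are assumptions made for illustration; for that test matrix the exact flow is a rotation-like map with trace $2\cos(\omega T)$ and determinant 1.

```python
import math


def monodromy_matrix(K, T, steps=2000):
    """Integrate d(xi)/dt = J K(t) xi over [0, T] with RK4, J = [[0, 1], [-1, 0]].

    K(t) must return a 2x2 matrix as nested lists. Column j of the result
    is the image after time T of the j-th basis vector.
    """
    def rhs(t, v):
        k = K(t)
        kv0 = k[0][0] * v[0] + k[0][1] * v[1]
        kv1 = k[1][0] * v[0] + k[1][1] * v[1]
        return (kv1, -kv0)  # J applied to K v

    def rk4_step(t, v, h):
        k1 = rhs(t, v)
        k2 = rhs(t + h / 2, (v[0] + h / 2 * k1[0], v[1] + h / 2 * k1[1]))
        k3 = rhs(t + h / 2, (v[0] + h / 2 * k2[0], v[1] + h / 2 * k2[1]))
        k4 = rhs(t + h, (v[0] + h * k3[0], v[1] + h * k3[1]))
        return (v[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
                v[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))

    h = T / steps
    columns = []
    for v in ((1.0, 0.0), (0.0, 1.0)):
        t = 0.0
        for _ in range(steps):
            v = rk4_step(t, v, h)
            t += h
        columns.append(v)
    # M[i][j] is the i-th component of the j-th column.
    return [[columns[0][0], columns[1][0]],
            [columns[0][1], columns[1][1]]]


omega, period = 1.5, 2.0  # illustrative values
M = monodromy_matrix(lambda t: [[omega**2, 0.0], [0.0, 1.0]], period)
trace = M[0][0] + M[1][1]
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
```

Since the determinant is 1, the two eigenvalues are reciprocal, and they are both equal to 1 only when the trace equals 2.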


Theorem (Poincaré 1892). If the Hamiltonian
system possesses $n$ integrals in involution, and
independent along a periodic solution $\Gamma_T$, then $\Gamma_T$
is nonisolated.


Then, if the Hamiltonian system possesses a dense 
set of isolated periodic orbits, it cannot have n 
integrals in involution and independent in an open 
domain. 


Nearly Integrable Hamiltonian Systems, 
Theorem of Poincaré 


Consider the Hamiltonian system with $n$ degrees of
freedom, depending on a small real parameter
$\varepsilon \in (-\varepsilon_0, +\varepsilon_0)$, defined by the analytic function $H$:

\[ H(\varphi, I, \varepsilon) = H_0(I) + \varepsilon \cdot H_1(\varphi, I) \qquad [3] \]

where $\varphi = (\varphi_1, \ldots, \varphi_n) \in T^n$, $I = (I_1, \ldots, I_n) \in \mathbb{R}^n$, and
where $H_1$ is periodic in the angles $\varphi_i$.

This system is called "nearly integrable" because
when $\varepsilon = 0$, the "unperturbed system" $H_0$ is integrable
in the action-angle variables $\varphi, I$:

\[ H(\varphi, I, 0) = H_0(I) \]

then

\[ \frac{dI}{dt} = 0, \qquad \frac{d\varphi}{dt} = \frac{\partial H_0}{\partial I} = \omega(I) \]

a system which can be integrated by quadratures:

\[ I = I^0 \quad \text{and} \quad \varphi = \varphi^0 + \omega(I^0) \cdot t \]

According to the theorem of Liouville, the motion of
the unperturbed problem takes place on $n$-dimensional
tori $(S^1)^n$ in the phase space. On these
invariant tori, indexed by the actions $I$, the motion
is generally quasiperiodic (if the frequencies $\omega(I)$ are
rationally independent).

We are now interested in studying the perturbed
system [3] with $\varepsilon \neq 0$, and its integrability which is,
according to Poincaré (1892), "the fundamental
problem of dynamics." This problem of nearly
integrable Hamiltonian systems is directly inspired
by celestial mechanics where the motions in the
solar system are, in a first approximation, described
by the (integrable) Kepler problem. In particular,
the "restricted three-body problem" is the study of the
motion of a planet in the gravitational field of the Sun,
with the perturbative attraction of Jupiter. It is also the
problem of the Moon in the field of the Earth, with
the perturbative attraction of the Sun (Poincaré 1892).


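Theorem of Poincaré (Poincaré 1892). Assume
that, in the Hamiltonian function [3]:

(i) (nondegeneracy condition) the unperturbed
Hamiltonian $H_0$ is nondegenerate, that is,

\[ \det\left( \frac{\partial \omega_i}{\partial I_j} \right) = \det\left( \frac{\partial^2 H_0}{\partial I_i \, \partial I_j} \right) \neq 0 \]

in an open domain of the phase space;

(ii) (genericity condition) no coefficient $h_k(I)$ in the
Fourier expansion of $H_1$, with

\[ H_1(\varphi, I) = \sum_{k \in \mathbb{Z}^n} h_k(I) \, e^{i k \cdot \varphi} \]

vanishes identically in the nonresonant
domain $G \subset \mathbb{R}^n$ of the actions defined by

\[ G = \left\{ I \in \mathbb{R}^n : \sum_{i=1}^{n} k_i \, \omega_i(I) \neq 0 \right\} \]

then there is no analytic first integral $F(\varphi, I, \varepsilon)$
independent of the Hamiltonian function $H$.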


Thus, a perturbation of a nondegenerate integrable 
Hamiltonian system is generically nonintegrable. 

When one wants to apply this theorem to celestial 
mechanics, a peculiarity is that the unperturbed 
problem corresponds to the Keplerian system, which 
is degenerate, and this is a specific difficulty of these 
systems. 


Splitting of Separatrices and 
Nonintegrability 


Consider a Hamiltonian system with $n = 2$ degrees
of freedom defined as in eqn [3] by a perturbation
of an integrable Hamiltonian:

\[ H(\varphi_1, \varphi_2, I_1, I_2, \varepsilon) = H_0(I_1, I_2) + \varepsilon \cdot H_1(\varphi_1, \varphi_2, I_1, I_2) \qquad [4] \]

The unperturbed problem is integrable and its four-dimensional
phase space is foliated by two-dimensional
invariant tori $T^2$: $I$ = constant. If $H_0$ is nondegenerate,
the nonresonant tori are dense and the resonant tori also
are dense in the phase space.

According to Kolmogorov's theorem and
the Kolmogorov-Arnol'd-Moser (KAM) theory
(Arnol'd 1985), the majority of the nonresonant
tori of the unperturbed problem $H_0$ are preserved
in the full problem [4]: they are slightly
deformed, and are invariant in the perturbed
system. The resonant tori of $H_0$ are destroyed in
the perturbed problem.

Now we consider, in the phase space, a surface $S$
transverse to the invariant tori $T^2$ of the perturbed
system. A trajectory of the system generated by [4],
which crosses $S$ through a point $w_0$, will cross $S$
again, for the first time, through a point $w_1$: this
defines the "first return map" or "Poincaré map"
$R : w_0 \mapsto R(w_0) = w_1$. $S$ is called a Poincaré section.
If $w_0$ belongs to a preserved invariant torus
of the perturbed system, the successive points
$w_0$, $w_1 = R(w_0)$, $w_2 = R(w_1)$, $w_3 = R(w_2)$, ... belong
to the intersection of this torus with $S$; thus, they
belong to a curve diffeomorphic to a circle, which is
an invariant curve of the map $R$. If $w_0$ does not
belong to a preserved invariant torus of [4], the
sequence of points $w_0, w_1, w_2, \ldots$ generated by the
Poincaré map belongs to a curve much more
complicated than a curve diffeomorphic to a
circle (Poincaré 1890, Arnol'd 1985) and the
"chaotic" behavior of this sequence is the mark of
the nonintegrability of the system [4].

The best way of numerically showing the
"evidence" of nonintegrability is to study the
example of a system with "one and a half" degrees
of freedom, that is, a system with one degree of
freedom whose Hamiltonian depends on time:
$H(\varphi, I, t)$. An example of such a system is the
problem of a mathematical pendulum whose length
$l$ performs periodic oscillations, defined by the
Hamiltonian function

\[ H(\varphi, p, t) = \frac{p^2}{2} - \omega^2 \left( 1 - \varepsilon \cdot f(t) \right) \cos\varphi \qquad [5] \]

where $\varphi \in S^1$, $p \in \mathbb{R}$, and $f$ is periodic of period $T$.

The unperturbed system ($\varepsilon = 0$) is integrable (one
degree of freedom with a Hamiltonian independent
of $t$):

\[ H_0(\varphi, p) = \frac{p^2}{2} - \omega^2 \cos\varphi \]

The phase portrait of this problem is similar to the
one of the simple mathematical pendulum of
constant length: on the cylinder $S^1 \times \mathbb{R}$ there are
two equilibria (stable and unstable) and separatrices
"beginning" and "finishing" at the hyperbolic point
$\chi$. The invariant stable and unstable manifolds
associated to $\chi$ and represented by these separatrices
were called by Poincaré "homoclinic" trajectories,
because each of them, drawn on the phase cylinder,
joins the equilibrium $\chi$ to itself.

If $\varepsilon \neq 0$, we define a Poincaré section of the
perturbed system [5] in the following way: from an
initial point $w_0 = (\varphi_0, p_0, t_0)$, we consider the successive
planes perpendicular to the $t$-axis in the "extended"
phase space $\{(\varphi, p, t)\}$, defined by $t_0$, $t_1 = t_0 + T$,
$t_2 = t_0 + 2T$, $t_3 = t_0 + 3T$, ... and we look at the
successive intersections of the orbit of $w_0$ with
these planes: $w_0, w_1, w_2, \ldots$. If we identify all the
successive planes and draw the points $w_0, w_1, w_2, \ldots$
on the same picture, we obtain a
phase portrait in which the equilibria of the
unperturbed problem are present, but the separatrix
which "leaves" the point $\chi$ no longer coincides with
the separatrix which "ends" at $\chi$, as it did in the
unperturbed problem: the two invariant curves are
transversal to each other; they "split" and have an
infinite number of intersections. This splitting is the
manifestation of the nonintegrability of the perturbed
system [5].
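
The stroboscopic construction just described can be sketched numerically. The Python code below integrates the pendulum with periodically varying length over one period of $f$ and records the successive points $w_0, w_1, w_2, \ldots$. It assumes the Hamiltonian $H = p^2/2 - \omega^2 (1 - \varepsilon f(t)) \cos\varphi$, the hypothetical choice $f(t) = \cos(2\pi t / T)$, and a classical Runge-Kutta integrator, which is not symplectic; all names and default values are illustrative.

```python
import math


def stroboscopic_map(phi, p, t0, eps, omega=1.0, period=2 * math.pi, steps=2000):
    """Advance (phi, p) from time t0 to t0 + period with classical RK4."""

    def rhs(t, phi, p):
        f = math.cos(2 * math.pi * t / period)  # hypothetical periodic f
        # d(phi)/dt = dH/dp, dp/dt = -dH/d(phi)
        return p, -omega**2 * (1.0 - eps * f) * math.sin(phi)

    h = period / steps
    t = t0
    for _ in range(steps):
        k1 = rhs(t, phi, p)
        k2 = rhs(t + h / 2, phi + h / 2 * k1[0], p + h / 2 * k1[1])
        k3 = rhs(t + h / 2, phi + h / 2 * k2[0], p + h / 2 * k2[1])
        k4 = rhs(t + h, phi + h * k3[0], p + h * k3[1])
        phi += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        p += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        t += h
    return phi, p


def section_points(phi0, p0, t0, eps, count, omega=1.0, period=2 * math.pi):
    """Return the points w_0, ..., w_count as (phi mod 2*pi, p) pairs."""
    points = [(phi0 % (2 * math.pi), p0)]
    phi, p, t = phi0, p0, t0
    for _ in range(count):
        phi, p = stroboscopic_map(phi, p, t, eps, omega, period)
        t += period
        points.append((phi % (2 * math.pi), p))
    return points


def unperturbed_energy(phi, p, omega=1.0):
    """H_0 = p^2 / 2 - omega^2 cos(phi), conserved when eps = 0."""
    return 0.5 * p**2 - omega**2 * math.cos(phi)
```

For `eps = 0` the points returned by `section_points` stay on a level curve of `unperturbed_energy`; for small nonzero `eps` and initial points near the hyperbolic point, plotting many iterates is one way to look for the scattered pattern associated with split separatrices.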

A method to detect this splitting of separatrices
consists in computing the Melnikov function, which
gives a measure of the angle between the separatrices
at their first intersection when they split.

Many concrete Hamiltonian systems have been 
studied by this method and numerical investigations 
on the splitting have permitted detection of their 
nonintegrability. 


Topological Obstructions to Integrability 


We are interested in a natural mechanical system 
with two degrees of freedom and we suppose that 
the state space N is a real analytic surface which is 
compact and orientable. Then, N consists of a two- 
dimensional sphere with k handles (or a torus with k 
holes). The number k is a topological invariant of 
the surface and is called the genus of N. 

Let H be the Hamiltonian function associated to 
this problem. The Hamiltonian system possesses the 
first integral H. It is completely integrable if and 
only if another analytic integral F exists, function- 
ally independent of H. In this case, the state space N 
belongs necessarily to a very restrictive class of 
surfaces. 


Theorem (Kozlov 1983). If the genus $k$ of the
state manifold $N$ is not equal to 0 or 1 (i.e., if $N$ is
neither diffeomorphic to the sphere $S^2$ nor to the torus
$T^2$), then the Hamiltonian system generated by $H$ does
not possess a first integral, analytic on $T^*N$ and
functionally independent of the energy integral $H$.


Note that this theorem does not apply to first
integrals which are $C^\infty$ only, and examples can be
given which illustrate this case (Kozlov 1983, 1989).

For systems with more than two degrees of 
freedom, an open question is to know whether the 
complete integrability imposes restrictions to the 
topology of the state manifold N. 


Singular Point Analysis, Branching
of Solutions and Ziglin's Theory


If we look at the classical Hamiltonian problems 
which have been integrated, their first integrals are 
real functions which can be continued in the 
complex domain as one-valued holomorphic or 
meromorphic functions of the complex time t 
(polynomials, rational functions, etc.). This fact 
leads to the concept of “complex integrability.” 
But the nonintegrability of a complex Hamiltonian 
system does not imply the nonintegrability of its 
restriction to the real domain: it may happen that a 
real analytic first integral does not possess a 
continuation in the complex domain as a mero- 
morphic function. 

Adopting this point of view, S Kowalevskaya 
(Kowalevski 1889) studied the problem of a top 
rotating around a fixed point, and she discovered a 
new case of integrability for this classical problem of 
Hamiltonian mechanics. She searched for conditions 
on the parameters such that the movable singula- 
rities of the solutions in the complex plane of time 
are poles (as a definition, a singularity is movable if 
its location in the complex domain depends on the 
initial conditions). Such differential systems are said 
to be of “Painlevé’s type.” In this case, the solutions 
are single valued in the complex t-plane and there is 
no branching of these solutions. The leading idea is
the following: a first integral must be constant along
a solution, and a possible branching would change
its value along a loop around a singularity in the
complex $t$-plane. However, finite branchings of
solutions can be compatible with integrability.

The main tool in this analysis is the calculation of
the "Kowalevski exponents," which determine the
possible branching of a solution around a
singularity.

In spite of the efficiency of the Painlevé analysis for 
the search of integrable (or nonintegrable) systems, the 
relation between the analytic properties of their 
solutions (Painlevé) and their integrability in the 
sense of Liouville remains mysterious. The most 
fundamental result obtained in this field is a theorem 
of Adler and van Moerbeke which proves that, if a 
system has the Painlevé property and if it is integrable 
in the sense of Arnol’d—Liouville, then it is algebrai- 
cally integrable (Adler and van Moerbeke 1989). 

The discovery of S Kowalevskaya inspired 
Ziglin, who related the existence of meromorphic 
first integrals for a Hamiltonian system, to the 
properties of the linearized equations along a 
particular periodic solution of this system, espe- 
cially to the monodromy group associated to this 
linear system. Ziglin used the constraints imposed 


to this monodromy group by the existence of first 
integrals. 

Let us consider a Hamiltonian system defined on
a complex analytic symplectic manifold of dimension
$2n$, and suppose that there exists a family of
periodic solutions $\Gamma$. The linearized equations
deduced from the Hamiltonian system along $\Gamma$ are
decoupled into tangential and normal equations. We
are interested in the normal equations, which are
linear with periodic coefficients.


Ziglin's Theorem (Ziglin 1983). Assume that a
Hamiltonian system has a family of particular
solutions $\Gamma_h$ (which are not equilibria) parametrized
by periodic functions of the complex time and
depending analytically on a real parameter
$h \in (h_1, h_2)$. Let $G$ be the monodromy group of the
normal variational equation associated to the solution
$\Gamma_h$. A monodromy matrix $g \in G$ is said to be
nonresonant if every eigenvalue of $g$ is different
from a root of unity. If the Hamiltonian system has
a meromorphic integral $F$, functionally independent
of the Hamiltonian $H$ in a neighborhood of $\Gamma_h$, and
if the monodromy group $G$ contains a nonresonant
element $g_1$, then for any $g_2 \in G$, the commutator
$g^* = g_2^{-1} g_1^{-1} g_2 g_1$ satisfies either $g^* = \mathrm{Id}$ or
ic fee VO 
g im). 


As a corollary of this theorem, we have sufficient
conditions of nonintegrability: if the necessary
conditions of integrability of Ziglin are not satisfied by
a Hamiltonian system, it is not analytically integrable.
For instance, this will happen if we can find
two nonresonant monodromy matrices $g_1$ and $g_2$
which do not commute. If the periodic solution $\Gamma_h$
has two complex periods, the monodromy group $G$
has generators $g_1$ and $g_2$, respectively associated to
each of these periods, and their commutativity can
sometimes be studied.

These sufficient conditions of nonintegrability 
were studied for particular Hamiltonian systems, 
first by Ziglin himself. 

Several concrete systems with two degrees of free- 
dom were proved to be nonintegrable by Ito, Yoshida, 
Churchill, Rod, and many other mathematicians who 
applied this “Ziglin’s method": for instance, the 
Hénon-Heiles system, the Yang-Mills system, and 
Hamiltonian systems with a homogeneous potential. 


Nonintegrability and Differential Galois 
Theory (Morales-Ruiz 1999) 


Recently, the integrability of Hamiltonian systems
was studied with algebraic tools from differential
Galois theory, applied to linear differential
systems. As in Ziglin's theory, we consider a
particular solution $\Gamma$ (not necessarily periodic) of a
differential system generated by an analytic Hamiltonian
$H$ with $n$ degrees of freedom, and the (linear)
variational equations along $\Gamma$. The idea is that, if the
Hamiltonian system is integrable, we can assume
that the linearized equations along $\Gamma$ must also have
a "regular behavior." If the Hamiltonian system is
integrable, it will also be the case for the variational
equations.

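The normal variational system (of order $2n - 2$)
can be written as

\[ \frac{d\xi}{dt} = J \cdot K(\Gamma(t)) \cdot \xi \qquad [6] \]

with

\[ J = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix} \]

$K(\Gamma(t))$ is a matrix depending on the particular
solution $\Gamma$.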

We have to define the "Galois group" of the linear
equation [6]. Recall that in the classical Galois
theory of algebraic equations, the Galois group is
defined by the automorphisms which map roots
onto roots of the equation. In an analogous way, in
differential Galois theory, we consider the maps
which send a fundamental solution of eqn [6] to a
fundamental solution. In order to define the Galois
group $G$ associated to [6], we consider a differential
field $K$ of functions over $\mathbb{C}$ (i.e., a field of functions
equipped with a derivation). The field of constants
of $K$ is $\mathbb{C}$; it is the subfield of $K$ whose elements have
a derivative equal to zero. We denote by $K(\xi, \eta, \ldots)$
the differential field extension obtained from $K$ by
the adjunction of the functions $\xi, \eta, \ldots$. If $(\varphi, \psi)$ is a
fundamental system of solutions of eqn [6], then
$L = K(\varphi, \psi)$ is the smallest differential field extension
which contains all the solutions of [6]. The
field of constants of $L$ is the same as that of $K$,
that is, $\mathbb{C}$. By definition, $L$ is a Picard-Vessiot
extension of $K$.

The differential Galois group of $L$ is defined as
the group of the automorphisms $\gamma$ of $L$ (that map a
solution of [6] onto a solution) leaving the field of
constants fixed. Given a fundamental system of
solutions $(\varphi, \psi)$, we can associate to each automorphism
$\gamma$ the matrix $M$ such that $(\gamma(\varphi), \gamma(\psi)) = (\varphi, \psi) \cdot M$.
By definition, the set of these matrices $M$ is the
Galois group $G$ of eqn [6]. It is a linear algebraic
group (because, the matrices $M$ being symplectic,
their coefficients satisfy polynomial equations) and a
subgroup of the linear group of matrices GL(C).
We note that, for a given linear system, the monodromy
group is contained in the Galois group and
both are subgroups of the symplectic group Sp(C).


In the Galois group $G$ of eqn [6], we consider $G^0$,
the connected component of the identity. The integrability
of the initial Hamiltonian system is connected
to the integrability of the variational equation [6] and,
through it, to the properties of its Galois group:


Theorem of Morales and Ramis (Morales-Ruiz
1999). If an analytic Hamiltonian system is completely
integrable, then the Galois group associated
to the variational equation along a particular
solution $\Gamma$ is such that its connected component of
the identity $G^0$ is Abelian.


Thus, if a Hamiltonian system is such that $G^0$ is not
Abelian, there cannot exist a complete set of first
integrals in involution in a neighborhood of the
particular solution $\Gamma$ and the system is not integrable.

In the concrete applications of this theory, an
algorithm of Kovacic allows us to determine the
Galois group explicitly. By this method, several
Hamiltonian systems were proved to be nonintegrable:
for instance, systems of points on a line with a
potential in $1/r^2$, studied by Julliard-Tosel (1998),
but also earlier proofs of nonintegrability of homogeneous
potentials, which were improved by Yoshida
and Umeno, thanks to the theorem of Morales-Ramis.


See also: Billiards in Bounded Convex Domains; 
Infinite-Dimensional Hamiltonian Systems; Integrable 
Systems: Overview; Peakons; Separatrix Splitting.

The solar system has long appeared to astronomers
and mathematicians as a model of stability. On the 
other hand, statistical mechanics relies on the 
assumption that large assemblies of particles form 
highly unstable systems (at the microscopic scale). 
Yet all these physical situations are described, at 
least to a certain degree of approximation, by 
Hamiltonian systems. 

One may hope that Hamiltonian systems can be 
classified in two different categories, stable and 
unstable ones. However, the situation is much more 
complicated and both stable and unstable behaviors 
cohabit in typical systems. Even our examples are 
not perfect paradigms of stability and instability. 
Indeed, it is now clear from numerical as well as
theoretical points of view that some instability is
present over long timescales in the solar system, so
that, for example, future collisions between planets
cannot be completely ruled out in view of our
present understanding. On the other hand, unex- 
pected patterns of stability have been discovered in 
systems involving a large number of particles. 

Understanding the impact of stable and unstable 
effects in Hamiltonian systems has been considered 
ever since Poincaré as one of the most important 
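questions in dynamical systems. In this article, we
will discuss model Hamiltonian systems of the form

\[ H_\varepsilon(q, p) = h(p) + \varepsilon G_\varepsilon(q, p) \]

where $(q, p) \in T^d \times U$, with $U$ a bounded open
subset of $\mathbb{R}^d$. Recall that the equations of motion are

\[ \dot{q}(t) = \partial_p h(p) + \varepsilon \, \partial_p G_\varepsilon(q, p) \qquad [1] \]

\[ \dot{p}(t) = -\varepsilon \, \partial_q G_\varepsilon(q, p) \qquad [2] \]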


The textbook by Arnol'd (1964) is a good general
introduction to Hamiltonian systems. We will always
denote by $\omega(p)$ the frequency map $\partial_p h(p)$, which plays
a crucial role. Here, as is obvious in [2], the action
variables $p$ are preserved under the evolution in the
unperturbed case $\varepsilon = 0$. We will try to explain what is
known on the evolution of these action variables for
the perturbed system. As we will see, in many
situations, these variables are extremely stable. For
example, the KAM theorem implies that, for a positive
measure of initial conditions $(q_0, p_0)$, the trajectory
$(q(t), p(t))$ satisfies $\|p(t) - p(0)\| \le C\varepsilon$ for all times.
Examples show that some initial conditions may lead
to unstable trajectories, that is, trajectories such that
$\|p(t) - p(0)\| \ge 1/C$ for some $t$ (depending on $\varepsilon$) and
some fixed constant $C$ independent of $\varepsilon$. However, this
is, as we will see, possible only for very large time $t$
(meaning that $t$ as a function of $\varepsilon$ has to go to infinity
very quickly when $\varepsilon \to 0$). The main questions here
are to understand in what situations instability is or is
not possible, and what kind of evolution the
action variable $p$ can have. Another important question is to
estimate the speed (as a function of the parameter $\varepsilon$) of
the evolution of $p$.


A Convention 


We assume, unless otherwise stated, that the 
Hamiltonians are real analytic. The norm |H| of 
the Hamiltonian H is the uniform norm of its 
holomorphic extension to a certain complex strip. 
We do not specify the width of this strip. 
Whenever we consider a family $H_\varepsilon, F_\varepsilon, \ldots$ of
Hamiltonians, we mean that the norm $|H_\varepsilon|$ is
bounded when $\varepsilon \to 0$.


Averaging and Exponential Stability 


The first observation concerning the action variables
is that they should evolve at a speed of the order of
$\varepsilon$. However, averaging effects occur. More precisely,
in the equation $\dot{p}(t) = -\varepsilon \, \partial_q G_\varepsilon(q(t), p(t))$, the variable
$q(t)$ is moving fast compared to $p(t)$. If the evolution
of $q(t)$ nicely fills the torus $T^d$, it is tempting to
think that the averaged equation

\[ \dot{p}(t) = -\varepsilon V_\varepsilon(p(t)) \]

should approximate accurately the actual behavior
of $p(t)$, where

\[ V_\varepsilon(p) := \int_{T^d} \partial_q G_\varepsilon(q, p) \, dq \]

We have $V_\varepsilon = 0$, since the integral over the torus of the
derivative of a periodic function vanishes. This leads us to
think that the evolution should consist mainly of oscillations of
small amplitude with no large evolution. This
reasoning is limited by the presence of resonances.
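
The vanishing of the averaged vector field can be checked numerically in the simplest case $d = 1$. The Python sketch below averages $\partial_q G$ over one angle with a uniform Riemann sum; the perturbation `G`, its derivative `dG_dq`, and the sample count are hypothetical choices made only for this illustration.

```python
import math


def torus_average(func, samples=1000):
    """Average a 2*pi-periodic function of one angle q on a uniform grid."""
    return sum(func(2 * math.pi * j / samples) for j in range(samples)) / samples


def G(q, p):
    # Hypothetical perturbation, 2*pi-periodic in q.
    return p**2 * math.cos(q) + math.sin(2 * q)


def dG_dq(q, p):
    # Partial derivative of G with respect to q, computed by hand.
    return -p**2 * math.sin(q) + 2 * math.cos(2 * q)


# Average of dG/dq over the angle at a fixed action p = 0.7.
V = torus_average(lambda q: dG_dq(q, 0.7))
```

The value `V` is zero up to rounding for any action `p`, which matches the observation that the averaged equation produces no drift.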


Frequencies 


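A frequency $\omega \in \mathbb{R}^d$ is said to be resonant if there
exists $k \in \mathbb{Z}^d_*$ ($= \mathbb{Z}^d - \{0\}$) such that $\langle k, \omega \rangle = 0$. The
resonance module of $\omega$,

\[ Z(\omega) = \{ k \in \mathbb{Z}^d : \langle k, \omega \rangle = 0 \} \]

is a subgroup of $\mathbb{Z}^d$; we denote by $R(\omega)$ the vector
space generated by $Z(\omega)$ in $\mathbb{R}^d$. The order of
resonance $r(\omega)$ is the dimension of $R(\omega)$. The main
examples of resonances of order $r$ are the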


632 Hamiltonian Systems: Stability and Instability Theory 


frequencies $\omega = (\omega_1, 0)$, where $\omega_1 \in \mathbb{R}^{d-r}$ is nonresonant.
This example is universal. Indeed, if $\omega$ is a
resonant frequency, then there exists a matrix
$A \in GL_d(\mathbb{Z})$ such that $A\omega = (\omega_1, 0)$, where $\omega_1 \in \mathbb{R}^{d-r}$
is not resonant. The matrix $A$ can be seen as a
diffeomorphism of $T^d$, which transports the constant
vector field $\omega$ to the constant vector field $A\omega = (\omega_1, 0)$.
It is useful to distinguish, among nonresonant
frequencies, some which are sufficiently nonresonant.
A frequency $\omega \in \mathbb{R}^d$ is called Diophantine if there exist
real constants $\gamma > 0$ and $\tau > d$ such that

\[ |\langle \omega, k \rangle| \ge \gamma |k|^{-\tau} \]

for each $k \in \mathbb{Z}^d - \{0\}$. Finally, a frequency is called
resonant Diophantine if there exists a matrix
$A \in GL_d(\mathbb{Z})$ such that $\omega = A(\omega_1, 0)$, where $\omega_1 \in \mathbb{R}^{d-r}$ is
a Diophantine frequency.
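
A Diophantine condition cannot be verified by a finite computation, but the small divisors $|\langle \omega, k \rangle|$ can be inspected over a bounded range of $k$. The Python sketch below returns the minimum of $|\langle \omega, k \rangle| \, |k|^{\tau}$ over nonzero integer vectors with $|k| \le$ `max_norm`. It assumes the norm $|k| = \sum_i |k_i|$ (another choice of norm only changes the constant $\gamma$), and the function name and parameters are hypothetical.

```python
from itertools import product


def smallest_scaled_divisor(omega, tau, max_norm=20):
    """Minimum of |<omega, k>| * |k|**tau over nonzero integer k, |k| <= max_norm.

    Here |k| is the sum of the absolute values of the components. A result
    bounded below by some gamma > 0 is consistent with the Diophantine
    inequality on the searched range only; a result of 0 exhibits a resonance.
    """
    d = len(omega)
    best = float("inf")
    for k in product(range(-max_norm, max_norm + 1), repeat=d):
        norm = sum(abs(ki) for ki in k)
        if norm == 0 or norm > max_norm:
            continue
        divisor = abs(sum(ki * wi for ki, wi in zip(k, omega)))
        best = min(best, divisor * norm**tau)
    return best


golden = (1 + 5 ** 0.5) / 2
margin_golden = smallest_scaled_divisor((1.0, golden), tau=2)
margin_resonant = smallest_scaled_divisor((1.0, 2.0), tau=2)
```

For the golden-mean frequency the minimum over this range stays well away from zero, whereas the frequency $(1, 2)$ gives exactly zero because $k = (-2, 1)$ is a resonance.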


Symplectic Diffeomorphisms and Normal Forms 


An efficient mathematical method to take averaging 
effects into account is the use of normal forms. 
Normal form theory consists in finding new coordi- 
nates in which the fast angles have been eliminated 
from the equations up to a small remainder. This is 
done exploiting the existence of a large group of 
diffeomorphisms preserving the Hamiltonian struc- 
ture of equations, called symplectic diffeomorphisms 
or canonical transformations. We refer the reader to
standard textbooks for these notions, for example to
Arnol'd (1964). An important point is that a
symplectic diffeomorphism $\phi$ sends the trajectories
of the Hamiltonian $H \circ \phi$ to the trajectories of the
Hamiltonian $H$. A Hamiltonian $N(q, p)$ is said to be in
$R$-normal form, where $R$ is a linear subspace of $\mathbb{R}^d$, if
$\partial_q N \in R$ for each $(q, p)$. Let us give an illustrative
result, taken from Lochak et al. (2003). Note that this
result is not sufficient to obtain uniform stability
estimates, as in the Nekhoroshev theorem below. More
precise normal form results are given in Nekhoroshev
(1977) and Pöschel (1993).


Normal Form Theorem 


Let $\omega_0 = \omega(p_0)$ be a given Diophantine or resonant-Diophantine
frequency. Let us denote by $B_r(p_0)$ the
open ball of radius $r$ in $\mathbb{R}^d$ centered at $p_0$. There
exists a constant $a$ which depends only on $\omega_0$, and
constants $\varepsilon_0 > 0$ and $C > 0$ such that the following
holds: for each $\varepsilon < \varepsilon_0$, there exists an analytic
symplectic embedding $\phi_\varepsilon : T^d \times B_{r(\varepsilon)}(p_0) \to T^d \times U$,
which is $\varepsilon$-close to identity and such that

\[ H_\varepsilon \circ \phi_\varepsilon(q, p) = h(p) + \varepsilon N_\varepsilon(q, p) + \mu(\varepsilon) F_\varepsilon(q, p) \]

where $N_\varepsilon$ is in $R(\omega_0)$-normal form, $r(\varepsilon) \ge \sqrt{\varepsilon}$, and
$\mu(\varepsilon) \le e^{-C \varepsilon^{-a}}$.


This means that the motions with resonant initial
conditions are confined, up to small oscillations, in
the associated affine plane $p(0) + R(\omega(p(0)))$ until
they leave the domain of the normal form, or until
time $\mu^{-1}(\varepsilon)$.


Geometry of Resonances 


In view of the normal form theorem, we are led to
consider the curves $P(\theta): \mathbb{R} \to \mathbb{R}^d$ which satisfy


$P(\theta') - P(\theta) \in R(\omega(P(\theta)))$


for each $\theta$ and $\theta'$. Indeed, it appears that these curves
are the ones the action variables can follow on
timescales not involving the remainders of the
normal forms. Note that here the parameter $\theta$ is
not the physical time. Assuming that $P(\theta)$ is such a
curve, we can define the affine space


$\mathcal{R} := P(0) + \bigcap_{\theta \in \mathbb{R}} R(\omega(P(\theta)))$


We then have $P(\theta) \in \mathcal{R}$ for each $\theta$. In addition, each
point $P(\theta)$, $\theta \in \mathbb{R}$, is a critical point of the restriction
$h|_{\mathcal{R}}$ of the unperturbed Hamiltonian $h$ to the affine
space $\mathcal{R}$. It follows that the curve $P(\theta)$ has to be
constant if the unperturbed Hamiltonian satisfies the 
following hypothesis. 


Nekhoroshev Steepness 


We say that the unperturbed Hamiltonian $h$ is steep
if, for each affine subspace $A$ in $\mathbb{R}^d$, the restriction
$h|_A$ has only isolated critical points.


This formulation, due to Niederman, is much 
simpler than the equivalent one first given by 
Nekhoroshev. It turns out that this condition, 
which was made natural by our heuristic explana- 
tion, implies stability over exponential timescales for 
all initial conditions (see Nekhoroshev (1977)). We 
first need another condition. 


Kolmogorov Nondegeneracy 


We say that the unperturbed Hamiltonian $h$ is
nondegenerate in the sense of Kolmogorov if it has
nondegenerate Hessian at each point, or equivalently
if the frequency map $p \mapsto \omega(p)$ is an immersion.


Nekhoroshev Stability Theorem 


Assume that the unperturbed Hamiltonian $h$ does not
have critical points ($\omega(p)$ does not vanish), and satisfies
the Nekhoroshev steepness and Kolmogorov nondegeneracy
conditions. Then there exist constants $a > 0$
and $b > 0$, which depend only on $h$, and constants
$\epsilon_0 > 0$ and $C > 0$ such that the following holds: for
$\epsilon < \epsilon_0$, each trajectory $(q_\epsilon(t), p_\epsilon(t))$ satisfies the
estimate


$\|p_\epsilon(t) - p_\epsilon(0)\| \leq C\epsilon^b$


for all $t$ such that $|t| \leq e^{C\epsilon^{-a}}$.


Herman's Example 


In order to illustrate the necessity of the condition of
steepness, let us consider the Hamiltonian


$H_\epsilon(q_1, q_2, p_1, p_2) = p_1 p_2 + \epsilon V(q_1)$


with $V : \mathbb{T} \to \mathbb{R}$. The associated equations are


$\dot{p}_1 = -\epsilon V'(q_1), \quad \dot{p}_2 = 0, \quad \dot{q}_1 = p_2, \quad \dot{q}_2 = p_1$


The trajectories whose initial conditions satisfy
$p_2(0) = 0$ and $V'(q_1(0)) \neq 0$ satisfy


$p_1(t) = p_1(0) - t\epsilon V'(q_1(0)), \quad p_2(t) = 0, \quad q_1(t) = q_1(0)$


We see an evolution at speed $\epsilon$ of the action variable
$p_1$, contradicting the conclusion of the Nekhoroshev
theorem. In this example, we have $R(\omega(p(t))) =
\mathbb{R} \times \{0\}$, and $h|_{\mathbb{R} \times \{0\}} = 0$, so that the curve


$P(\theta) = (\theta, 0)$


is indeed a curve of critical points of $h|_{\mathbb{R} \times \{0\}}$.
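
The linear drift of $p_1$ can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes the particular potential $V(q) = \cos 2\pi q$ (the text only requires $V : \mathbb{T} \to \mathbb{R}$), and the function name `herman_flow` and the Runge-Kutta integrator are illustrative choices.

```python
import math

def herman_flow(q1, q2, p1, p2, eps, t_end, steps=10000):
    """Integrate H = p1*p2 + eps*V(q1) with the assumed V(q) = cos(2*pi*q), using RK4."""
    def dV(q):
        # derivative of the assumed potential V(q) = cos(2*pi*q)
        return -2.0 * math.pi * math.sin(2.0 * math.pi * q)

    def rhs(state):
        x1, x2, y1, y2 = state  # q1, q2, p1, p2
        # Hamilton's equations: q1' = p2, q2' = p1, p1' = -eps*V'(q1), p2' = 0
        return (y2, y1, -eps * dV(x1), 0.0)

    state = (q1, q2, p1, p2)
    h = t_end / steps
    for _ in range(steps):
        k1 = rhs(state)
        k2 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
        k3 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
        k4 = rhs(tuple(s + h * k for s, k in zip(state, k3)))
        state = tuple(
            s + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
    return state

# Initial condition with p2(0) = 0 and V'(q1(0)) = -2*pi != 0
eps, t_end = 0.01, 10.0
q1_t, q2_t, p1_t, p2_t = herman_flow(0.25, 0.0, 1.0, 0.0, eps, t_end)
predicted_p1 = 1.0 - t_end * eps * (-2.0 * math.pi)  # p1(0) - t*eps*V'(q1(0))
```

Under these assumptions `p1_t` should agree with `predicted_p1` up to integration error, while `q1_t` and `p2_t` stay at their initial values, which is the drift at speed $\epsilon$ described above.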


Genericity of Steepness 


The condition of steepness is frequently satisfied. In
order to be more precise, we mention that, for $N \in \mathbb{N}$
large enough (how large depends on the dimension $d$),
steepness is a generic condition in the finite-dimensional
space of polynomials of degree less than $N$.
Note in contrast that a quadratic Hamiltonian is steep
if and only if it is positive definite. Finally, it is
important to mention that convex Hamiltonians $h$
with positive-definite Hessian are steep. More generally,
quasiconvex Hamiltonians are steep. A function
$h : U \to \mathbb{R}$ is said to be quasiconvex if, at each point,
the restriction of its Hessian to the kernel of its
differential is positive definite.


The Quasiconvex Case 


It is interesting to be more precise about the values
of $a$ and $b$ in the Nekhoroshev theorem. We shall do so
in the quasiconvex case, which is the most stable
case, and where much more is known. If $h$ is
quasiconvex, one can take


$a = b = \frac{1}{2d}$


as was proved by Lochak (1992). It is a question of
active present research whether these exponents are
optimal. It now appears that this is almost so, and
that the optimal exponent $a$ should not be larger
than $1/(2(d - 3))$. That this exponent deteriorates as
the dimension increases is of course very natural in
the perspective of statistical mechanics. As a matter
of fact, not only the exponent $a$ but also the
threshold $\epsilon_0$ of validity of the Nekhoroshev theorem
deteriorates with the dimension, as was noticed in
Bourgain and Kaloshin.

Another important fact was proved in Lochak 
(1992): in these expressions, the important value of 
d is not the total number of degrees of freedom, but 
the number of active degrees of freedom. More 
precisely, resonant initial conditions are more stable 
than generic ones. If $r$ is the order of resonance of a
given initial condition, then the number $d - r$ of fast
angles can be substituted for the total number of
degrees of freedom in the computation of the
stability exponent. This phenomenon may account
for the surprising stability obtained numerically by 
Fermi, Pasta, and Ulam. 


Permanent Stability 


Many initial conditions satisfy more than exponen- 
tial stability: they are permanently stable. 


Kolmogorov Theorem 


Assume that $h$ satisfies the Kolmogorov nondegeneracy
condition (“Kolmogorov nondegeneracy”). Then for
each open subset $V \subset \mathbb{R}^d$ such that $\bar{V} \subset U$, there
exists $\epsilon_0 > 0$ such that, for each $\epsilon < \epsilon_0$, there exist


• a smooth symplectic embedding $\phi_\epsilon : \mathbb{T}^d \times V \to
\mathbb{T}^d \times U$, which is $\epsilon$-close to the identity,

• a compact subset $F_\epsilon$ of $V$, whose relative measure
in $V$ converges to 1 as $\epsilon \to 0$,


such that the Hamiltonian system $H_\epsilon \circ \phi_\epsilon$ preserves
the torus $\mathbb{T}^d \times \{p\}$ for each $p \in F_\epsilon$.


The union


$\mathcal{F}_\epsilon = \phi_\epsilon(\mathbb{T}^d \times F_\epsilon)$


of all the invariant tori has positive measure. Its
complement is usually an open dense subset of
$\mathbb{T}^d \times U$. All the orbits starting in this invariant set
obviously undergo oscillations of amplitude of the
order of $\epsilon$ for all times. It is worth mentioning that
some energy surfaces may not intersect the invariant
set $\mathcal{F}_\epsilon$. This is illustrated by “Herman’s
example,” where the surface of zero energy does not
contain invariant tori. The following condition
guarantees the existence of invariant tori on each
energy surface.


Arnol’d Nondegeneracy


The Hamiltonian $h$ is said to be nondegenerate in
the sense of Arnol'd if it does not have critical points
and if the map


$p \mapsto \frac{\omega(p)}{\|\omega(p)\|}$


is a local diffeomorphism between each level set of $h$
and $S^{d-1}$. This is equivalent to saying that the function
$(\lambda, p) \in \mathbb{R} \times U \mapsto \lambda h(p)$ has nondegenerate Hessian
at each point of the form $(1, p)$.
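
As a consistency check on these two nondegeneracy notions, one can evaluate the relevant Hessians for the unperturbed part $h(p) = p_1 p_2$ of Herman's example. The sketch below is only an illustration under that choice of $h$; the helper names `det2`, `det3`, `kolmogorov_hessian` and `arnold_hessian` are hypothetical.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def kolmogorov_hessian(p1, p2):
    """Hessian of h(p) = p1*p2 with respect to p (constant for this h)."""
    return [[0.0, 1.0],
            [1.0, 0.0]]

def arnold_hessian(p1, p2):
    """Hessian of (lam, p) -> lam*h(p) at lam = 1, for h(p) = p1*p2.

    The mixed entries are the components of grad h = (p2, p1).
    """
    return [[0.0, p2, p1],
            [p2, 0.0, 1.0],
            [p1, 1.0, 0.0]]

# Kolmogorov determinant is -1 everywhere; Arnol'd determinant equals 2*p1*p2 = 2*h(p)
on_zero_energy = det3(arnold_hessian(0.7, 0.0))
off_zero_energy = det3(arnold_hessian(0.7, 0.5))
```

For this $h$ the Kolmogorov Hessian is nondegenerate at every point, while the Arnol'd Hessian has determinant $2 p_1 p_2 = 2h(p)$ and therefore degenerates exactly on the zero energy surface, in line with the remark that this surface contains no invariant tori.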


Arnol’d Theorem 


If $h$ satisfies the Arnol’d nondegeneracy condition, then
the relative measure of the set $\mathcal{F}_\epsilon$ of invariant tori
converges to 1 in each energy surface.


This theorem prevents ergodicity of the perturbed 
systems for the canonical invariant measure on its 
energy surface. This may be considered as a very 
disappointing result for statistical mechanics, whose 
mathematical foundation has often been considered 
to be the Boltzmann hypothesis of ergodicity. 
However, statistical mechanics is first of all a 
question of letting d go to infinity, and ergodicity 
might not be such a crucial hypothesis (see 
Khinchin). 

When d= 2, the Arnol'd theorem has particularly 
strong consequences. Indeed, in this case, the 
invariant tori cut the energy surfaces in small 
connected components. The motion is then confined 
in these connected components. As a consequence, 
we obtain permanent stability for all initial 
conditions. 

In higher dimensions, however, the complement of
$\mathcal{F}_\epsilon$ in each energy shell is usually a dense, connected
open set. There may exist orbits wandering in this
large connected set, although the speed of evolution 
of these orbits is limited by Nekhoroshev theory. 
Understanding the dynamics in this open set is a 
very important and difficult question. It is the 
subject of the next section. 


Relaxed Assumption 


For many applications, such as celestial mechanics,
the nondegeneracy conditions of Arnol’d or Kolmogorov
are not satisfied, or are difficult to check.
However, the existence of invariant tori has been
proved under much milder assumptions. As a rule,
invariant tori exist in the perturbed systems if the
frequency map $p \mapsto \omega(p)$ stably contains Diophantine
vectors in its image.


The Mechanism of Arnol’d 


Understanding instability is the subject of intense 
present research. General methods of construction 
of interesting orbits as well as clever classes of 
examples are being developed. These methods are 
exploring the limits of stability theory. Here we shall 
only describe the fundamental ideas of Arnol’d (see 
Arnol’d 1964), where most of the present activity 
finds its roots. Although these ideas have some 
ambition of universality, they are best presented, 
as in Arnol’d (1964), on an example. We consider
the quasiconvex Hamiltonian


$H(q_1, q_2, q_3, p_1, p_2, p_3)
= (p_1^2 + p_2^2)/2 - p_3 + \epsilon \cos 2\pi q_2
+ \mu (\cos 2\pi q_2)(\cos 2\pi q_1 + \cos 2\pi q_3)$


As we have seen, this system is typical of the kind of
Hamiltonians one gets after reduction to resonant
normal form. However, it is illuminating to consider
$\mu$ not as a function of $\epsilon$ but as an independent
parameter. This is an idea of Poincaré then followed
by Arnol'd. We shall present the main steps of the
proof of the following result.


Theorem


Let us fix numbers $0 < A < B$. For each $\epsilon > 0$, there
exists a number $\mu_0(\epsilon)$ such that, when $0 < \mu < \mu_0(\epsilon)$,
there exists a trajectory


$(q_1(t), q_2(t), p_1(t), p_2(t))$


and a time $T > 0$ (which depends on $\epsilon$ and $\mu$) such
that


$p_1(0) \leq A, \quad p_1(T) \geq B$


The Truncated System 


Let us begin with some remarks about the truncated
Hamiltonian obtained when $\mu = 0$:


$H_0(q, p) = H_1(q_1, q_3, p_1, p_3) + H_2(q_2, p_2)
= p_1^2/2 - p_3 + p_2^2/2 + \epsilon \cos 2\pi q_2$


This system is the uncoupled product of $H_1$ and of
the pendulum described by $H_2$. The variable $p_1$ is
constant along motion; hence, the theorem cannot
hold for $\mu = 0$.

Recall that the point $q_2 = 0$, $p_2 = 0$ is a hyperbolic
fixed point of the pendulum $H_2(q_2, p_2) = p_2^2/2 +
\epsilon \cos 2\pi q_2$. The stable and unstable manifolds of this
integrable system coincide; they form the energy
level $H_2 = \epsilon$. As a consequence, in the product
system of Hamiltonian $H_0 = H_1 + H_2$, there exists,
in the zero energy level, a one-parameter family $T_\omega$
of invariant tori of dimension 2:


$T_\omega = \{p_1 = \omega, \; p_3 = \omega^2/2 + \epsilon, \; q_2 = 0, \; p_2 = 0\}
\subset \mathbb{T}^3 \times \mathbb{R}^3$


Each of these tori is hyperbolic in the sense that it has a
stable manifold of dimension 3 and an unstable
manifold of dimension 3, which are nothing but the
liftings of the stable and unstable manifolds of the
hyperbolic fixed point of $H_2$. Notice that these
manifolds do not intersect transversally along $T_\omega$.

When $\mu \neq 0$, the perturbation is chosen in such a
way that the tori $T_\omega$ are left invariant by the
Hamiltonian flow.
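
The hyperbolicity of the pendulum fixed point can be made concrete by linearizing the flow of $H_2(q_2, p_2) = p_2^2/2 + \epsilon \cos 2\pi q_2$. The following is a small sketch under that formula; the function name `pendulum_eigenvalues` is an illustrative assumption.

```python
import math
import cmath

def pendulum_eigenvalues(q_star, eps):
    """Eigenvalues of the flow of H2 = p^2/2 + eps*cos(2*pi*q), linearized at (q_star, 0).

    Hamilton's equations are q' = p, p' = 2*pi*eps*sin(2*pi*q), so the
    linearized matrix is [[0, 1], [k, 0]] with k = 4*pi^2*eps*cos(2*pi*q_star),
    whose eigenvalues are +sqrt(k) and -sqrt(k).
    """
    k = 4.0 * math.pi ** 2 * eps * math.cos(2.0 * math.pi * q_star)
    root = cmath.sqrt(k)
    return root, -root

eps = 0.01
hyperbolic = pendulum_eigenvalues(0.0, eps)  # real pair +-2*pi*sqrt(eps)
elliptic = pendulum_eigenvalues(0.5, eps)    # purely imaginary pair
energy_at_fixed_point = 0.5 * 0.0 ** 2 + eps * math.cos(0.0)  # H2(0, 0) = eps
```

At $q_2 = 0$ the eigenvalues are the real pair $\pm 2\pi\sqrt{\epsilon}$, so the point is hyperbolic and its energy is $H_2 = \epsilon$, the level carrying the separatrix; at $q_2 = 1/2$ the pair is purely imaginary.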


Splitting 


For $0 < \mu < \mu_0(\epsilon)$, the invariant tori $T_\omega$ still have
stable and unstable manifolds of dimension 3. These
stable and unstable manifolds intersect transversally
in the energy surface, along an orbit which is
homoclinic to the torus.


The first point is that the tori remain hyperbolic,
and that the stable and unstable manifolds are
deformed, but not destroyed by the additional
term. This results from the observation that the
manifold $M$ formed by the union of the invariant
tori is normally hyperbolic in its energy surface.
Note that this step does not require exponential
smallness of $\mu$.

It is then a very general result that the stable and
unstable manifolds have nonempty intersection. It is
a global property, which can be established by
variational methods, and which still does not rest on
exponential smallness of $\mu$.

The key point, where exponential smallness is
required, is transversality. Since transversality is a
generic phenomenon, one may think that this step is
not so crucial. And indeed, it is very likely that the
statement remains true for most values of $\mu \in \,]0, \epsilon[$
(and not only for $\mu \leq \mu_0(\epsilon)$). However, there are two
important issues here. First, transversality is difficult
to establish on explicit examples. Second, it is useful
for many further discussions to obtain some quantitative
estimates.

Indeed, we can associate to the intersection
between the stable and unstable manifolds a
quantity, the splitting, which in a sense measures
transversality. Discussions on such a definition are
available in Lochak et al. (2003). Using methods of
Poincaré and Melnikov, Arnol'd showed that this
splitting can be estimated, for sufficiently small $\epsilon$, by


$\alpha \geq \mu e^{-C/\sqrt{\epsilon}} + O(\mu^2)$ [3]


This implies non-nullity of the splitting, hence
transversality, for small $\mu$.


Transition Chain 


We have established the existence, when $\mu > 0$ is
small enough, of a family $T_\omega$ of hyperbolic invariant
tori such that the stable manifold $W^s_\omega$ and the
unstable manifold $W^u_\omega$ intersect transversally along a
homoclinic orbit (but not along $T_\omega$!) for each $\omega$.
A stability argument shows that the stable manifold
$W^s_\omega$ of the torus $T_\omega$ intersects transversally the
unstable manifold $W^u_{\omega_0}$ of the torus $T_{\omega_0}$, when $\omega$ is close
enough to $\omega_0$. How close directly depends on the
size of the splitting. We obtain heteroclinic orbits
between tori close to each other.

Given two values $\omega$ and $\omega'$, we can find a sequence
$\omega_i$, $0 \leq i \leq N$, such that $\omega_0 = \omega$, $\omega_N = \omega'$, and $W^u_{\omega_i}$
intersects transversally $W^s_{\omega_{i+1}}$ for all $i$. The associated
family $T_{\omega_i}$ of tori is called a transition chain.

The remaining step consists in proving that some orbits
shadow the transition chain. Arnol’d solved this step
by a very simple topological argument which,
however, does not provide any estimate on the
time $T$. He proves the existence of an orbit joining
any neighborhood of $T_\omega$ to any neighborhood of $T_{\omega'}$.
This ends the proof of the main theorem, since we
can choose $\omega$ and $\omega'$ such that $\omega < A < B < \omega'$.

The dynamics associated to hyperbolic tori and
transition chains have later been studied more
carefully. In particular, a $\lambda$-lemma can be proved in
this context, which allows us to conclude that, in a
transition chain, the unstable manifold $W^u_{\omega_0}$ of the
first torus intersects transversally the stable manifold
$W^s_{\omega_N}$ of the last torus. These detailed studies also
allow us to relate the speed of diffusion to the
splitting of the invariant manifolds.


Diffusion Speed 


It is interesting to estimate the speed of evolution of
the variable $p_1$, or in other words the time $T$ in the
statement. It follows from Nekhoroshev theory that
this time $T$ has to be exponentially large as a
function of $\epsilon$. In fact, it is possible to prove, either by
recent developments on the ideas of Arnol'd exposed
above, or more easily by variational methods (Bessi
1996), that


$T \leq \frac{e^{C/\sqrt{\epsilon}}}{-\mu \log \mu}$


for $\mu \leq \mu_0(\epsilon)$. This time is of course closely related to
the estimate [3] of the splitting. In addition, Ugo Bessi
proved that one can take $\mu_0(\epsilon) = e^{-C/\sqrt{\epsilon}}$. Plugging this
value of $\mu$ in the estimate of $T$, we get the estimate
$T \leq e^{C/\sqrt{\epsilon}}$ as a function of the only parameter $\epsilon$.
Considering the fact that the orbit we have described
goes close to double resonances, this is the best
estimate one may hope for in view of the improved
Nekhoroshev stability estimates at resonances.
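
The substitution step can be written out explicitly. The lines below are a sketch that assumes the two estimates read $T \leq e^{C/\sqrt{\epsilon}}/(-\mu \log \mu)$ and $\mu_0(\epsilon) = e^{-C/\sqrt{\epsilon}}$, and that the constant $C$ is allowed to change from line to line, as is usual in such statements; $C'$ is a name introduced here for the adjusted constant.

```latex
\begin{align*}
\mu = e^{-C/\sqrt{\epsilon}}
  &\;\Longrightarrow\; -\mu \log \mu = \frac{C}{\sqrt{\epsilon}}\, e^{-C/\sqrt{\epsilon}}, \\
T \leq \frac{e^{C/\sqrt{\epsilon}}}{-\mu \log \mu}
  &= \frac{\sqrt{\epsilon}}{C}\, e^{2C/\sqrt{\epsilon}}
  \;\leq\; e^{C'/\sqrt{\epsilon}}
  \quad \text{for any } C' \geq 2C \text{ and } \epsilon \leq C^2 .
\end{align*}
```

The prefactor $\sqrt{\epsilon}/C$ is at most 1 for $\epsilon \leq C^2$, so only the constant in the exponent changes.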

The idea is now widespread that the time of
diffusion is exponentially large. However, we point
out that, while the diffusion speed is indeed exponentially
small as a function of the parameter $\epsilon$, it is only polynomially
small as a function of the second parameter $\mu$, as was
first understood by P Lochak and proved in Bernard
(1996) using the variational method of U Bessi.


Conclusion 


The theories of instability are developing in several 
directions. One of them is to try to understand the 
limits of stability, and to test to what extent the 
stability results obtained so far are optimal. This
aspect has developed quickly in recent years; for example,
the optimal stability exponent $a$ for convex systems
is almost known. Another direction is to try to give
a description of unstable orbits in typical systems. 
This remains a widely open question. 

Let us finally mention that the application of the
theories we have presented to concrete systems is
very difficult. One of the reasons is that the
estimates of the threshold $\epsilon_0$ of validity of the
Nekhoroshev and KAM theorems that can (painfully) be
obtained by inspection of the proofs are very bad,
much too bad, for example, to think about
applications to the solar system with the physical
values of the parameters.

See also: Averaging Methods; Hyperbolic Billiards; KAM
Theory and Celestial Mechanics; Separatrix Splitting;
Stability Problems in Celestial Mechanics; Stability
Theory and KAM; Weakly Coupled Oscillators.


Hamilton-Jacobi Equations and Dynamical Systems: Variational Aspects


Overview


Given a continuous Hamiltonian $H(x,p)$ defined on
the cotangent bundle of a compact boundaryless
manifold, where $x$ and $p$ are the state and the
momentum variable, respectively, and satisfying
suitable convexity and coercivity assumptions, we
consider the family of Hamilton-Jacobi equations


$H(x, Du) = a$ [1]


with $a$ a real parameter. If, in addition, $H$ is
assumed to be smooth, we also consider
Hamilton’s equations


$\dot{\xi} = H_p(\xi, \eta), \quad \dot{\eta} = -H_x(\xi, \eta)$ [2]


whose analysis is related to the variational problem
of minimizing the action functional


$\int_I L(\xi, \dot{\xi}) \, dt$ [3]


among all Lipschitz-continuous or, equivalently,
continuous piecewise $C^1$ curves defined on $I$ with
fixed end points. Here $I$ is a compact interval and $L$,
the Lagrangian, is the Fenchel transform of $H$. A
"conjugate" flow, named after Euler-Lagrange, is
also defined on the tangent bundle of the underlying
manifold.

A connection between [1] and [2] is provided by 
the classical Hamilton—Jacobi method, which shows 
that the graph of the differential of any regular, say
$C^1$, global solution to [1] is an invariant subset for
the Hamiltonian flow. The drawback of this 
approach is that such regular solutions do not exist 
in general, even for very regular Hamiltonians. 

However, for any continuous Hamiltonian a 
distinguished value of the parameter a can be 
detected, denoted by c and qualified, from now on, 
as critical, for which there are a.e. subsolutions of 
the corresponding Hamilton-Jacobi equations 
enjoying some extremality properties. Note that 
such functions can be equivalently defined as weak 
solutions, in the viscosity sense, of [1] with $a = c$, or
as fixed points of the associated Lax-Oleinik 
semigroup (see Fathi (to appear)). We do not give 
these interpretations here to avoid any technicalities. 

Even if they are just Lipschitz-continuous on the
whole underlying manifold, these extremal subsolutions
become of class $C^1$ when restricted to a
special compact subset, the same for any of them,
say $\mathcal{A}$, and the corresponding differentials coincide
on $\mathcal{A}$. More generally, all critical subsolutions, that
is, the a.e. subsolutions to [1] with $a = c$, are
continuously differentiable on $\mathcal{A}$. This regularity
property holds if $H$ is at least locally Lipschitz-continuous
in both variables. When, in addition, the
Hamiltonian is smooth, so that the Hamiltonian
flow is defined, the graph of this common differential
defined in $\mathcal{A}$, denoted by $\tilde{\mathcal{A}}$, is an invariant set
for the flow, and is foliated by integral curves of [2]
possessing some global minimizing properties with
respect to the action functional.

The aim of this presentation is to give an 
explanation of the previously described phenomena 
occurring at the critical level, and of some related 
facts, using tools and arguments as simply as 
possible. We propose a metric approach to the
subject and consider as central in our analysis a
family of distances, denoted by $S_a$, for any $a \geq c$. We
emphasize that such distances can be defined for
merely continuous Hamiltonians, and the qualitative
analysis of the critical subsolutions has an interest
independent of the dynamical applications.
Indeed, it can be used in other contexts such as
homogenization problems, and the large-time behavior
of the viscosity solutions to the time-dependent
equation $u_t + H(x, Du) = 0$.

The discovery of the critical value has a history 
that reflects the dual character of the topics, which 
has a dynamical as well as a partial differential 
equation (PDE) interest. 


It was probably Ricardo Mañé who first focused his
attention on it, at the beginning of the 1980s, in
connection with the analysis of integral curves of the
Euler-Lagrange flow with some global minimizing
properties. The set, previously denoted by $\tilde{\mathcal{A}}$, was
found and analyzed by Serge Aubry, in a purely
dynamical way, as the union of the supports of such
minimizing curves. On the other hand, John Mather
(1986) independently defined, in a more general
framework, a set, contained in the Aubry set, through
a weak approach that utilizes minimal probability
measures invariant with respect to the Euler-Lagrange
flow. The Mather set is actually the closure of the
union of the supports of such measures. We will follow
the approach of Aubry (see Fathi (2005b)), and will
not introduce Mather’s measures.

In the viscosity solution theory, the critical value 
has instead been introduced in a famous unpub- 
lished paper of P L Lions, S R S Varadhan, and 
G Papanicolaou (1987), in connection with some 
periodic homogenization problems for Hamilton- 
Jacobi equations. It is worth noticing that they
consider continuous Hamiltonians, defined on the
flat $N$-dimensional torus, without any convexity
assumption.

They define the critical value, and show the
existence of viscosity solutions to the critical equation
by means of an ergodic approximation, that is, by
considering the equation $\epsilon u + H(x, Du) = 0$ and then
passing to the limit for $\epsilon \to 0$. The critical viscosity
solutions are used as correctors in the homogenization.
They do not perform any qualitative analysis.
Whether such an analysis can be done, and whether something
similar to the Aubry-Mather sets exists for nonconvex
Hamiltonians, is still an important open
problem.

The two pieces of the picture were pasted together
by Fathi (1996) with his weak KAM theory (see
Contreras and Iturriaga (1999) and Fathi (2005a)
for a general treatment), where the relevance of the
extremal critical subsolutions was first recognized
for the analysis of the dynamics, and the
Aubry-Mather sets were characterized as a
regularity set for such subsolutions, as described
above. Evans and co-workers are currently
using more general PDE methods in weak KAM
theory to address some integrability issues and to
find a quantum analog (see Evans and Gomes (2001,
2002) and Evans (2004)).


Critical Value and Extremal Subsolutions 


We consider the family of Hamilton-Jacobi equations
[1] defined, for simplicity, on the flat torus
$\mathbb{T}^N = \mathbb{R}^N/\mathbb{Z}^N$, endowed with the flat Riemannian
metric induced by the Euclidean metric on $\mathbb{R}^N$. The
tangent, as well as the cotangent bundle of $\mathbb{T}^N$ will
be identified with $\mathbb{T}^N \times \mathbb{R}^N$. All the results discussed
in the remainder of the paper are still true in any
compact boundaryless manifold, and some of them
also hold in noncompact manifolds. We require $H$ to
be continuous in both variables, to satisfy the
coercivity assumption,


$\{(y,p): H(y,p) \leq a\}$ is compact for any $a$


and the following (strict) quasiconvexity conditions
for any $x \in \mathbb{T}^N$, $a \in \mathbb{R}$:


$\{p: H(x,p) \leq a\}$ is strictly convex
$\partial\{p: H(x,p) \leq a\} = \{p: H(x,p) = a\}$


where $\partial$, in the above formula, indicates the
boundary. We denote by $\mathcal{S}_a$ the (possibly empty)
set of the Lipschitz-continuous a.e. subsolutions to
[1]. They will be called in the sequel, for short, just
subsolutions. Due to the convex character of the
Hamiltonian and its continuity, the property of
being a subsolution, for some function $u$, can be
equivalently expressed by requiring the inequality
$H(x,p) \leq a$ to hold for any $x \in \mathbb{T}^N$ and any $p$ in the
(Clarke) generalized gradient $\partial u(x)$, defined by


$\partial u(x) = \mathrm{co}\{p = \lim_i Du(x_i):
x_i \text{ differentiability point of } u, \; \lim_i x_i = x\}$


where co indicates the convex hull. Note that if this
set of weak derivatives reduces to a singleton at some
$x$, then the function $u$ is strictly differentiable at $x$,
i.e., it is differentiable and $Du$ is continuous at $x$.

By a strict subsolution to [1] we mean a
Lipschitz-continuous function $w$ with
$\operatorname{ess\,sup}_{\mathbb{T}^N} H(x, Dw(x)) < a$. The property of being a (strict)
subsolution is not affected by addition of constants.
Moreover, the pointwise supremum (resp. infimum)
of any class of equibounded subsolutions to [1] is
itself a subsolution, and $\mathcal{S}_a$ is stable with respect to
the uniform convergence in $\mathbb{T}^N$.

The purpose of this section is to show that there is
a unique value $c$ (the critical value) for which the
corresponding equation


$H(x, Du) = c$ [4]


possesses subsolutions enjoying some extremality
properties. We, more precisely, call a subsolution
$u \in \mathcal{S}_a$ maximal (resp. minimal) if for any open subset $\Omega$
of $\mathbb{T}^N$ and any Lipschitz-continuous function $\phi$ with


$u = \phi$ on $\partial\Omega$ and $\operatorname{ess\,sup}_\Omega H(x, D\phi(x)) < a$ [5]


one has $u \geq \phi$ (resp. $u \leq \phi$) in $\Omega$.


Any maximal (resp. minimal) subsolution $u$ is
actually an a.e. solution of [1]. If, in fact,
$H(x_0, Du(x_0)) < a$ for some differentiability point $x_0$
of $u$, then the function $\phi(x) = u(x_0) + Du(x_0)(x -
x_0) - \epsilon|x - x_0| + \epsilon$ (resp. $\phi(x) = u(x_0) + Du(x_0)(x -
x_0) + \epsilon|x - x_0| - \epsilon$) should satisfy [5] for a suitable
choice of $\epsilon > 0$ and of a neighborhood $\Omega$ of $x_0$, and
so should violate the maximality (resp. minimality)
condition for $u$.

The previous argument can be easily adapted to
show something more general: if $u$ is a maximal
(resp. minimal) subsolution then no subtangents
(resp. supertangents) to $u$ at any $y \in \mathbb{T}^N$ can be local
strict subsolutions at $y$, that is, strict subsolutions in
some neighborhood of $y$.

The subtangency (resp. supertangency) condition
of a function $\phi$ to $u$ at a point $x_0$ means that $x_0$ is a
local minimizer (resp. maximizer) of $u - \phi$. We
denote by $D^-u(x_0)$ (resp. $D^+u(x_0)$) the sets made up
by the differentials of the $C^1$-subtangents (resp.
supertangents) to $u$ at $x_0$. They are (possibly empty)
closed convex subsets of $\partial u(x_0)$. It is apparent that if
$D^+u(x_0) \neq \emptyset \neq D^-u(x_0)$ then $u$ is differentiable at $x_0$
and $D^+u(x_0) = D^-u(x_0) = \{Du(x_0)\}$.

It is an immediate consequence of the previous
fact that no extremal subsolutions can exist in $\mathcal{S}_a$,
whenever [1] admits a strict subsolution, say $\phi$, since
there are global minimizers and maximizers of $u - \phi$,
for any $u \in \mathcal{S}_a$, because of the compactness of $\mathbb{T}^N$.
The function $\phi$ is then subtangent and supertangent,
respectively, to $u$ at such points.

The unique value we can look at for finding
extremal subsolutions is therefore


$c = \inf\{a \in \mathbb{R}: \mathcal{S}_a \neq \emptyset\}$ [6]


The set on the right-hand side of [6] is nonempty
since the null function belongs to $\mathcal{S}_a$ when $a \geq
\max_{\mathbb{T}^N} H(x,0)$, and bounded from below by
$\min_{\mathbb{T}^N} H(x,0)$. The value $c$ is consequently well
defined by [6].
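
For a concrete feel of definition [6], consider the mechanical Hamiltonian $H(x,p) = p^2/2 + V(x)$ on the circle $\mathbb{R}/\mathbb{Z}$; this Hamiltonian, the potential $V(x) = \cos 2\pi x$ and the function names below are assumptions chosen for illustration, not taken from the text. Since $H(x,p) \geq V(x)$, any a.e. subsolution at level $a$ forces $V \leq a$ everywhere, and conversely constants are subsolutions as soon as $V \leq a$, so testing constants is enough in this special case.

```python
import math

def V(x):
    """Assumed potential on the circle R/Z (illustrative choice)."""
    return math.cos(2.0 * math.pi * x)

def H(x, p):
    """Assumed mechanical Hamiltonian H(x, p) = p^2/2 + V(x)."""
    return 0.5 * p * p + V(x)

def constants_are_subsolutions(a, n=1000):
    """A constant u has Du = 0, so it is a subsolution iff H(x, 0) <= a on the grid."""
    return all(H(i / n, 0.0) <= a for i in range(n))

def critical_value(lo=-10.0, hi=10.0, tol=1e-10):
    """Bisection for inf{a : the set of subsolutions at level a is nonempty}."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if constants_are_subsolutions(mid):
            hi = mid
        else:
            lo = mid
    return hi

c = critical_value()
```

In this example the computed value agrees with $\max V$, and the sublevel set $\{p: H(x,p) \leq c\}$ shrinks to the single point $p = 0$ exactly where $V$ attains its maximum.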

Moreover, any sequence $u_n \in \mathcal{S}_{a_n}$, with $a_n$
decreasing and convergent to $c$, is equi-Lipschitz-continuous
because of the coercivity of $H$, and
equibounded, up to addition of suitable constants.
It is therefore uniformly convergent, up to a
subsequence, to some $u$, which belongs to $\mathcal{S}_{a_n}$,
for any $n$, since these classes are stable for the
uniform convergence. This implies that $u$ is a
subsolution to [4], so that $\mathcal{S}_c \neq \emptyset$. The critical
value $c$ is then characterized by the property that
the corresponding eqn [4] admits subsolutions
but not strict subsolutions. Our aim is to show
that extremal subsolutions do exist for the critical
eqn [4].

For any supercritical value $a$, that is, $a \geq c$, we can
define the functional nonsymmetric semidistance:


$S_a(y, x) = \sup\{u(x) - u(y): u \in \mathcal{S}_a\}
= \sup\{u(x): u \in \mathcal{S}_a, \; u(y) = 0\}$


for any $x$, $y$ in $\mathbb{T}^N$. It is immediate that $S_a$ satisfies
the triangle inequality and $S_a(y, y) = 0$ for any $y$. But
it fails, in general, to be symmetric and positive if
$x \neq y$. We will nevertheless call it a distance, in the
sequel, to ease terminology. The function
$x \mapsto S_a(y, x)$ is itself a subsolution to [1], for any $y$,
being the pointwise supremum of a family of
equibounded subsolutions. Taking into account the
inequality


$u(x) - u(y) \geq -S_a(x,y)$


which holds for any $u \in \mathcal{S}_a$, and the fact that it
becomes an equality by setting $u = S_a(x, \cdot)$, we also get


$-S_a(x, y) = \inf\{u(x) - u(y): u \in \mathcal{S}_a\}
= \inf\{u(x): u \in \mathcal{S}_a, \; u(y) = 0\}$


and $-S_a(\cdot, y)$ is, as well, a subsolution to [1]. Note
that


$S_a(x,y) + S_a(y,x) \geq 0$ for any $y$, $x$ [7]

The interest of introducing the distance $S_a$ in the
present context is that, for any $a \geq c$ and $y \in \mathbb{T}^N$,
the function $x \mapsto S_a(y,x)$ (resp. $x \mapsto -S_a(x,y)$) satisfies
the maximality (resp. minimality) condition for
subsolutions of [1] in any open set not containing $y$.
If, by contradiction, the maximality property of
$S_a(y, \cdot)$ were violated in some open set $\Omega$ with $y \notin \Omega$
by a $\phi$ satisfying [5], then one could make the set
$\{x: \phi(x) > S_a(y,x)\}$ nonempty and compactly contained
in $\Omega$, by adding a suitable constant. Hence,
the formula


$u = \begin{cases} \max\{\phi, S_a(y, \cdot)\} & \text{in } \Omega \\ S_a(y, \cdot) & \text{otherwise} \end{cases}$ [8]


would provide a subsolution to [1] with
$u(y) = S_a(y, y) = 0$ and $u > S_a(y, \cdot)$ at some point of
$\Omega$, which is in contrast with the very definition of $S_a$.
One can similarly prove the minimality condition
for $-S_a(\cdot, y)$.

We now focus our attention on the critical case.
We derive from the previous considerations that if a
maximal subsolution to [4] does not exist then, for
any $y$, we can find a neighborhood $\Omega'_y$ of $y$ where
$S_c(y, \cdot)$ fails to be maximal. We can thus construct,
through a formula like [8], a $u_y \in \mathcal{S}_c$ with


$\operatorname{ess\,sup}_{\Omega_y} H(\cdot, Du_y(\cdot)) < c$ [9]


in some neighborhood $\Omega_y$ of $y$ contained in $\Omega'_y$.
Thanks to the compactness of $\mathbb{T}^N$, we can extract
from $\{\Omega_y\}$ a finite subcover $\{\Omega_{y_i}\}$, $i = 1, \ldots, m$, for
some $m \in \mathbb{N}$, and define


$u = \sum_i \lambda_i u_{y_i}$


where $\lambda_i$ are positive constants with $\sum_i \lambda_i = 1$. The
convex character of the Hamiltonian and [9] imply
that $u$ is a strict critical subsolution, which cannot
be. We therefore conclude that there is a nonempty
set of points $y$, denoted henceforth by $\mathcal{A}$, for which
$S_c(y, \cdot)$ is indeed a maximal critical subsolution. It
can also be proved, by exploiting some stability
properties of the maximal subsolutions, that $\mathcal{A}$ is
closed. Similarly, $-S_c(\cdot, y)$ must be a minimal
critical subsolution for some $y$. We denote by $\mathcal{A}'$
the closed set made up by such points.

The previous covering argument shows that if
$y \notin \mathcal{A}$ (resp. $y \notin \mathcal{A}'$) then there is a local strict
critical subsolution at $y$. The converse is also true:
let in fact $\phi$ be such a strict subsolution satisfying
$\phi(y) = S_c(y, y) = 0$; then $\phi$ is subtangent to $S_c(y, \cdot)$
(resp. supertangent to $-S_c(\cdot, y)$) at $y$, by the very
definition of the distance $S_c$. This shows that $S_c(y, \cdot)$
(resp. $-S_c(\cdot, y)$) is not a maximal (resp. minimal)
critical subsolution, and so $y \notin \mathcal{A}$ (resp. $y \notin \mathcal{A}'$). Since
the previous characterization holds for both $\mathcal{A}$ and
$\mathcal{A}'$, it follows that $\mathcal{A} = \mathcal{A}'$. This set is a generalization
of the (projected) Aubry set. We will come back to
this point later on.

We also see from the covering argument that there
is a critical subsolution $\phi$, which is strict outside $\mathcal{A}$,
that is, such that $\operatorname{ess\,sup}_\Omega H(x, D\phi(x)) < c$ for any
open set $\Omega$ compactly contained in $\mathbb{T}^N \setminus \mathcal{A}$.

This implies that any $y$ such that $\{p: H(y,p) \leq c\}$
has empty interior belongs to $\mathcal{A}$. The empty interior
condition in fact implies, thanks to the strict
quasiconvexity of $H$, that the sublevel set reduces
to a singleton, say $\{p_0\}$. We know that $\partial u(y) \subset
\{p: H(y,p) \leq c\}$, for any $u \in \mathcal{S}_c$; therefore, $\partial u(y)$ is a
singleton and so any critical subsolution $u$ is strictly
differentiable at $y$ with $H(y, Du(y)) = H(y, p_0) = c$.
Hence, there cannot be critical subsolutions which
are strict around $y$.

The previously described points will be called, in the sequel, equilibria, and the (possibly empty) closed set made up by them will be denoted by $\mathcal{E}$. The reason for this terminology will be explained later. The differentiability property of the critical subsolutions at equilibria can be extended, quite surprisingly, to any point of $\mathcal{A}$, under more stringent assumptions on $H$. We will discuss this issue in the next section.

Qualitative Properties of Generalized
Aubry Set 


We introduce some dynamical aspects in the picture by showing that the distances $S_a$, defined in the previous section for any $a \ge c$, are actually of length type, in the sense that $S_a(y, x)$ equals, for any pair $y$, $x$, the infimum of the intrinsic length of absolutely continuous, or equivalently Lipschitz-continuous, curves joining $y$ to $x$. By intrinsic length, we mean the total variation of $S_a$ on the curve. It will be denoted by $\ell_a$, while $\ell$ will indicate the natural (i.e., Euclidean) length.

For this purpose, we proceed to give a line-integral representation formula of $S_a$. To start with, we consider a $C^1$ subsolution $u$ to [1], some $x, y \in \mathbb{T}^N$ and a (Lipschitz-continuous) curve $\xi$, defined in some compact interval $I$, joining $y$ to $x$. We have

$$u(x) - u(y) = \int_I Du(\xi)\,\dot\xi\,dt \le \int_I \sigma_a(\xi, \dot\xi)\,dt \qquad [10]$$

where, for any $(x, v) \in \mathbb{T}^N \times \mathbb{R}^N$,

$$\sigma_a(x, v) := \max_{p \in Z_a(x)} p\,v \quad \text{and} \quad Z_a(x) := \{p : H(x, p) \le a\}.$$

Inequality [10] also holds for a Lipschitz-continuous subsolution to [1] through suitable replacement of the differential by the generalized gradient. The set-valued map $Z_a$ is compact convex valued, by the coercivity and quasiconvexity assumptions on $H$, and continuous with respect to the Hausdorff metric. The function $\sigma_a$ is accordingly continuous in the first variable, and convex and positively homogeneous in the second, being a support function. This implies in particular that the integral on the right-hand side of [10] is invariant under change of parameter preserving the orientation. We derive, from [10],

$$S_a(y, x) \le \inf\left\{ \int_0^1 \sigma_a(\xi, \dot\xi)\,dt : \xi \text{ defined in } [0, 1] \text{ and joining } y \text{ to } x \right\} \qquad [11]$$


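for any $y$, $x$. We denote by $\bar S_a(y, x)$ the quantity on the right-hand side of [11]. It is immediate that the triangle inequality holds for $\bar S_a$. The function $u := \bar S_a(y, \cdot)$ is, moreover, Lipschitz-continuous since $\sigma_a(x, v)/|v|$ is bounded from above in $\mathbb{T}^N \times (\mathbb{R}^N \setminus \{0\})$ because of the coercivity of $H$. Given $v \in \mathbb{R}^N$, we exploit the definition of $\bar S_a$, the continuity of $\sigma_a$, and the triangle inequality for $\bar S_a$, to get at any differentiability point $x_0$ of $u$,

$$
\begin{aligned}
Du(x_0)\,v &\le \limsup_{h \to 0^+} \frac{u(x_0) - u(x_0 - hv)}{h} \\
&\le \lim_{h \to 0^+} \frac{1}{h} \int_0^1 \sigma_a(x_0 - hvt, hv)\,dt \\
&= \lim_{h \to 0^+} \int_0^1 \sigma_a(x_0 - hvt, v)\,dt \\
&= \sigma_a(x_0, v).
\end{aligned}
$$

This implies by the Hahn-Banach theorem that $Du(x_0) \in Z_a(x_0)$ or, in other terms, that $u = \bar S_a(y, \cdot) \in \mathcal{S}_a$. We then derive, from [11] and the very definition of $S_a$,

$$S_a(y, x) = \inf\left\{ \int_0^1 \sigma_a(\xi, \dot\xi)\,dt : \xi \text{ defined in } [0, 1] \text{ with } \xi(0) = y,\ \xi(1) = x \right\}.$$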


Taking into account that the integral functional appearing in the previous formula is lower semicontinuous for the uniform convergence of equi-Lipschitz-continuous sequences of curves, by standard variational results, we in turn infer that it equals the intrinsic length $\ell_a$. Mathematically,

$$\ell_a(\xi) = \int_I \sigma_a(\xi, \dot\xi)\,dt$$

for any compact interval $I$ and any curve $\xi$ defined in $I$.
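
The line-integral formula for the intrinsic length can be made concrete on a simple case. The sketch below assumes a mechanical Hamiltonian $H(x, p) = |p|^2/2 + V(x)$, which is not taken from the text; for it, $Z_a(x)$ is the ball of radius $\sqrt{2(a - V(x))}$ and the support function is $\sigma_a(x, v) = \sqrt{2(a - V(x))}\,|v|$. The function names `sigma_a` and `intrinsic_length`, and the potential `V`, are hypothetical.

```python
import math


def sigma_a(x, v, a, V):
    """Support function of Z_a(x) = {p : |p|^2/2 + V(x) <= a} in the direction v.

    Assumes the mechanical Hamiltonian H(x, p) = |p|^2/2 + V(x) (illustrative choice).
    """
    gap = a - V(x)
    if gap < 0:
        raise ValueError("Z_a(x) is empty: the level a lies below V(x)")
    speed = math.sqrt(sum(c * c for c in v))
    return math.sqrt(2.0 * gap) * speed


def intrinsic_length(curve, a, V, n=2000):
    """Midpoint-rule approximation of the integral of sigma_a(xi, xi') over [0, 1].

    `curve` maps a parameter in [0, 1] to a tuple of coordinates. Positive
    homogeneity of sigma_a in v lets us feed it the increment of each small step.
    """
    total = 0.0
    for k in range(n):
        start = curve(k / n)
        end = curve((k + 1) / n)
        mid = tuple(0.5 * (s + e) for s, e in zip(start, end))
        step = tuple(e - s for s, e in zip(start, end))
        total += sigma_a(mid, step, a, V)
    return total


# With V = 0 and a = 1/2 the intrinsic length reduces to the Euclidean length.
flat = lambda x: 0.0
segment = lambda s: (3.0 * s, 4.0 * s)
faster = lambda s: (3.0 * s * s, 4.0 * s * s)  # same path, different parameter
print(intrinsic_length(segment, 0.5, flat), intrinsic_length(faster, 0.5, flat))
```

The two printed values agree up to rounding, which illustrates the invariance of the integral under orientation-preserving changes of parameter noted after [10].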

Since $S_a$ is just a semidistance, we do not have any a priori information on the sign of $\ell_a$; however, by [10], the intrinsic length of any cycle must be non-negative. Furthermore, while $|\ell_a(\xi)|$ must be small for any curve $\xi$ with small natural length, by the coercivity condition on $H$, no converse estimates hold, in general. If $a > c$, some information in this direction can be gathered by taking a strict subsolution $\phi$ to [1], which can be assumed smooth, up to regularization by mollification; then $D\phi(x)v \le \sigma_a(x, v) - \rho|v|$ for any $(x, v) \in \mathbb{T}^N \times \mathbb{R}^N$, and some $\rho > 0$, and consequently

$$\ell_a(\xi) \ge \int_I \bigl(\sigma_a(\xi, \dot\xi) - D\phi(\xi)\dot\xi\bigr)\,dt + \phi(x) - \phi(y) \ge \rho\,\ell(\xi) - S_a(x, y) \qquad [12]$$


for any pair $y$, $x$ and any curve $\xi$, defined in some interval $I$, joining $y$ to $x$. The previous formula says, in particular, that when $|x - y|$ is small then any curve whose intrinsic length approximates $S_a(y, x)$ must have small natural length. The previous argument cannot be extended to the critical case. This gap suggests the next definition. The main purpose for introducing it is to get a metric characterization of the Aubry set $\mathcal{A}$.

We say that $S_c$ is localizable at some $y$ if for every $\varepsilon > 0$ there is $0 < \delta_\varepsilon < \varepsilon$ such that

$$S_c(y, x) = \inf\{\ell_c(\xi) : \xi \text{ joins } y \text{ to } x \text{ and } \ell(\xi) < \varepsilon\} \qquad [13]$$

whenever $|x - y| < \delta_\varepsilon$. If $y \notin \mathcal{A}$, we adapt the argument previously used in the strict supercritical case to get that $S_c$ is indeed localizable at $y$. In this case we have, in fact, at our disposal a critical subsolution, say $\phi$, which is strict in some neighborhood $\Omega$ of $y$, thanks to the characterization of the Aubry set given in the previous section.

We assume, to simplify, $\phi$ to be $C^1$; under the natural condition of Lipschitz-continuity, generalized gradients should be used in place of differentials. We have $D\phi(x)v \le \sigma_c(x, v) - \rho|v|$ for any $x \in \Omega$, any $v \in \mathbb{R}^N$, and some $\rho > 0$, and $D\phi(x)v \le \sigma_c(x, v)$, for any $x$, $v$. Exploiting these inequalities, we obtain an estimate analogous to [12] for curves starting from $y$, which allows us to prove [13].

Conversely, let $y \notin \mathcal{E}$ be a point where $S_c$ is localizable. We claim that $Z_c(y) \subset D^- u(y)$, where $u := S_c(y, \cdot)$. It is enough to show that any $p_0$ in the interior of $Z_c(y)$ belongs to $D^- u(y)$, since $D^- u(y)$ is closed. Note that the interior of $Z_c(y)$ is nonempty since we are assuming that $y$ is not an equilibrium. Such a $p_0$ belongs to the interior of $Z_c(x)$ for $x$ sufficiently close to $y$, thanks to the continuity of $Z_c$; consequently, $p_0(x - y) \le \ell_c(\xi)$ for any $x$ close to $y$ and any curve $\xi$ joining $y$ to $x$ with $\ell(\xi)$ sufficiently small. Taking into account [13], we then deduce

$$p_0(x - y) \le S_c(y, x) \quad \text{for } x \text{ close to } y$$

and so the linear function $\phi(x) := p_0(x - y)$ is subtangent to $u$ at $y$. This in turn implies that $y$ is out of $\mathcal{A}$ since $\phi$ is a local strict critical subsolution at $y$, and so $S_c(y, \cdot)$ cannot be a maximal subsolution by the characterization given in the previous section.

The fact that $S_c$ is not localizable at any point $y \in \mathcal{A} \setminus \mathcal{E}$ leads to the announced metric characterization of $\mathcal{A}$. If $y$ is such a point, there is an $\varepsilon > 0$, a point $x$, with $|x - y|$, and so $|S_c(y, x)|$, as small as desired, and a curve $\xi$ joining $y$ to $x$ with $\ell_c(\xi) \approx S_c(y, x)$ and $\ell(\xi) > \varepsilon$. We construct a cycle $\gamma$, passing through $y$, by juxtaposition of $\xi$ and the Euclidean segment joining $x$ to $y$. We obtain, in this way, a sequence of cycles $\gamma_n$, passing through $y$, with $\ell_c(\gamma_n) \to 0$ and $\ell(\gamma_n) > \varepsilon$, for any $n$.

The same result can also be obtained for $y \in \mathcal{E}$. In this case we select $\varepsilon > 0$ and $v_0 \in \mathbb{R}^N$ with $\sigma_c(y, v_0) = 0$, and denote by $B_n$ a sequence of Euclidean balls, centered at $y$, satisfying $\sigma_c(\cdot, v_0) < 1/n$ in $B_n$. We construct a sequence of cycles $\gamma_n$, passing through $y$, by going up and down on the line $\{y + s v_0\}$ in such a way that $\gamma_n(t) \in B_n$, for every $t$, and $\varepsilon < \ell(\gamma_n) < 2\varepsilon$; therefore $0 \le \ell_c(\gamma_n) \le 2\varepsilon/n$.

Conversely, such a sequence of cycles cannot exist at any $y \notin \mathcal{A}$ because $S_c$ is localizable at $y$.

We emphasize that the previous definition of $\mathcal{A}$ through cycles and the fact that $S_c$ is not localizable at any point $y \in \mathcal{A}$ with $\operatorname{int} Z_c(y) \ne \emptyset$ show that, apart from the special case of equilibria, the property of being a point of $\mathcal{A}$ is definitely not of local nature.

As pointed out already, if $y \notin \mathcal{A}$, and so $S_c$ is localizable at $y$, then $Z_c(y) \subset D^- u(y)$, where $u := S_c(y, \cdot)$; on the other hand, we know that $D^- u(y) \subset \partial u(y)$ and $\partial u(y) \subset Z_c(y)$, where the latter inclusion holds since $u$ is a critical subsolution. We then derive

$$D^- u(y) = \partial u(y) = Z_c(y).$$

We interpret these equalities as a convexity-type property, or, to use a more appropriate terminology, a semiconvexity property of the distance function $S_c(y, \cdot)$ at $y$. The same property holds for the Euclidean distance function $|x|$ at 0.

A contrasting phenomenon takes place if $y \in \mathcal{A}$, namely $S_c(y, \cdot)$ is semiconcave at $y$, which means that $D^+ u(y) = \partial u(y)$. This is more complicated to prove (see Fathi 2005b), and requires, in addition, $H$ to be strictly convex in $p$ and locally Lipschitz-continuous in $(x, p)$. Under these assumptions one can, more generally, show that $S_c(y, \cdot)$ is semiconcave in $\mathbb{T}^N$, if $y \in \mathcal{A}$, while it is semiconcave in $\mathbb{T}^N \setminus \{y\}$ and semiconvex at $y$, if $y \notin \mathcal{A}$. Some important consequences can be deduced.

First, thanks to the semiconcavity property there are $C^1$ supertangents to $u := S_c(y, \cdot)$ at $y$, whenever $y \in \mathcal{A}$. Such a function, say $\phi$, is also supertangent to $-S_c(\cdot, y)$, which is a minimal critical subsolution, at the same point. We know from the previous section that no supertangent to $-S_c(\cdot, y)$ at $y$ can be a strict critical subsolution locally at $y$, and so $H(y, D\phi(y)) \ge c$. This implies that $D^+ u(y)$ is contained in the boundary of $Z_c(y)$. We then see, taking into account that $D^+ u(y)$ is convex and $Z_c(y)$ strictly convex, that $D^+ u(y)$ reduces to a singleton, and so, by the semiconcavity property, $\partial u(y)$ reduces to a singleton. Therefore, $S_c(y, \cdot)$ is strictly differentiable at $y$, for any $y \in \mathcal{A}$. One can similarly show that $-S_c(\cdot, y)$ is strictly differentiable at $y$.

Second, given $y \in \mathcal{A}$ and a critical subsolution $w$, which can be assumed, up to addition of a constant, to vanish at $y$, we see that $S_c(y, \cdot)$ (resp. $-S_c(\cdot, y)$) is supertangent (resp. subtangent) to $w$ at $y$ because of its extremality properties. Since both the supertangent and the subtangent are differentiable, by the previous point, we deduce that $w$ itself is differentiable at $y$. Moreover, the differentials at $y$ of all three functions under consideration, namely $S_c(y, \cdot)$, $-S_c(\cdot, y)$, and $w$, coincide. In particular, $H(y, Dw(y)) = c$, and $y \mapsto Dw(y)$ is continuous on $\mathcal{A}$, since $S_c(y, \cdot)$ has been proved to be strictly differentiable at $y$, whenever $y \in \mathcal{A}$. Any critical subsolution, restricted to $\mathcal{A}$, is consequently a continuously differentiable solution to [4].

Summing up, we have discovered (under the assumption of strict convexity and Lipschitz-continuity for $H$) that every critical subsolution is differentiable on $\mathcal{A}$, and the differential on $\mathcal{A}$ is the same for every critical subsolution. A continuous map $G : \mathcal{A} \to \mathbb{R}^N$ is then defined by taking $G(y)$ equal to the common differential of any critical subsolution at $y$. We denote by $\tilde{\mathcal{A}}$ the graph of $G$, which is a subset of the cotangent bundle of $\mathbb{T}^N$, identified with $\mathbb{T}^N \times \mathbb{R}^N$.

As we have already pointed out, the existence of a $C^\infty$ subsolution to [1] is obvious when $a > c$, and such a subsolution can be obtained through a suitable regularization by mollification of any strict subsolution. The same construction cannot be performed at the critical level, since no strict critical subsolutions are available to start the regularization procedure. We can nevertheless show the existence of $C^1$ critical subsolutions by exploiting the information gathered on the Aubry set. We start by considering a countable locally finite open cover $\{\Omega_i\}$ of $\mathbb{T}^N \setminus \mathcal{A}$; we know from the previous section that there is a critical subsolution, say $w_i$, which is strict on $\Omega_i$ for any $i$. Loosely speaking, we have some space, also in this case, for regularizing $w_i$ in such a way that the regularized function is still a critical subsolution, at least on $\Omega_i$.

We can glue together, with some precautions, these regularized local critical subsolutions through a $C^\infty$ partition of unity, to produce a critical subsolution which is $C^\infty$ outside $\mathcal{A}$. Using the fact that any critical subsolution is differentiable on $\mathcal{A}$, we can further adjust the previous construction so that the critical subsolution is $C^1$ on the whole of $\mathbb{T}^N$. We state this result in the following way: if the equation [1] has a subsolution then it also has a $C^1$ subsolution. It is worth noticing that it holds even if the underlying manifold is noncompact (see Fathi (2004, 2005b)).


The Intrinsic Lengths and the Action 
Functional 


Here we assume $H$ to satisfy all the usual assumptions in order to define Hamilton's equations [2] and to have the completeness of the associated Hamiltonian flow. Namely, we require $H$ to be $C^2$ in both variables, $C^2$-strictly convex, that is, $H_{pp} > 0$ in $\mathbb{T}^N \times \mathbb{R}^N$, and superlinear, in the sense that

$$\lim_{|p| \to +\infty} \frac{H(x, p)}{|p|} = +\infty \quad \text{uniformly in } x.$$


We define the Lagrangian $L$ as the Fenchel transform of $H$. It takes finite values thanks to the superlinearity condition, and, in addition, inherits, from $H$, $C^2$ regularity, $C^2$-strict convexity and superlinearity. In our setting, the Fenchel transform is involutive.
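
For reference, the Fenchel transform and its involutive character can be written as follows. The mechanical Hamiltonian in the comment is an illustrative assumption and does not come from the text.

```latex
L(x, v) = \max_{p \in \mathbb{R}^N} \bigl\{ p\,v - H(x, p) \bigr\},
\qquad
H(x, p) = \max_{v \in \mathbb{R}^N} \bigl\{ p\,v - L(x, v) \bigr\}.
% Example (assumed for illustration): H(x, p) = \tfrac12 |p|^2 + V(x)
% gives L(x, v) = \tfrac12 |v|^2 - V(x), the maximum being attained at p = v.
```

Superlinearity of $H$ is what guarantees that the first maximum is attained and finite for every $v$.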

We call a vector $v_0$ and a covector $p_0$ conjugate at a point $x$ if $v_0 = H_p(x, p_0)$, and so $L(x, v_0) = p_0 v_0 - H(x, p_0)$. This also implies the relations $p_0 = L_v(x, v_0)$ and $H(x, p_0) = p_0 v_0 - L(x, v_0)$. If $H(x, p_0) = a$, for some $a$, then $p_0 v_0 = \sigma_a(x, v_0)$, and $p_0$ is the unique element of $Z_a(x)$ for which such a relation holds. Since the function $y \mapsto p_0 v_0 - H(y, p_0)$ is subtangent to $L(\cdot, v_0)$ at $x$, we see that $L_x(x, v_0) = -H_x(x, p_0)$.

We introduce, for any (Lipschitz-continuous) curve $\xi$ defined in $[a, b]$, for some $a < b$, the action functional $A(\xi)$ through

$$A(\xi) = \int_a^b L(\xi, \dot\xi)\,dt.$$


We say that the curve $\xi$ is a minimizer of the action if $A(\xi) \le A(\gamma)$ for any $\gamma$ defined in $[a, b]$ and with the same end points as $\xi$. It is a classical result in the calculus of variations that any such minimizer $\xi$ is of class $C^2$ and satisfies the Euler-Lagrange equation

$$\frac{d}{dt} L_v(\xi, \dot\xi) = L_x(\xi, \dot\xi) \quad \text{in } ]a, b[.$$

Consequently, $\xi$ and the conjugate curve $\eta = L_v(\xi, \dot\xi)$ satisfy Hamilton's equations [2]. Note that all the integral curves of [2] lie in a fixed level of the Hamiltonian, which is compact by the superlinearity condition. The corresponding Hamiltonian flow is consequently complete.
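
The passage from the Euler-Lagrange equation to Hamilton's equations uses only the conjugacy relations stated earlier. The short derivation below assumes that [2] denotes the Hamiltonian system in its standard form.

```latex
% Set \eta := L_v(\xi, \dot\xi), so that \dot\xi and \eta are conjugate at \xi.
\dot\xi = H_p(\xi, \eta)
  \qquad \text{(conjugacy: } v_0 = H_p(x, p_0) \text{ whenever } p_0 = L_v(x, v_0)\text{)},
\\
\dot\eta = \frac{d}{dt} L_v(\xi, \dot\xi) = L_x(\xi, \dot\xi) = -H_x(\xi, \eta)
  \qquad \text{(Euler-Lagrange, then } L_x(x, v_0) = -H_x(x, p_0)\text{)}.
```

Conservation of $H$ along the flow follows at once, since $\frac{d}{dt} H(\xi, \eta) = H_x \dot\xi + H_p \dot\eta = H_x H_p - H_p H_x = 0$.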

We show that if $x_0 \in \mathcal{E}$, and $Z_c(x_0) = \{G(x_0)\}$, then $(x_0, G(x_0))$ is a steady state of the Hamiltonian flow. In this case, in fact, $c = \min_p H(x_0, p)$ and so $L(x_0, 0) = -c$ and $H_p(x_0, G(x_0)) = 0$, or equivalently $G(x_0)$ and 0 are conjugate at $x_0$. Taking into account that $c$ is the critical value, we have that

$$L(x, 0) = -\min_p H(x, p) \ge -c \quad \text{for any } x \in \mathbb{T}^N,$$

so that $x_0$ is a minimizer of $x \mapsto L(x, 0)$ and $L_x(x_0, 0) = -H_x(x_0, G(x_0)) = 0$. It is easy to see that, conversely, if $(x_0, p_0)$ is a steady state of the Hamiltonian flow and $H(x_0, p_0) = c$ then $x_0 \in \mathcal{E}$ and $p_0 = G(x_0)$.

We want to establish a relation between $A(\cdot)$ and the length functionals $\ell_a$ defined in the previous section for $a \ge c$. This will allow us, among other things, to show that the Aubry set is invariant for the Hamiltonian flow and to analyze the properties of the integral curves lying on it. To this aim, we consider the minimal geodesics for $S_a$, $a \ge c$, that is, the curves, defined on compact intervals, whose intrinsic lengths $\ell_a$ equal the distance $S_a$ between their end points.

If $a > c$, we claim that, given any pair of points in $\mathbb{T}^N$, there is a minimal geodesic joining them. Recalling the formula [12], whose validity depends on the fact that in the strict supercritical case there is a smooth strict subsolution to [1], we have

$$\ell_a(\xi) \to +\infty \quad \text{whenever } \ell(\xi) \to +\infty.$$

The claim is then proved by using the Ascoli theorem and the lower-semicontinuity property of $\ell_a$. In the critical case, given $y \notin \mathcal{A}$, we can use the same argument to deduce the existence of minimizing geodesics for $S_c$ between $y$ and any point $x$ sufficiently close to $y$ (in the Euclidean sense). This comes from the fact that $S_c$ is localizable at $y$, and so any sequence of curves $\xi_n$ with $\ell_c(\xi_n) \to S_c(y, x)$ has bounded natural length. For a general pair of points, we will show, on the contrary, that existence of a minimal geodesic is not guaranteed in the critical case.

We consider a minimizing geodesic $\xi$ for $S_a$ between a pair of points $y$ and $x$. We assume $a > c$, or $a = c$ and $\xi \cap \mathcal{E} = \emptyset$. We want to show that $\xi$ is a minimizing curve for the action, up to a change of parameter. We choose the new parameter in such a way that

$$L(\hat\xi, \dot{\hat\xi}) + a = \sigma_a(\hat\xi, \dot{\hat\xi}) \qquad [14]$$

where we have denoted by $\hat\xi$ the reparametrized curve. Since $\hat\xi$ stays away from $\mathcal{E}$, the velocities $|\dot{\hat\xi}|$ are bounded from below by a positive constant and so the domain of definition of $\hat\xi$, denoted by $[0, T]$, is a compact interval. Note that $\ell_a(\xi) = \ell_a(\hat\xi)$, since the intrinsic length is invariant under change of parameter. We take into account that $\hat\xi$ is a minimal geodesic and the inequality $L(x, v) + a \ge \sigma_a(x, v)$, which holds for any $x$, $v$, to get

$$A(\hat\xi) = \ell_a(\hat\xi) - aT \le \ell_a(\gamma) - aT \le A(\gamma)$$

for any $\gamma$ defined in $[0, T]$ with $\gamma(0) = y$, $\gamma(T) = x$. This proves the announced minimality property of $\hat\xi$.
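
The inequality between the Lagrangian and the support function used in this argument follows directly from the Fenchel transform. A short derivation, under the standing assumptions on $H$ and with $Z_a(x)$ and $\sigma_a$ as defined in the previous section:

```latex
% For every p in Z_a(x) we have H(x, p) \le a, hence
L(x, v) \ge p\,v - H(x, p) \ge p\,v - a ,
% and maximizing the right-hand side over p \in Z_a(x) gives
L(x, v) + a \ge \max_{p \in Z_a(x)} p\,v = \sigma_a(x, v).
% Equality holds exactly when the covector p_0 = L_v(x, v) conjugate to v
% satisfies H(x, p_0) = a, since then L(x, v) + a = p_0 v \le \sigma_a(x, v).
```

The equality case is what singles out the parametrization chosen for the geodesic: the speed is adjusted so that the conjugate covector lies on the level set $\{H = a\}$.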


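Furthermore, we show that the function $u := S_a(y, \cdot)$ is strictly differentiable at $\hat\xi(s)$, for $s \in\, ]0, T[$, and

$$Du(\hat\xi) = L_v(\hat\xi, \dot{\hat\xi}) \qquad [15]$$

in $]0, T[$. Hence, $(\hat\xi, Du(\hat\xi))$ is a solution of Hamilton's equations in $]0, T[$. To see this, we denote by $\eta := L_v(\hat\xi, \dot{\hat\xi})$ the conjugate curve and start from the relations

$$\int_0^t \frac{d}{ds} u(\hat\xi(s))\,ds = u(\hat\xi(t)) = \int_0^t \eta(s)\,\dot{\hat\xi}(s)\,ds \qquad [16]$$

which hold in $[0, T]$ because $u$ is Lipschitz-continuous, $\hat\xi$ is a minimizing geodesic, and $\eta(s)$ is conjugate to $\dot{\hat\xi}(s)$ at $\hat\xi(s)$ for any $s \in [0, T]$. We know that

$$\frac{d}{ds} u(\hat\xi(s)) = p\,\dot{\hat\xi}(s) \quad \text{for a.e. } s \text{ and some } p \in \partial u(\hat\xi(s)).$$

We have that $p \in Z_a(\hat\xi(s))$, since $u$ is a subsolution, and so

$$p\,\dot{\hat\xi}(s) \le \sigma_a(\hat\xi(s), \dot{\hat\xi}(s)) = \eta(s)\,\dot{\hat\xi}(s).$$

We see, in the light of [16], that equality must hold in the previous formula, for a.e. $s$. Therefore,

$$\frac{d}{ds} u(\hat\xi(s)) = \eta(s)\,\dot{\hat\xi}(s) \quad \text{for a.e. } s \qquad [17]$$

and we derive from the fact that the function $\eta(\cdot)\,\dot{\hat\xi}(\cdot)$ is continuous that $u(\hat\xi(\cdot))$ is actually continuously differentiable in $]0, T[$ and that [17] holds for any $s$. We finally exploit that $u$ is semiconcave in $\mathbb{T}^N \setminus \{y\}$, as pointed out in the previous section, and so $D^+ u(\hat\xi(s)) = \partial u(\hat\xi(s))$, for any $s$. If $\phi$ is a $C^1$ supertangent to $u$ at $\hat\xi(s)$ then

$$D\phi(\hat\xi(s))\,\dot{\hat\xi}(s) = \frac{d}{ds} u(\hat\xi(s));$$

accordingly,

$$p\,\dot{\hat\xi}(s) = \eta(s)\,\dot{\hat\xi}(s) \quad \text{for any } s \text{ and } p \in \partial u(\hat\xi(s)).$$

Since $\partial u(\hat\xi(s)) \subset Z_a(\hat\xi(s))$, this implies that $\partial u(\hat\xi(s)) = \{\eta(s)\}$. This actually gives the strict differentiability of the function $u$ at $\hat\xi(s)$, and $Du(\hat\xi(s)) = \eta(s)$ for any $s$.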
The same argument works, with some adjustment, also when $a = c$ and $\xi \cap \mathcal{E} \ne \emptyset$. If, for instance, $y \notin \mathcal{E}$ and $t_0 = \min\{t : \xi(t) \in \mathcal{E}\}$, then by reparametrizing $\xi$ in $[0, t_0[$, as indicated in [14], we get a curve $\hat\xi$ defined in $[0, +\infty[$ which is a minimizer of the action functional in any compact interval contained in $[0, +\infty[$. Moreover, $u := S_c(y, \cdot)$ is strictly differentiable at $\hat\xi(s)$ for $s \in\, ]0, +\infty[$ and $(\hat\xi, Du(\hat\xi))$ is a solution of Hamilton's equations.

We proceed to investigate the properties of the Hamiltonian flow on the Aubry set. We take a $y_0$ in $\mathcal{A} \setminus \mathcal{E}$, and consider a sequence $\xi_n$ of cycles passing through $y_0$ with $\ell_c(\xi_n) \to 0$, $\ell(\xi_n) > 2\delta$, for some positive $\delta$. Such a sequence does exist in view of the characterization of $\mathcal{A}$ through cycles given in the previous section. Moreover, we assume that the $\xi_n$ are parametrized by the natural arc length in $[-T_n, T_n]$, for some $T_n > \delta$, and satisfy $\xi_n(0) = y_0$ for any $n$. There is then defined a uniform limit curve $\gamma$ in $[-\delta, \delta]$, up to a subsequence, thanks to the Ascoli theorem.

The idea is to construct a new sequence of cycles $\gamma_n$ by replacing the portion of $\xi_n$ between $-\delta$ and $\delta$ by $\gamma$, and pasting this new piece with the remainder of $\xi_n$ through Euclidean segments at the end points. The $\gamma_n$ are still of infinitesimal intrinsic length $\ell_c$, which shows, in particular, that $\gamma$ is contained in $\mathcal{A}$. By exploiting that $S_c$ is a length distance, that the $\gamma_n$ are cycles, and the formula [7], with $a = c$, we get

$$\ell_c(\gamma_n) \ge S_c(\gamma(-\delta), \gamma(\delta)) + S_c(\gamma(\delta), \gamma(-\delta)) \ge 0$$

for any $n$, and we at last derive

$$\ell_c(\gamma) = S_c(\gamma(-\delta), \gamma(\delta)) = -S_c(\gamma(\delta), \gamma(-\delta)).$$


Note that the second equality is actually redundant. By reparametrizing $\gamma$, as in [14], with $a = c$, in some open interval containing 0 as interior point and contained in $[-\delta, \delta]$, we get a curve contained in $\mathcal{A} \setminus \mathcal{E}$, denoted by $\xi$, defined on some open interval $I$ and satisfying

$$A(\xi|_{[s,t]}) + c(t - s) = \ell_c(\xi|_{[s,t]}) = -S_c(\xi(t), \xi(s)) \quad \text{for any } t > s \qquad [18]$$

This, in particular, shows that $\xi$ is a minimizer of the action functional in any $[s, t] \subset I$. If we denote, as usual, by $\eta$ the curve conjugate to $\xi$, we have, arguing as above, that $\eta(t)$ is the differential of the function $S_c(\xi(s), \cdot)$ at $\xi(t)$, but, since the differentials of all critical subsolutions coincide on $\mathcal{A}$, we finally get that $\eta(t) = G(\xi(t))$ for every $t \in I$. Therefore, $(\xi, G(\xi))$ is a solution of Hamilton's equations in $I$ and is contained in the graph $\tilde{\mathcal{A}}$ of $G$. The same properties can be extended to the whole of $\mathbb{R}$.

Taking into account that if $y \in \mathcal{E}$ then $(y, G(y))$ is a steady state of the Hamiltonian flow, we in the end see that the graph $\tilde{\mathcal{A}}$ of $G$ is foliated by integral curves of the Hamiltonian flow $(\xi, G(\xi))$, with $\xi$ enjoying the variational property [18]. This is indeed a characterization since if, conversely, a curve $\xi$ satisfies [18] then it must be contained in $\mathcal{A}$.

As an application, we finally show that there cannot be minimal geodesics, for the critical metric $S_c$, joining a point of $\mathcal{A}$, say $y$, to some $x \notin \mathcal{A}$, at least when $\mathcal{E} = \emptyset$. If such a geodesic, say $\xi$, exists, and is defined in $[0, T]$, for some $T > 0$, then $(\xi, Du(\xi))$ is a solution of Hamilton's equations, up to a change of parameter, where $u := S_c(y, \cdot)$, satisfying the initial conditions $\xi(0) = y$, $\eta(0) = \lim_{t \to 0^+} Du(\xi(t))$.

The last relation tells us that $\eta(0) \in \partial u(y)$ and, since $u$ is differentiable at $y \in \mathcal{A}$ with $Du(y) = G(y)$, we conclude that $\eta(0) = G(y)$. Therefore, $(\xi, Du(\xi))$ is a part of the integral curve of the Hamiltonian flow starting at $(y, G(y))$ that we know, by the above reasoning, to be contained in the graph of $G$ over $\mathcal{A}$, so that $\xi$ stays in $\mathcal{A}$, which is in contradiction with $\xi(T) = x \notin \mathcal{A}$.


See also: Control Problems in Mathematical Physics; 
Dynamical Systems in Mathematical Physics: An
Illustration from Water Waves; KAM Theory and Celestial
Mechanics; Minimax Principle in the Calculus of Variations; 
Optimal Transportation; Stability Theory and KAM. 


Introduction


The phenomenon of superconductivity is one of the most profound manifestations of quantum mechanics in the macroscopic world. The celebrated Bardeen-Cooper-Schrieffer (BCS) theory (Bardeen et al. 1957) of superconductivity (SC) provides a basic theoretical framework to understand this remarkable phenomenon in terms of the pairing of electrons with opposite spin and momenta to form a collective condensate state. This theory not only explains quantitatively the experimental data of conventional superconductors; the basic concepts developed from it, including spontaneously broken symmetry, the Nambu-Goldstone modes, and the Anderson-Higgs mechanism, also provide the essential building blocks for the unified theory of fundamental forces. The discovery of high-temperature superconductivity (HTSC) in the copper oxide materials poses a profound challenge: to understand theoretically the phenomenon of superconductivity in the extreme limit of strong correlations. While the basic idea of electron pairing in the BCS theory carries over to the HTSC, other aspects, like the weak-coupling mean-field approximation and the phonon-mediated pairing mechanism, may not apply without modifications. Therefore, the HTSC system provides an exciting opportunity to develop new theoretical frameworks and concepts for strongly correlated electronic systems.

To date, a number of different HTSC materials have been discovered. The most studied ones include the hole-doped $\mathrm{La}_{2-x}\mathrm{Sr}_x\mathrm{CuO}_{4+\delta}$ (LSCO), $\mathrm{YBa}_2\mathrm{Cu}_3\mathrm{O}_{6+\delta}$ (YBCO), $\mathrm{Bi}_2\mathrm{Sr}_2\mathrm{CaCu}_2\mathrm{O}_{8+\delta}$ (BSCO), $\mathrm{Tl}_2\mathrm{Ba}_2\mathrm{CuO}_{6+\delta}$ (TBCO) materials and the electron-doped $\mathrm{Nd}_{2-x}\mathrm{Ce}_x\mathrm{CuO}_4$ (NCCO) material. All these materials have a two-dimensional (2D) $\mathrm{CuO}_2$ plane, and have an antiferromagnetic (AF) insulating phase at half-filling. The magnetic properties of this insulating phase are well approximated by the antiferromagnetic Heisenberg model with spin $S = 1/2$ and an AF exchange constant $J \sim 100$ meV. The Neel temperature for the 3D AF ordering is approximately given by $T_N \sim 300$ to $500$ K. The HTSC material can be doped either by holes or by electrons. In the doping range of $5\% < x < 15\%$, there is an SC phase with a dome-like shape in the temperature versus doping plane. The maximal SC transition temperature $T_c$ is of the order of 100 K. The generic phase diagram of HTSC is shown in Figure 1.

One of the main questions concerning the HTSC 
phase diagram is the transition region between the 
AF and the SC phases. Partly because of the 
complicated material chemistry in this regime, 
there is no universal agreement among different 
experiments. Different experiments indicate several 
different possibilities, including phase separation 
with an inhomogeneous density distribution, uni- 
form coexistence phase between AF and SC and 
periodically ordered spin and charge distributions in 
the form of stripes or checkerboards. 

The phase diagram of the HTSC cuprates also contains a regime with anomalous behaviors conventionally called the pseudogap phase. This region of the phase diagram is indicated by the dashed lines in Figure 1. In conventional superconductors, a pairing gap opens up at $T_c$. In a large class of HTSC cuprates, however, an electronic gap starts to open up at a temperature much higher than $T_c$. Many experiments indicate that the pseudogap "phase" is not a true thermodynamical phase, but rather the precursor towards a crossover behavior.

Figure 1 Phase diagram of the NCCO and the YBCO superconductors.

The SC phase of the HTSC has a number of striking properties not shared by conventional superconductors. First of all, phase-sensitive experiments indicate that the SC phase for most of the cuprates has d-wave-like pairing symmetry. This is also supported by the photoemission experiments which show the existence of the nodal points in the quasiparticle gap. Neutron scattering experiments find a new type of collective mode, carrying spin 1, lattice momentum close to $(\pi, \pi)$, and a resolution-limited sharp resonance energy around 20 to 40 meV. Most remarkably, this resonance mode appears only below $T_c$ of the optimally doped cuprates. Another property uniquely different from the conventional superconductors is the vortex state. Most HTSCs are type II superconductors where the magnetic field can penetrate into the SC state in the form of a vortex lattice, where the SC order is destroyed at the center of the vortex core. In conventional superconductors, the vortex core is filled by the normal metallic electrons. However, a number of different experimental probes, including neutron scattering, muon spin resonance ($\mu$SR), and nuclear magnetic resonance (NMR), have shown that the vortex cores in the HTSC cuprates are antiferromagnetic, rather than normal metallic. This phenomenon has been observed in almost all HTSC materials, including LSCO, YBCO, TBCO, and NCCO, making it one of the most universal properties of the HTSC cuprates.

The HTSC materials also have highly unusual transport properties. While conventional metals have a $T^2$ dependence of resistivity, in accordance with the predictions of the Fermi liquid theory, the HTSC materials have a linear $T$ dependence of resistivity near optimal doping. This linear $T$ dependence extends over a wide temperature window, and seems to be universal among most of the cuprates. When the underdoped or sometimes optimally doped SC state is destroyed by applying a high magnetic field, the "normal state" is not a conventional conducting state, but exhibits insulator-like behavior, at least along the c-axis. This phenomenon may be related to the insulating AF vortices mentioned in the previous paragraph.

The discovery of HTSC has greatly stimulated the 
theoretical understanding of superconductivity in 
strongly correlated systems. There are a number of 
promising approaches, partially reviewed in Dagotto 
(1994), Imada et al. (1998), and Orenstein and 
Millis (2000), but a universally accepted theory has
not yet emerged. This article focuses on a particular
theory, which unifies the AF and the SC phases of
the HTSC cuprates based on an approximate SO(5)
symmetry (Zhang 1997). The SO(5) theory draws
its inspirations from the successful application of 
symmetry concepts in theoretical physics. All funda- 
mental laws of Nature are statements about sym- 
metry. Conservation of energy, momentum, and 
charge are direct consequences of global symmetries. 
The form of fundamental interactions is dictated by 
local gauge symmetries. Symmetry unifies appar- 
ently different physical phenomena into a common 
framework. For example, electricity and magnetism 
were discovered independently, and viewed as 
completely different phenomena before the nine- 
teenth century. Maxwell's theory, and the under- 
lying relativistic symmetry between space and time, 
unify the electric field $E$ and the magnetic field $B$
into a common electromagnetic field tensor $F_{\mu\nu}$.
This unification shows that electricity and magnet- 
ism share a common microscopic origin, and can be 
transformed into each other by going to different 
inertial frames. As discussed previously, the two 
robust and universal ordered phases of the HTSC 
are the AF and the SC phases. The central question 
of HTSC concerns the transition from one phase to 
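the other as the doping level is varied. The SO(5) theory unifies the 3D AF order parameter $(N_x, N_y, N_z)$ and the 2D SC order parameter $(\mathrm{Re}\,\Delta, \mathrm{Im}\,\Delta)$ into a single, 5D order parameter called "superspin," in a way similar to the unification of electricity and magnetism in Maxwell's theory:

$$
F_{\mu\nu} =
\begin{pmatrix}
0 & & & \\
E_x & 0 & & \\
E_y & B_z & 0 & \\
E_z & -B_y & B_x & 0
\end{pmatrix}
\quad \Leftrightarrow \quad
n_a =
\begin{pmatrix}
\mathrm{Re}\,\Delta \\ N_x \\ N_y \\ N_z \\ \mathrm{Im}\,\Delta
\end{pmatrix}
\qquad [1]
$$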


This unification relies on the postulate that a 
common microscopic interaction is responsible for 
both AF and SC in the HTSC cuprates and related 
materials. A well-defined SO(5) transformation 
rotates one form of the order into another. Within
this framework, the mysterious transition from the
AF to the SC phase as a function of doping is explained
in terms of a rotation in the 5D order parameter
space. Symmetry principles are not only fundamental
and beautiful, they are also practically useful in
extracting information from a strongly interacting 
system, which can be tested quantitatively. The 
approximate SO(5) symmetry between the AF and 
the SC phases has many direct consequences, which 
can be, and some of them have been, tested both 
numerically and experimentally. 
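
The geometric content of this statement can be illustrated with a toy calculation. The sketch below treats the superspin as an ordinary 5-component real vector ordered as $(\mathrm{Re}\,\Delta, N_x, N_y, N_z, \mathrm{Im}\,\Delta)$ and rotates it in a coordinate plane; the function name `rotate_superspin` is hypothetical, and the microscopic operators that generate the rotation are not modeled here.

```python
import math


def rotate_superspin(n, a, b, theta):
    """Rotate the 5-component vector n by the angle theta in the (a, b) plane.

    Components are ordered as (Re Delta, N_x, N_y, N_z, Im Delta); a and b are
    0-based indices. This is a plain rotation of the order parameter vector,
    not a model of the underlying quantum operators.
    """
    out = list(n)
    out[a] = math.cos(theta) * n[a] - math.sin(theta) * n[b]
    out[b] = math.sin(theta) * n[a] + math.cos(theta) * n[b]
    return out


# Start from a pure antiferromagnetic state with staggered moment along x ...
af_state = [0.0, 1.0, 0.0, 0.0, 0.0]
# ... and rotate by a quarter turn in the (Re Delta, N_x) plane.
sc_state = rotate_superspin(af_state, 0, 1, math.pi / 2)
print(sc_state)
```

After the quarter turn the weight sits entirely on the $\mathrm{Re}\,\Delta$ component, up to rounding, while the length of the vector is unchanged; intermediate angles describe mixtures of the two orders.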

The commonly used microscopic model of the HTSC materials is the repulsive Hubbard model, which describes the electronic degrees of freedom in the $\mathrm{CuO}_2$ plane. Its low-energy limit, the $t$-$J$ model, is defined by

$$H = -t \sum_{\langle x, x' \rangle} \bigl( c_\sigma^\dagger(x)\, c_\sigma(x') + \mathrm{h.c.} \bigr) + J \sum_{\langle x, x' \rangle} \mathbf{S}(x) \cdot \mathbf{S}(x') \qquad [2]$$

where the term $t$ describes the hopping of an electron with spin $\sigma$ from a site $x$ to its nearest neighbor $x'$, with double occupancy removed, and the $J$ term describes the nearest-neighbor exchange of its spin $\mathbf{S}$. The main merit of these models does
not lie in the microscopic accuracy and realism, but 
rather in the conceptual simplicity. However, 
despite their simplicity, these models are still very 
difficult to solve, and their phase diagrams cannot 
be compared directly with experiments. The idea of 
the SO(5) theory is to derive an effective quantum 
Hamiltonian on a coarse-grained lattice, which 
contains only the superspin degrees of freedom. 
The resulting SO(5) quantum nonlinear $\sigma$-model is
much simpler to solve using the standard field 
theoretical techniques, and the resulting phase 
diagram can be compared directly with 
experiments. 


SO(4) Symmetry of the Hubbard Model 


Before presenting the full SO(5) theory, let us first 
discuss a much simpler toy model, namely the 
negative U Hubbard model, which has an SC 
ground state with s-wave pairing. However, it also 
has a charge-density-wave (CDW) ground state at 
half-filling. The competition between CDW and the 
SC states is similar to the competition between AF 
and SC states in the HTSC cuprates. In the negative 
U Hubbard model, the CDW/SC competition can be 
accurately described by a hidden symmetry, namely 
the SO(4) symmetry of the Hubbard model. 

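The Hubbard model is defined by the Hamiltonian

$$H = -t \sum_{\langle x,x'\rangle} \left( c_\sigma^\dagger(x)\, c_\sigma(x') + \mathrm{h.c.} \right) + U \sum_x \left( n_\uparrow(x) - \tfrac12 \right)\left( n_\downarrow(x) - \tfrac12 \right) - \mu \sum_x n_\sigma(x)$$ [3]

where c_σ(x) is the fermion operator and n_σ(x) =
c†_σ(x)c_σ(x) is the electron density operator at site x
with spin σ; t, U, and μ are the hopping, interaction,
and chemical potential parameters, respectively.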
The Hubbard model has a pseudospin SU(2) symmetry 
generated by the operators 


$$\eta^- = \sum_x (-)^x c_\uparrow(x)\, c_\downarrow(x), \quad \eta^+ = (\eta^-)^\dagger, \quad \eta^z = \frac12 \sum_{x,\sigma} \left( n_\sigma(x) - \tfrac12 \right), \quad [\eta^\alpha, \eta^\beta] = i\epsilon_{\alpha\beta\gamma}\,\eta^\gamma$$ [4]

where η^± = η^x ± iη^y and α = x, y, z. The model is
defined on any bipartite lattice, and the lattice
function (−)^x takes the value 1 on the even sublattice
and −1 on the odd sublattice. These operators commute
with the Hubbard Hamiltonian at half-filling when
μ = 0, that is, [H, η^α] = 0; therefore, they form the
symmetry generators of the model (Yang and Zhang
1990). Combined with the standard SU(2) spin
rotational symmetry, the Hubbard model enjoys an
SO(4) = SU(2) ⊗ SU(2)/Z_2 symmetry. This symmetry
has important consequences for the phase
diagram and the collective modes of the system. In
particular, it implies that the SC and CDW orders
are degenerate at half-filling. The SC and the CDW
order parameters are defined by

$$\Delta^- = \sum_x c_\uparrow(x)\, c_\downarrow(x), \quad \Delta^+ = (\Delta^-)^\dagger, \quad \Delta^z = \frac12 \sum_{x,\sigma} (-)^x n_\sigma(x), \quad [\eta^\alpha, \Delta^\beta] = i\epsilon_{\alpha\beta\gamma}\,\Delta^\gamma$$ [5]


where Δ^± = Δ^x ± iΔ^y. The last equation of [5]
shows that the η operators perform the rotation
between the SC and CDW order parameters. Thus,
η^α is the pseudospin generator and Δ^α is the
pseudospin order parameter. Just like the total spin
and the Néel order parameter in the AF Heisenberg
model, they are canonically conjugate variables.
Since [H, η^α] = 0 at μ = 0, this exact pseudospin
symmetry implies the degeneracy of SC and CDW
orders at half-filling.
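
The pseudospin statements above can be checked numerically on the smallest bipartite lattice, a single bond. The following is a minimal pure-Python sketch, not part of the original treatment: it assumes the standard Hubbard Hamiltonian with a particle-hole symmetric interaction term, encodes the four fermion modes with a Jordan-Wigner construction, and uses hypothetical helper names (kron, lin, hubbard, and so on). It verifies that the pseudospin raising operator commutes with H at zero chemical potential, that it becomes an eigenoperator with eigenvalue -2μ otherwise, and that the raising and lowering operators close into an SU(2) algebra.

```python
# Sketch: check the pseudospin (eta) algebra on a two-site Hubbard model.
# The Jordan-Wigner encoding and every helper name are illustrative
# assumptions; the Hamiltonian is the standard particle-hole symmetric form.

def kron(a, b):
    """Kronecker product of two matrices stored as nested lists."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def lin(a, b, s=1.0):
    """Entrywise a + s*b."""
    return [[x + s * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def comm(a, b):
    return lin(matmul(a, b), matmul(b, a), -1.0)

def max_abs(a):
    return max(abs(x) for row in a for x in row)

I2 = [[1.0, 0.0], [0.0, 1.0]]
Z = [[1.0, 0.0], [0.0, -1.0]]   # Jordan-Wigner string factor
A = [[0.0, 1.0], [0.0, 0.0]]    # single-mode annihilator |0><1|

N_MODES = 4                      # mode index = 2*site + spin (0 = up, 1 = down)
DIM = 2 ** N_MODES

def annihilator(mode):
    op = [[1.0]]
    for k in range(N_MODES):
        op = kron(op, Z if k < mode else (A if k == mode else I2))
    return op

c = [annihilator(m) for m in range(N_MODES)]
cdag = [transpose(op) for op in c]       # real matrices: adjoint = transpose
num = [matmul(cdag[m], c[m]) for m in range(N_MODES)]
IDENT = [[1.0 if i == j else 0.0 for j in range(DIM)] for i in range(DIM)]
ZERO = [[0.0] * DIM for _ in range(DIM)]

def hubbard(t, U, mu):
    """Two-site Hubbard Hamiltonian: one bond, symmetric U term, chemical potential."""
    h = ZERO
    for spin in (0, 1):
        i, j = spin, 2 + spin
        hop = lin(matmul(cdag[i], c[j]), matmul(cdag[j], c[i]))
        h = lin(h, hop, -t)
    for site in (0, 1):
        up = lin(num[2 * site], IDENT, -0.5)
        down = lin(num[2 * site + 1], IDENT, -0.5)
        h = lin(h, matmul(up, down), U)
    for m in range(N_MODES):
        h = lin(h, num[m], -mu)
    return h

# eta^- = sum_x (-)^x c_up(x) c_down(x); eta^+ is its adjoint.
eta_minus = lin(matmul(c[0], c[1]), matmul(c[2], c[3]), -1.0)
eta_plus = transpose(eta_minus)
eta_z = ZERO
for m in range(N_MODES):
    eta_z = lin(eta_z, lin(num[m], IDENT, -0.5), 0.5)

MU = 0.3
print("|[H, eta+]| at mu = 0:", max_abs(comm(hubbard(1.0, -4.0, 0.0), eta_plus)))
print("|[H, eta+] + 2 mu eta+|:",
      max_abs(lin(comm(hubbard(1.0, -4.0, MU), eta_plus), eta_plus, 2.0 * MU)))
```

Both printed residuals should vanish up to rounding. The values t = 1, U = -4 and μ = 0.3 are arbitrary choices; the commutators do not depend on them.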

The phase diagram of the U < 0 Hubbard model
is identical to the phase diagram of the AF
Heisenberg model in a uniform magnetic field. If
the AF order parameter originally points along the
z-direction, a magnetic field applied along the
z-direction causes the AF order parameter to flop
into the xy-plane. This transition is called the
spin-flop transition, and is depicted in Figures 2a and 2c.
The chemical potential μ in the negative-U Hubbard
model plays a role similar to the magnetic field in
the AF Heisenberg model. It transforms a CDW
state at half-filling to an SC state away from
half-filling, as depicted in Figures 2b and 2c.

Figure 2 The spin-flop transition. (a) The spin-flop transition of the AF Heisenberg model. When a uniform magnetic field is applied
along the direction of the AF moments, there is no net gain of the Zeeman energy. Therefore, after a critical value of the magnetic field,
the AF spin component flops into the xy-plane, while a uniform spin component aligns in the direction of applied magnetic field. (b) The
Mott insulator to superfluid transition of the hardcore boson model or the U < 0 Hubbard model. At half-filling, one possible state is the
CDW state of ordered boson pairs. Upon doping, the pairs become mobile and form the superfluid state. (c) Both transitions can be
described by the spin or the pseudospin flop in the SO(3) nonlinear σ-model, induced either by the magnetic field or by the chemical
potential.

In the low-energy sector, both the AF Heisenberg
model in a magnetic field and the negative-U
Hubbard model with a chemical potential can be
described by the SO(3) nonlinear σ-model, which is
defined by the following Lagrangian density (in
imaginary time coordinates) for a unit vector field
n_α, with n_α² = 1:

$$L = \frac{\chi}{2}\,\omega_{\alpha\beta}^2 + \frac{\rho}{2}\,(\partial_i n_\alpha)^2 + V(n), \qquad \omega_{\alpha\beta} = n_\alpha(\partial_t n_\beta - iB_{\beta\gamma} n_\gamma) - (\alpha \leftrightarrow \beta)$$ [6]


where the magnetic field, or equivalently the chemical
potential, is given by B_α = (1/2)ε_{αβγ}B_{βγ}, χ and ρ are
the susceptibility and stiffness parameters, and V(n) is
the anisotropy potential, which can be taken as
V(n) = −(g/2)n_z². Exact SO(3) symmetry is obtained
when g = B_α = 0. g > 0 corresponds to easy-axis
anisotropy, while g < 0 corresponds to easy-plane
anisotropy. In the case of g > 0, there is a phase
transition as a function of B_z with B_x = B_y = 0. To see
this, let us expand out the first term in [6]. The
time-independent part contributes to an effective potential

$$V_{\mathrm{eff}} = V(n) - \frac{B_z^2\,\chi}{2}\,(n_x^2 + n_y^2)$$

from which we see that there is a phase transition at
B_c1 = √(g/χ). For B < B_c1, the system is in the Ising
phase, while for B > B_c1, the system is in the XY
phase. Therefore, tuning B for a fixed g > 0 leads to
the spin-flop transition. In D = 2, both the XY and
the Ising phases can have a finite-temperature phase
transition into the disordered state. However,
because of the Mermin-Wagner theorem, a
finite-temperature phase transition is forbidden at the
point B = g = 0, where the system has an enhanced
SO(3) symmetry. This SO(3) symmetric point leads
to a large regime below the mean-field transition
temperature where fluctuations dominate. This
large fluctuation regime can be identified with the
pseudogap behavior.
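
The spin-flop argument can be made concrete with a few lines of Python. This is a minimal sketch under stated assumptions: the unit vector is parametrized by a single polar angle θ (n_z = cos θ, n_x² + n_y² = sin² θ), the effective potential is taken as −(g/2) n_z² − (χB²/2)(n_x² + n_y²), and the function names are hypothetical. A brute-force scan over θ then shows the minimum jumping from the z-axis (Ising phase) to the xy-plane (XY phase) when B crosses √(g/χ).

```python
import math

def v_eff(theta, g, chi, B):
    """Effective potential for n_z = cos(theta), n_x^2 + n_y^2 = sin(theta)^2."""
    return (-0.5 * g * math.cos(theta) ** 2
            - 0.5 * chi * B ** 2 * math.sin(theta) ** 2)

def ground_state_angle(g, chi, B, steps=1000):
    """Brute-force minimizer of v_eff over theta in [0, pi/2]."""
    thetas = [0.5 * math.pi * k / steps for k in range(steps + 1)]
    return min(thetas, key=lambda th: v_eff(th, g, chi, B))

def phase(g, chi, B):
    """'Ising' if the order parameter stays along z, 'XY' if it flops."""
    return "Ising" if ground_state_angle(g, chi, B) < math.pi / 4 else "XY"

def critical_field(g, chi):
    """Field at which the two terms of v_eff balance (requires g > 0)."""
    return math.sqrt(g / chi)

# Illustrative parameters only (g = 1, chi = 2 are arbitrary choices).
for B in (0.5, 0.9):
    print(B, phase(1.0, 2.0, B), "critical field:", critical_field(1.0, 2.0))
```

Because the potential reduces to a constant minus (g − χB²) cos² θ / 2, the minimum can only sit at θ = 0 or θ = π/2. The scan is deliberately naive so that no such simplification has to be trusted.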

The pseudospin SU(2) symmetry of the negative-U
Hubbard model has another important consequence.
Away from half-filling, the η operators no longer
commute with the Hamiltonian, but they are
eigenoperators of the Hamiltonian, in the sense that

$$[H, \eta^\pm] = \mp 2\mu\,\eta^\pm$$ [7]

This means that the η operators create well-defined
collective modes with energy 2μ. Since they carry
charge ±2, they usually do not couple to any
physical probes. However, in an SC state, the SC
order parameter mixes the η operators with the
CDW operator Δ^z, via eqn [5]. From this reasoning,
a pseudo-Goldstone mode was predicted to exist in
the density response function at wave vector (π, π)
and energy 2μ, which appears only below the SC
transition temperature T_c.


Unification of Antiferromagnetism and 
Superconductivity through the SO(5) 
Theory 


Order Parameters and SO(5) Group Properties

The negative-U Hubbard model and the SO(3)
nonlinear σ-models discussed in the previous section
give a nice description of the quantum phase 
transition from the Mott insulating phase with 
CDW order to the SC phase. On the other hand, 
these simple models do not have enough complexity 
to describe the AF insulator at half-filling and the SC 
order away from half-filling. Therefore, a natural 
step is to generalize these models so that the Mott 
insulating phase with the scalar CDW order para- 
meter is replaced by a Mott insulating phase with 
the vector AF order parameter. The pseudospin 
SO(3) symmetry group considered previously arises 
from the combination of one real scalar component 
of the CDW order parameter with one complex, or 
two real components of the SC order parameter. 
After replacing the scalar CDW order parameter by 
the three components of the AF order parameter, 
and combining with the two components of the SC 
order parameters, we are naturally led to consider a 
five-component order parameter vector, and the 
SO(5) symmetry group which transforms it. 

It is simplest to define the concept of the SO(5)
symmetry generator and order parameter on two
sites with fermion operators c_σ and d_σ, respectively,
where σ = 1, 2 is the usual spin index. The AF order
parameter operator can be naturally defined in
terms of the difference between the spins of the c
and d fermions as follows:

$$N^\alpha = \frac12\left( c^\dagger \tau^\alpha c - d^\dagger \tau^\alpha d \right), \qquad n_2 \equiv N^1, \quad n_3 \equiv N^2, \quad n_4 \equiv N^3$$ [8]

where τ^α are the Pauli matrices. In view of the strong
on-site repulsion in the cuprate problem, the SC order
parameter should be naturally defined on a bond
connecting the c and d fermions, explicitly given by

$$\Delta^\dagger = -\frac{i}{2}\, c^\dagger \tau^y d^\dagger = \frac12\left( -c_\uparrow^\dagger d_\downarrow^\dagger + c_\downarrow^\dagger d_\uparrow^\dagger \right), \qquad n_1 \equiv \frac{\Delta^\dagger + \Delta}{2}, \quad n_5 \equiv \frac{\Delta^\dagger - \Delta}{2i}$$ [9]
2 21 


We can group these five components together to
form a single vector n_a = (n_1, n_2, n_3, n_4, n_5) called the
superspin, since it contains both superconducting
and antiferromagnetic spin components. The individual
components of the superspin are explicitly
defined in the last parts of eqns [8] and [9].

The concept of the superspin is only useful if there
is a natural symmetry group acting on it. In this
case, since the order parameter is 5D, it is natural to
consider the most general rotation in the 5D order
parameter space spanned by n_a. In 3D, three Euler
angles specify a general rotation. In higher dimensions,
a rotation is specified by selecting a plane and
the angle of rotation within this plane. Since there
are n(n − 1)/2 independent planes in n dimensions,
the group SO(n) is generated by n(n − 1)/2 elements,
specified in general by antisymmetric
matrices L_ab = −L_ba, with a = 1, ..., n. In particular,
the SO(5) group has ten generators. The total spin
and the total charge operator


$$S_\alpha = \frac12\left( c^\dagger \tau^\alpha c + d^\dagger \tau^\alpha d \right), \qquad Q = \frac12\left( c^\dagger c + d^\dagger d - 2 \right)$$ [10]

perform the function of rotating the AF and SC
order parameters within each subspace. In addition,
there are six so-called π operators, defined by

$$\pi_\alpha^\dagger = -\frac12\, c^\dagger \tau^\alpha \tau^y d^\dagger, \qquad \pi_\alpha = (\pi_\alpha^\dagger)^\dagger$$ [11]

which perform the rotation from AF to SC and vice
versa. These infinitesimal rotations are defined by
the commutation relations

$$[\pi_\alpha^\dagger, N_\beta] = i\delta_{\alpha\beta}\,\Delta^\dagger, \qquad [\pi_\alpha^\dagger, \Delta] = iN_\alpha$$ [12]

The total spin S_α, the total charge Q, and the six π
operators form the ten generators of the SO(5) group.

The superspin order parameter n_a, the associated
SO(5) generators L_ab, and their commutation relations
can be expressed compactly and elegantly in terms of
the SO(5) spinor and the five Dirac Γ matrices. The
four-component SO(5) spinor is defined by

$$\Psi_\mu = \begin{pmatrix} c_\sigma \\ d_\sigma^\dagger \end{pmatrix}$$ [13]

Its components satisfy the usual anticommutation relations

$$\{\Psi_\mu^\dagger, \Psi_\nu\} = \delta_{\mu\nu}, \qquad \{\Psi_\mu, \Psi_\nu\} = \{\Psi_\mu^\dagger, \Psi_\nu^\dagger\} = 0$$ [14]

Using the Ψ spinor and the five Dirac Γ matrices, we
can express n_a and L_ab as

$$n_a = \frac12\, \Psi_\mu^\dagger \Gamma^a_{\mu\nu} \Psi_\nu, \qquad L_{ab} = -\frac12\, \Psi_\mu^\dagger \Gamma^{ab}_{\mu\nu} \Psi_\nu$$ [15]


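The L_ab operators satisfy the commutation relation

$$[L_{ab}, L_{cd}] = -i\left( \delta_{ac}L_{bd} + \delta_{bd}L_{ac} - \delta_{ad}L_{bc} - \delta_{bc}L_{ad} \right)$$ [16]

The n_a and the Ψ_μ operators form the vector and the
spinor representations of the SO(5) group, satisfying
the following equations:

$$[L_{ab}, n_c] = -i\left( \delta_{ac}\,n_b - \delta_{bc}\,n_a \right)$$ [17]

and

$$[L_{ab}, \Psi_\mu] = -\frac12\, \Gamma^{ab}_{\mu\nu} \Psi_\nu$$ [18]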


If we arrange the ten operators S_α, Q, and π_α into
L_ab's by the following matrix form:

$$L_{ab} = \begin{pmatrix} 0 & & & & \\ \pi_x^\dagger + \pi_x & 0 & & & \\ \pi_y^\dagger + \pi_y & -S_z & 0 & & \\ \pi_z^\dagger + \pi_z & S_y & -S_x & 0 & \\ Q & \frac{1}{i}(\pi_x^\dagger - \pi_x) & \frac{1}{i}(\pi_y^\dagger - \pi_y) & \frac{1}{i}(\pi_z^\dagger - \pi_z) & 0 \end{pmatrix}$$ [19]

and group n_a as in eqns [8] and [9], we see that eqns
[16] and [17] compactly reproduce all the commutation
relations worked out previously. These equations
show that L_ab and n_a are the symmetry generators
and the order parameter vectors of the SO(5) theory.

Having introduced the concept of local symmetry
generators and order parameters based on sites in real
space, we now proceed to discuss definitions of
these operators in momentum space. The AF and
dSC order parameters can be naturally expressed in
terms of the microscopic fermion operators as

$$N^\alpha = \sum_p c_{p+\Pi}^\dagger \tau^\alpha c_p, \qquad \Delta^\dagger = -\frac{i}{2} \sum_p d(p)\, c_p^\dagger \tau^y c_{-p}^\dagger, \qquad d(p) \equiv \cos p_x - \cos p_y$$ [20]

where Π = (π, π) and d(p) is the form factor for
d-wave pairing in 2D. They can be combined into
the five-component superspin vector n_a by using the
same convention as before. The total spin and total
charge operators are defined microscopically as

$$S_\alpha = \frac12 \sum_p c_p^\dagger \tau^\alpha c_p, \qquad Q = \frac12 \sum_p \left( c_p^\dagger c_p - 1 \right)$$ [21]

and the π operators can be defined as

$$\pi_\alpha^\dagger = \sum_p g(p)\, c_{p+\Pi}^\dagger \tau^\alpha \tau^y c_{-p}^\dagger$$ [22]

The form factor g(p) needs to be chosen appropriately
to satisfy the SO(5) commutation relation
[16], and this requirement determines g(p) =
sgn(d(p)).

The SO(5) symmetry generators perform the most
general rotation among the five order parameters.
It is easy to see that the quantum numbers of the
π operators exactly make up the difference in
quantum numbers between the AF and the dSC
order parameters, according to Table 1.

Table 1 Quantum numbers of the AF and the dSC order
parameters, and of the π operators, which rotate the AF and the
dSC order parameters into each other

Operator              Charge   Spin   Momentum   Internal angular momentum
Δ, Δ† or n_1, n_5     ±2       0      0          d-wave
N^α or n_2,3,4        0        1      (π, π)     s-wave
π_α, π_α†             ±2       1      (π, π)     d-wave


The SO(5) Quantum Nonlinear σ-Model


In the previous section we presented the concept of
the local SO(5) order parameters and symmetry
generators. These relationships are purely kinematic,
and do not refer to any particular Hamiltonian. One
can in fact construct microscopic models with exact
SO(5) symmetry out of these operators. A large class
of models, however, may not have SO(5) symmetry
at the microscopic level, but their long-distance,
low-energy properties may be described in terms of
an effective SO(5) model. In the previous section, we
have seen that many different microscopic models
indeed all have the SO(3) nonlinear σ-model as their
universal low-energy description. Similarly, we present
the SO(5) quantum nonlinear σ-model as a
general theory of AF and dSC in the HTSC.

From eqn [17] and the discussions in the previous
subsection, we see that L_ab and n_a are conjugate
degrees of freedom, very much similar to [q, p] = iħ
in quantum mechanics. This suggests that we can
construct a Hamiltonian from these conjugate
degrees of freedom. The Hamiltonian of the SO(5)
quantum nonlinear σ-model takes the following
form:

$$H = \frac{1}{2\chi} \sum_{x,\,a<b} L_{ab}^2(x) + \frac{\rho}{2} \sum_{\langle x,x'\rangle,\,a} n_a(x)\, n_a(x') + \sum_{x,\,a<b} B_{ab}(x)\, L_{ab}(x) + \sum_x V(n(x))$$ [23]

where the n_a vector field is subjected to the
constraint

$$n_a^2 = 1$$ [24]

This Hamiltonian is quantized by the canonical
commutation relations [16] and [17]. Here, the first
term is the kinetic energy of the SO(5) rotors, where
χ has the physical interpretation of the moment of
inertia of the SO(5) rotors. The second term
describes the coupling of the SO(5) rotors on
different sites, through the generalized stiffness ρ.
The third term introduces the coupling of external
fields to the symmetry generators, while the potential V(n)
can include anisotropic terms that break the SO(5)
symmetry down to SO(3) × U(1). The SO(5)
quantum nonlinear σ-model is a natural combination
of the SO(3) nonlinear σ-model describing the
AF Heisenberg model and the quantum XY model
describing the SC-to-insulator transition. If we
restrict to the values a = 2, 3, 4, then the first two
terms describe the symmetric Heisenberg model, the
third term represents the coupling to a uniform external
magnetic field, and the last term describes easy-plane
or easy-axis anisotropy of the Néel vector.
On the other hand, for a = 1, 5, the
first term describes the Coulomb or capacitance energy,
the second term is the Josephson coupling energy,
and the third term describes the coupling to an external
chemical potential.

The first two terms of the SO(5) model describe
the competition between quantum disorder and
classical order. In the ordered state, the last two
terms describe the competition between the AF and
the SC order. Let us first consider the quantum
competition. The first term prefers sharp eigenstates
of the angular momentum. At an isolated site,
C = Σ_{a<b} L_ab² is the Casimir operator of the SO(5) group, in
the sense that it commutes with all the SO(5)
generators. The eigenvalues of this operator can be
determined completely from group theory; they are
0, 4, 6, and 10, respectively, for the 1D SO(5)
singlet, the 5D SO(5) vector, the 10D antisymmetric
tensor, and the 14D symmetric, traceless tensor. Therefore,
we see that this term always prefers a
quantum-disordered SO(5) singlet ground state,
which is a total spin singlet. This ground state is
separated from the first excited state, the fivefold
SO(5) vector state, by an energy gap of 2/χ. This
gap is reduced when the different SO(5) rotors
are coupled to each other by the second term. This
term represents the effect of stiffness, which prefers
a fixed direction of the n_a vector, rather than a fixed
angular momentum. This competition is an appropriate
generalization of the competition between the
number-sharp and phase-sharp states in a superconductor
and the competition between the classical
Néel state and the bond or plaquette singlet state in
the Heisenberg AF. The quantum phase transition
occurs near χρ ~ 1.
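
Two of the group-theoretical statements above are easy to confirm by direct computation in the five-dimensional vector representation. The sketch below is an illustration rather than part of the original argument: it assumes the explicit matrices (L_ab)_jk = i(δ_aj δ_bk − δ_ak δ_bj), a choice made here so that the generators are Hermitian, and the function names are hypothetical. It checks the SO(5) commutation relation for all index combinations and evaluates the Casimir operator, which should equal 4 times the identity on the vector representation. The isolated-rotor gap between singlet and vector is then (4 − 0)/(2χ) = 2/χ.

```python
from itertools import product

N = 5  # dimension of the superspin space

def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def lin(a, b, s):
    """Entrywise a + s*b."""
    return [[x + s * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def zero():
    return [[0j] * N for _ in range(N)]

def generator(a, b):
    """Assumed vector-representation matrix L_ab = i (E_ab - E_ba)."""
    m = zero()
    if a != b:
        m[a][b] = 1j
        m[b][a] = -1j
    return m

def delta(a, b):
    return 1 if a == b else 0

def max_commutator_violation():
    """Largest entry of [L_ab, L_cd] minus the claimed right-hand side."""
    worst = 0.0
    for a, b, c, d in product(range(N), repeat=4):
        lab, lcd = generator(a, b), generator(c, d)
        lhs = lin(matmul(lab, lcd), matmul(lcd, lab), -1)
        rhs = zero()
        terms = ((delta(a, c), (b, d)), (delta(b, d), (a, c)),
                 (-delta(a, d), (b, c)), (-delta(b, c), (a, d)))
        for coeff, (p, q) in terms:
            rhs = lin(rhs, generator(p, q), -1j * coeff)
        diff = lin(lhs, rhs, -1)
        worst = max(worst, max(abs(x) for row in diff for x in row))
    return worst

def casimir_vector():
    """Sum over a < b of L_ab squared, in the vector representation."""
    total = zero()
    for a in range(N):
        for b in range(a + 1, N):
            lab = generator(a, b)
            total = lin(total, matmul(lab, lab), 1)
    return total

def rotor_gap(chi):
    """Singlet-to-vector gap of one rotor with H = C / (2 chi)."""
    return (4 - 0) / (2 * chi)

print("max violation:", max_commutator_violation())
print("Casimir diagonal:", [casimir_vector()[i][i].real for i in range(N)])
```

The same check works for any SO(n) by changing N, in which case the vector Casimir becomes n − 1.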

In the classically ordered state, the last two
anisotropy terms compete to select a ground state.
To simplify the discussion, we can first consider the
following simple form of the static anisotropy
potential:

$$V(n) = -g\left( n_2^2 + n_3^2 + n_4^2 \right)$$ [25]

At the particle-hole symmetric point with vanishing
chemical potential B_15 = μ = 0, the AF ground state
is selected by g > 0, while the SC ground state is
selected by g < 0 coupled with the constraint n_a² = 1.
g = 0 is the quantum phase transition point separating
the two ordered phases.

However, it is unlikely that the HTSC cuprates
can be close to this quantum phase transition point.
In fact, we expect the anisotropy term g to be large
and positive, so that the AF phase is strongly
favored over the SC phase at half-filling. However,
the chemical potential term has the opposite,
competing effect favoring SC. To see this, we
transform the Hamiltonian into the Lagrangian
density (in imaginary time coordinates) in the
continuum limit:

$$L = \frac{\chi}{2}\,\omega_{ab}(x,t)^2 + \frac{\rho}{2}\,(\partial_k n_a(x,t))^2 + V(n(x,t))$$ [26]

where

$$\omega_{ab} = n_a(\partial_t n_b - iB_{bc}\,n_c) - (a \leftrightarrow b)$$ [27]

is the angular velocity. We see that the chemical
potential enters the Lagrangian as a gauge coupling
in the time direction. Expanding the time derivative
term, we obtain an effective potential

$$V_{\mathrm{eff}}(n) = V(n) - \frac{(2\mu)^2\chi}{2}\left( n_1^2 + n_5^2 \right)$$ [28]

from which we see that the V term competes with
the chemical potential term. For μ < μ_c = √(g/χ), the
AF ground state is selected, while for μ > μ_c, the SC
ground state is realized. At the transition point, even
though each term strongly breaks SO(5) symmetry,
the combined term gives an effective static potential
which is SO(5) symmetric, as we can see from [28].

Even though the static potential is SO(5) symmetric,
the full quantum dynamics is not. This can be most
easily seen from the time-dependent term in the
Lagrangian. When we expand out the square, the
term quadratic in μ enters the effective static
potential in eqn [28]. However, there is also a
time-dependent term linear in μ. This term breaks
the particle-hole symmetry, and it dominates over
the second-order time derivative term in the n_1 and
n_5 variables. In the absence of an external magnetic
field, only second-order time derivative terms of
n_2,3,4 enter the Lagrangian. Therefore, while the
chemical potential term compensates the anisotropy
potential in eqn [28] to arrive at an SO(5) symmetric
static potential, its time-dependent part breaks the
full quantum SO(5) symmetry. This observation
leads to the concept of the projected, or static
SO(5) symmetry (Zhang et al. 1999). A model with
projected or static SO(5) symmetry is described by a
quantum effective Lagrangian of the form

$$L_{\mathrm{eff}} = \frac{\chi}{2} \sum_{a=2,3,4} (\partial_t n_a)^2 - \chi\mu\left( n_1 \partial_t n_5 - n_5 \partial_t n_1 \right) - V_{\mathrm{eff}}(n)$$ [29]

where the static potential V_eff is SO(5) symmetric,
but the time-dependent part contains a first-order
time derivative term in n_1 and n_5.

The SO(5) quantum nonlinear σ-model is constructed
from two canonically conjugate field
operators L_ab and n_a. In fact, there is a kinematic
constraint among these field operators:

$$L_{ab}\,n_c + L_{bc}\,n_a + L_{ca}\,n_b = 0$$ [30]

This identity is valid for any triple a, b, and c, and
can be easily proved by expressing L_ab = n_a p_b −
n_b p_a, where p_a is the conjugate momentum of n_a.
Geometrically, this identity expresses the fact that
L_ab generates a rotation of the n_a vector. The
infinitesimal rotation vector lies on the tangent
plane of the four-sphere S⁴, and is therefore
orthogonal to the n_a vector itself.
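
The kinematic constraint can be illustrated with ordinary numbers. The sketch below treats n and p as commuting classical five-component vectors (an assumption: operator ordering is ignored), builds the rotor expression n_a p_b − n_b p_a, and evaluates the cyclic sum for every index triple. The function names and the random test vectors are hypothetical choices made for this example.

```python
import random

def rotor_generators(n, p):
    """Classical L_ab = n_a p_b - n_b p_a for commuting vectors n and p."""
    dim = len(n)
    return [[n[a] * p[b] - n[b] * p[a] for b in range(dim)] for a in range(dim)]

def max_cyclic_sum(n, p):
    """Largest |L_ab n_c + L_bc n_a + L_ca n_b| over all index triples."""
    L = rotor_generators(n, p)
    dim = len(n)
    return max(abs(L[a][b] * n[c] + L[b][c] * n[a] + L[c][a] * n[b])
               for a in range(dim) for b in range(dim) for c in range(dim))

rng = random.Random(0)                      # fixed seed for reproducibility
n_vec = [rng.uniform(-1.0, 1.0) for _ in range(5)]
norm = sum(x * x for x in n_vec) ** 0.5
n_vec = [x / norm for x in n_vec]           # enforce the unit-length constraint
p_vec = [rng.uniform(-1.0, 1.0) for _ in range(5)]

print("max cyclic sum:", max_cyclic_sum(n_vec, p_vec))
```

The result vanishes up to rounding for any choice of n and p, since the six products in the cyclic sum cancel pairwise. The unit-length normalization is not needed for the identity and is included only to match the constraint on the superspin.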

In a large class of materials, including the high-T_c
cuprates, the organic superconductors, and the
heavy-fermion compounds, the AF and SC phases occur in
close proximity to each other. The SO(5) theory is
developed based on the assumption that these two
phases share a common microscopic origin and should
be treated on an equal footing. The SO(5) theory gives
a coherent description of the rich global phase diagram
of the high-T_c cuprates and its low-energy dynamics
through a simple symmetry principle and a unified
effective model based on a single quantum Hamiltonian.
A number of theoretical predictions, including
the intensity dependence of the neutron resonance
mode, the AF vortex state, and the mixed phase of AF
and SC, have been verified experimentally (Figure 3).
The theory also sheds light on the microscopic
mechanism of superconductivity and quantitatively
correlates the AF exchange energy with the condensation
energy of superconductivity. However, the theory
is still incomplete in many ways and lacks full
quantitative predictive power. While the role of
fermions is well understood within the exact SO(5)
models, their role in the effective SO(5) models is
still not fully worked out. As a result, the theory has
not made many predictions concerning the transport
properties of these materials.

Figure 3 The finite-temperature phase diagram of the SO(5) model in the temperature (T) versus chemical potential (μ) plane. (a) and
(b) are two different representations of the same phase diagram, corresponding to a direct first-order phase transition between AF and
SC, as a function of the chemical potential and doping, respectively. (c) corresponds to two second-order phase transitions with a uniform
AF/SC mixed phase in between. The AF and the SC transition temperatures T_N and T_c merge into a bicritical point T_bc or a tetracritical point T_tc.
Both possibilities are allowed theoretically; it is up to experiments to determine which one is actually realized in the high-T_c cuprates.


See also: Abelian Higgs Vortices; Effective Field
Theories; Euclidean Field Theory; Ginzburg-Landau
Equation; Hubbard Model; Quantum Phase Transitions;
Quantum Spin Systems; Quantum Statistical Mechanics:
Overview; Renormalization: General Theory;
Renormalization: Statistical Mechanics and Condensed
Matter; Superfluids; Symmetry Classes in Random Matrix
Theory; Variational Techniques for Ginzburg-Landau


Introduction
Subject


Holomorphic dynamics (in a narrow sense) is a
theory of iterates of rational endomorphisms of the
Riemann sphere Ĉ = C ∪ {∞}. The goal is to understand
the phase portrait of this dynamical system,
that is, the structure of its trajectories, and the
dependence of the phase portrait on parameters
(coefficients of f).

Holomorphic dynamics in a broader sense would
include the theory of analytic transformations, local
and global, in dimension 1 and higher, as well as the
theory of groups and pseudogroups of analytic
transformations, which would cover the theory of
Kleinian groups and holomorphic foliations. However,
we will mostly focus on holomorphic dynamics
in the narrow sense.


Brief History 


Local dynamical theory of analytic maps was laid
down in the late nineteenth and early twentieth
century by Königs, Schröder, Böttcher, and Leau.
Global theory of iterates of rational maps was
founded by Fatou and Julia in comprehensive
mémoires of 1918-19. The theory developed very
little from then until the early 1980s,
when it exploded with new methods, ideas, and
computer images. Particularly influential were the
works of D Sullivan, who introduced ideas of
quasiconformal deformations into the field, of
A Douady and J Hubbard, who gave a comprehensive
combinatorial description of the Mandelbrot set,
and of W Thurston, who linked holomorphic
dynamics to three-dimensional hyperbolic geometry,
bringing to the field ideas of geometrization and
rigidity. As a result, profound rigidity conjectures
were formulated. Renormalization ideas introduced to
the theory later on led to significant progress
towards these conjectures (see Universality and
Renormalization).

Another source of ideas came from ergodic theory 
and the general theory of dynamical systems, 
particularly hyperbolic dynamics and thermodyna- 
mical formalism. They led to constructions of 
natural geometric measures on the Julia sets that 
helped to penetrate into their fractal nature. 


General Terminology and Notations 


N = {1, 2, ...} is the set of natural numbers; D is the
unit disk; Z_+ = N ∪ {0}; T = ∂D.

A topological disk is a simply connected domain
in C. A topological annulus is a doubly connected
domain in C (i.e., a domain homeomorphic to a
round annulus). A Cantor set is a totally
disconnected compact subset of R^n without isolated
points.

Given a map f : X → X, f^n will stand for its n-fold
iterate. The semigroup of iterates forms a dynamical
system with discrete time. An orbit or trajectory of a
point z is orb_f(z) = (f^n z), n = 0, 1, 2, ...

A subset Y ⊂ Ĉ is called invariant if f(Y) ⊂ Y and
completely invariant if also f^{-1}(Y) ⊂ Y.

A point α ∈ Ĉ is called periodic if f^p α = α for
some natural p. The smallest such p is called the
period of α. If p = 1, then α is called a fixed
point. The orbit of a periodic point is also called a
cycle.

Two maps f : X → X and g : Y → Y on topological
spaces X and Y are called topologically conjugate if
there exists a homeomorphism h : X → Y such that
h ∘ f = g ∘ h. If h has better regularity properties, for
example, it is quasiconformal/conformal/affine, then
f and g are called quasiconformally/conformally/
affinely conjugate.

Let f(z) = P(z)/Q(z) be a rational function viewed
as a map Ĉ → Ĉ. Its topological degree deg f =
#{f^{-1}z}, z ∈ Ĉ (where the preimages of z are
counted with multiplicity), is equal to the algebraic
degree max(deg P, deg Q). The dynamics of f is very
simple in degree 1, so in what follows we assume
that deg f ≥ 2.

Let C_f = {c : Df(c) = 0} stand for the set of critical
points of f, and V_f = f(C_f) be the set of critical
values. A rational function of degree d has 2d − 2
critical points counted with multiplicity. Moreover,

$$C_{f^n} = \bigcup_{k=0}^{n-1} f^{-k}(C_f), \qquad V_{f^n} = \bigcup_{k=1}^{n} f^{k}(C_f)$$

The latter formula explains why the behavior of the
critical orbits crucially influences the global dynamics
of f. The set O_f = ∪_n V_{f^n} is called postcritical.


Basic Dynamical Theory 
Local Theory 


The local theory describes the dynamics of an
analytic map f : z ↦ λz + Σ_{n≥2} a_n z^n near its fixed
point 0. The derivative λ = f′(0) is called the multiplier
of 0. The fixed point is called attracting,
repelling, or neutral, depending on whether |λ| < 1,
|λ| > 1, or |λ| = 1. It is called superattracting if
λ = 0.

In the case when 0 is an attracting (but not superattracting)
or repelling fixed point, the map is linearizable,
that is, it is conformally conjugate to its linear part
z ↦ λz; thus, there is a local conformal solution of the
Schröder equation φ(fz) = λφ(z). This solution is also
called the linearizing coordinate near 0.

In the superattracting case, the map is conformally
conjugate to the map z ↦ z^d, where a_d z^d is the
first nonvanishing term in the local expansion of f.
Thus, in this case there is a local conformal solution
of the Böttcher equation φ(fz) = φ(z)^d. It is also
called the Böttcher coordinate near 0.
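
In the attracting case the linearizing coordinate can be approximated numerically by the classical limit φ(z) = lim f^n(z)/λ^n, which converges near the fixed point when 0 < |λ| < 1. The following minimal sketch assumes the sample map f(z) = λz + z² with λ = 0.5 and a fixed iteration count of 60; these choices and the function names are illustrative only. It checks the Schröder equation φ(fz) = λφ(z) at one test point inside the disk |z| < 1 − |λ|, where every orbit converges to 0.

```python
LAM = 0.5  # multiplier of the attracting fixed point 0 (illustrative choice)

def f(z):
    """Sample map with an attracting fixed point at 0: f(z) = LAM*z + z^2."""
    return LAM * z + z * z

def schroeder_coordinate(z, n=60):
    """Approximate phi(z) = lim f^n(z) / LAM^n by stopping at a finite n."""
    for _ in range(n):
        z = f(z)
    return z / LAM ** n

z0 = 0.1 + 0.05j                     # test point with |z0| < 1 - |LAM|
lhs = schroeder_coordinate(f(z0))    # phi(f(z0))
rhs = LAM * schroeder_coordinate(z0) # LAM * phi(z0)
print("phi(z0) =", schroeder_coordinate(z0))
print("Schroeder residual:", abs(lhs - rhs))
```

The residual should be at the level of rounding error. The same limit does not apply in the superattracting case λ = 0, where the Böttcher coordinate requires taking d^n-th roots instead.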

The situation in the neutral case (when
λ = e^{2πiθ}, θ ∈ R/Z) depends in a delicate way on
the arithmetic properties of the rotation number θ. If
θ = q/p is rational, the fixed point 0 is called
parabolic. The local dynamics is then described in
terms of the Leau-Fatou flower consisting of
attracting petals alternating with repelling petals.
In each petal, the map is conformally conjugate to
the translation z ↦ z + 1. The quotients of the petals
by the dynamics are conformally equivalent to the
cylinder C/⟨z ↦ z + 1⟩. They are called (attracting/
repelling) Écalle-Voronin cylinders.

In the irrational case, when θ ∈ R\Q, the map can
be either linearizable or not. Accordingly, 0 is called a
Siegel or a Cremer fixed point. If the rotation number θ is
Diophantine (i.e., there exist C > 0 and σ > 2 such
that for all rational numbers q/p, we have
|θ − q/p| > C/p^σ), then 0 is linearizable (Siegel 1942).
Notice that almost all numbers are Diophantine.
A sharper arithmetic condition for linearizability in
terms of the continued fraction expansion of θ was
given by Bruno (1965). In the quadratic case,
z ↦ e^{2πiθ}z + z², this condition was proved to be sharp
(Yoccoz 1988).


Fatou and Julia Sets 


From now on, f : Ĉ → Ĉ is a rational endomorphism
of the Riemann sphere. The theory starts with the
splitting of the sphere into two subsets, now called
the Fatou and Julia sets, based on the notion of a normal
family in the sense of Montel. A family (φ_n : S → Ĉ)
of meromorphic functions on some Riemann surface
S is called normal if it is precompact in the
compact-open topology. The Fatou set F(f) is the maximal
open subset of Ĉ on which the family of iterates
(f^n), n = 0, 1, 2, ..., is normal. The Julia set J(f) is the
complement of the Fatou set. Both sets are completely
invariant. The Julia set is always nonempty, and is
either nowhere dense or coincides with the whole
sphere. The trajectories on the Fatou set are
Lyapunov stable (if z is close to z_0 ∈ F(f), then
orb_f(z) is uniformly close to orb_f(z_0)), while the
dynamics on the Julia set is "chaotic."

If f is a polynomial, then the Fatou and Julia sets
can be defined in a more concrete way as follows. In
this case, ∞ is a superattracting fixed point for f. Let
us consider its basin of attraction,

$$D_f(\infty) = \{ z : f^n z \to \infty \ \text{as}\ n \to \infty \}$$

Its complement, K(f), is called the filled Julia set.
Then,

$$J(f) = \partial K(f) = \partial D_f(\infty)$$
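
For quadratic polynomials z ↦ z² + c this description leads to the familiar escape-time test for membership in the filled Julia set. The sketch below is an approximation under stated assumptions: it uses the escape radius max(2, |c|), beyond which an orbit of z² + c is guaranteed to tend to ∞, and it declares a point to be in K(f) if the orbit has not escaped after a finite number of steps, which can misclassify points very close to the Julia set. The function name and the iteration budget are hypothetical choices.

```python
def in_filled_julia(z, c, max_iter=500):
    """Escape-time test for z in K(f), f(z) = z^2 + c (finite-iteration approximation)."""
    radius = max(2.0, abs(c))      # once |z| exceeds this, the orbit escapes to infinity
    for _ in range(max_iter):
        if abs(z) > radius:
            return False           # z lies in the basin of infinity
        z = z * z + c
    return True                    # no escape detected within max_iter steps

# Illustrative checks for c = -1 (the basilica) and c = 0 (closed unit disk).
print(in_filled_julia(0j, -1))       # 0 is periodic, so it stays bounded
print(in_filled_julia(2 + 0j, -1))   # 2 -> 3 -> 8 -> ... escapes
print(in_filled_julia(0.5 + 0j, 0))  # inside the unit disk
print(in_filled_julia(1.5 + 0j, 0))  # outside the unit disk
```

The escape radius follows from the estimate |z² + c| ≥ |z|² − |c| > |z|(|z| − 1) whenever |z| > max(2, |c|), so the modulus grows by a factor larger than 1 at every step.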


Periodic Points 


Let a be a periodic point of f of period p. As a fixed 
point of f^, it is subject of the local theory. Thus, it 
(and its cycle) is classified as attracting, repelling, 
etc., according to the properties of the multiplier 
à= (f^)' (a) (that can be calculated in amy local chart 
near a). : 

The basin of attraction D;(@) of an attracting 
cycle a= (faf. is the set [z:/"z —^ 0 as n — oc]. 
The immediate basin of attraction Dr(a) is the 
union of components of Dr(w) containing the points 
of a. 


Theorem 1 (Fatou-]ulia). The immediate basin of 
any attracting cycle contains a critical point. (Note 
that a superattracting cycle actually contains some 
critical point.) 


It follows that a rational function of degree d has 
at most 2d —2 attracting cycles. A polynomial of 
degree d has at most d — 1 attracting cycles in C. 

Attracting cycles belong to the Fatou set, while 
repelling cycles lie on the Julia set. Parabolic and 
Cremer points lie on the Julia set, while Siegel points 
belong to the Fatou set. The basin of attraction of a 
parabolic cycle $\boldsymbol{\alpha}$ is defined as 


$D_f(\boldsymbol{\alpha}) = \{z : f^n z \to \boldsymbol{\alpha} \text{ as } n \to \infty\} \setminus \bigcup_{n=0}^{\infty} f^{-n}(\boldsymbol{\alpha})$ 


It is the union of some components of the Fatou set. 
The union of the components of $D_f(\boldsymbol{\alpha})$ containing 
the petals of the Leau-Fatou flower is called the 
immediate basin of attraction $D_f^*(\boldsymbol{\alpha})$ of $\boldsymbol{\alpha}$. As in the 
attracting case, the immediate basin $D_f^*(\boldsymbol{\alpha})$ of a 
parabolic cycle contains a critical point of $f$. 
Components of the Fatou set containing Siegel 
periodic points are called Siegel disks. If $D$ is a Siegel 
disk of period $p$, then $f^p | D$ is conformally conjugate 
to the irrational rotation $z \mapsto e^{2\pi i\theta} z$ of the unit disk. 


Theorem 2 (Shishikura 1987). A rational function 
of degree $d$ has at most $2d - 2$ nonrepelling cycles. 


The proof of this result uses the methods of 
quasiconformal surgery. 


Examples 


For $f : z \mapsto z^d$, $d \geq 2$, the Julia set $J(f)$ is the unit circle. 
Moreover, $D_f(\infty) = \bar{\mathbb{C}} \setminus \bar{\mathbb{D}}$, while $\mathbb{D}$ is the basin of 
attraction of the superattracting fixed point 0. 

For maps $f_\varepsilon : z \mapsto z^2 + \varepsilon$ with sufficiently small 
$\varepsilon \neq 0$, the Julia set $J(f_\varepsilon)$ is a nowhere-differentiable 
Jordan curve (see Figure 1). The domain bounded 
by this curve is the basin of attraction of an 
attracting fixed point $\alpha_\varepsilon$. 

The filled Julia set of the map $f : z \mapsto z^2 - 1$, called 
the basilica, is depicted in black in Figure 2. The 
interior of the basilica is the basin of the superattracting 
cycle $\boldsymbol{\alpha} = (0, -1)$ of period 2. 


Figure 1 Nowhere-differentiable Jordan curve. 


Figure 2 Basilica. 

For the map $f : z \mapsto z^2 - 2$, the Julia set is the 
interval $[-2, 2]$. This map is affinely conjugate to the Chebyshev 
quadratic polynomial $\mathrm{Ch}_2 : z \mapsto 2z^2 - 1$. More 
generally, for a Chebyshev polynomial $\mathrm{Ch}_d$ of any 
degree $d$, the Julia set is the interval $[-1, 1]$. (By definition, the 
Chebyshev polynomial $\mathrm{Ch}_d$ is the solution of the 
functional equation $\cos dz = \mathrm{Ch}_d(\cos z)$.) 
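
The functional equation and the affine conjugacy can be checked numerically. The sketch below evaluates $\mathrm{Ch}_d$ with the standard three-term recurrence $\mathrm{Ch}_{k+1}(x) = 2x\,\mathrm{Ch}_k(x) - \mathrm{Ch}_{k-1}(x)$, which is an assumption brought in for illustration (it follows from the cosine addition formula and is not stated in the text); the function name is likewise hypothetical.

```python
import math


def chebyshev(d, x):
    """Evaluate Ch_d(x) via Ch_{k+1} = 2x*Ch_k - Ch_{k-1}, Ch_0 = 1, Ch_1 = x."""
    prev, cur = 1.0, x
    if d == 0:
        return prev
    for _ in range(d - 1):
        prev, cur = cur, 2 * x * cur - prev
    return cur


# Functional equation: cos(d z) = Ch_d(cos z), here with d = 5, z = 0.7.
lhs = math.cos(5 * 0.7)
rhs = chebyshev(5, math.cos(0.7))

# Affine conjugacy h(w) = 2w between z^2 - 2 and Ch_2: f(2w) = 2 * Ch_2(w).
w = 0.37
conj_gap = abs(((2 * w) ** 2 - 2) - 2 * chebyshev(2, w))
```

Both `abs(lhs - rhs)` and `conj_gap` should vanish up to rounding error.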

For quadratic maps $f_c : z \mapsto z^2 + c$ with $c < -2$, the 
Julia set is a Cantor set on $\mathbb{R}$. For maps $f_c$ with 
$c > 1/4$, the Julia set is a Cantor set that does not 
meet $\mathbb{R}$. For $c \in (-2, 1/4]$, the Julia set contains an 
invariant interval on $\mathbb{R}$, but is not contained in $\mathbb{R}$. 

For $f : z \mapsto z^2 + i$, the Julia set is a "dendrite" (see 
Figure 3). 

For $c \approx -0.12 + 0.74i$, the map $f_c : z \mapsto z^2 + c$ has an 
attracting cycle of period 3. Its Julia set is known as 
the Douady rabbit (Figure 4). 


No Wandering Domains Theorem and Dynamics 
on the Fatou Set 


A component $D$ of the Fatou set is called wandering 
if $f^n(D) \cap f^m(D) = \emptyset$ for all natural $n > m$. 


Figure 3 Dendrite. 


Figure 4 Douady rabbit. 


Theorem 3 (Sullivan 1982). Rational functions do 
not have wandering domains. 


This theorem is analogous to Ahlfors' theorem in 
the theory of Kleinian groups. Its proof introduced 
to holomorphic dynamics the methods of quasiconformal 
deformations that have become the basic tool 
of the subject. 

The "no wandering domains theorem" has completed 
the picture of dynamics on the Fatou set. 
Namely, for any $z \in F(f)$, one of the following three 
events may happen: 


* $z$ belongs to the basin of attraction of some 
attracting cycle; 

* $z$ belongs to the basin of attraction of some 
parabolic cycle; and 

* for some $n$, $f^n z$ belongs to a rotation domain. 


Here a rotation domain is either a Siegel disk, or a 
Herman ring, that is, a topological annulus $A$ such 
that $f^p(A) = A$ for some $p \in \mathbb{N}$ and $f^p | A$ is conformally 
equivalent to an irrational rotation $z \mapsto e^{2\pi i\theta} z$ 
of a round annulus $\{z : 1 < |z| < R\}$. Note that 
Herman rings cannot occur for polynomial maps. 


More Properties of the Julia Set 


There are two more useful characterizations of the 
Julia set: 


* If $z$ is not an attracting periodic point and does 
not belong to a rotation domain, then the set of 
accumulation points of the full preimages $f^{-n} z$ is 
equal to $J(f)$. 

* The Julia set is the closure of the set of repelling 
periodic points. 
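
The first characterization underlies the classical "inverse iteration" way of drawing a Julia set: follow preimages of a starting point backwards. The following is a minimal sketch for $f(z) = z^2 + c$, where the two preimages of $z$ are $\pm\sqrt{z - c}$; the function name, the random choice of branch, and the default parameters are illustrative assumptions. The starting point must not be an attracting periodic point (for $c = 0$, do not start at 0).

```python
import cmath
import random


def inverse_iterate(c, z0=2.0, steps=40, seed=0):
    """Follow one random backward orbit of f(z) = z*z + c starting from z0.

    After enough steps the returned point lies close to the Julia set.
    """
    rng = random.Random(seed)     # fixed seed keeps the sketch reproducible
    z = complex(z0)
    for _ in range(steps):
        w = cmath.sqrt(z - c)     # one of the two preimages of z
        z = w if rng.random() < 0.5 else -w
    return z
```

For $c = 0$ the Julia set is the unit circle, so `abs(inverse_iterate(0))` should be very close to 1.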


In the polynomial case, the Julia set $J(f)$ (and the 
filled Julia set $K(f)$) is connected if and only if the 
critical points do not escape to $\infty$ (in other words, 
$C_f \subset K(f)$). In the quadratic case, the Basic Dichotomy 
holds: the Julia set (and the filled Julia set) is 
either connected or a Cantor set. 


Bóttcher Coordinate 


Let $f = z^d + a_1 z^{d-1} + \cdots + a_d$ be a monic polynomial 
of degree $d \geq 2$. Then $\infty$ is a superattracting 
fixed point, and hence there is a univalent function 
$B(z) = B_f(z)$ near $\infty$ satisfying the Böttcher equation 
$B(fz) = B(z)^d$ (the Böttcher coordinate near $\infty$). 
Moreover, $B(z) \sim z$ as $z \to \infty$ since $f$ is monic. 

If $J(f)$ is connected, $B(z)$ can be analytically 
extended to the whole basin of $\infty$, and provides us 
with the Riemann map $\mathbb{C} \setminus K(f) \to \mathbb{C} \setminus \bar{\mathbb{D}}$. Otherwise, 
$B(z)$ extends to a conformal map from some 
invariant domain $\Omega_f$ whose boundary contains a 
critical point onto $\mathbb{C} \setminus \bar{\mathbb{D}}_R$, where $R = R_f > 1$. 

The $B$-preimage of a straight ray $\{re^{2\pi i\theta} : 0 < r < \infty\}$ 
is called the external ray $\mathcal{R}_\theta$ of angle $\theta$. The $B$-preimage 
of a round circle $\{re^{2\pi i\theta} : 0 \leq \theta < 1\}$ is called the 
equipotential $\mathcal{E}_t$ of level $t = \log r$. External rays and 
equipotentials form two orthogonal $f$-invariant foliations. 
We let $\mathcal{R}_\theta(t) = \mathcal{R}_\theta \cap \mathcal{E}_t$. 


Combinatorial Equivalence 


Assume now that $J(f)$ is connected. One says that an 
external ray $\mathcal{R}_\theta$ lands at some point $z \in J(f)$ if 
$\mathcal{R}_\theta(t) \to z$ as $t \to 0$. Any external ray of rational 
angle $\theta = q/p$ with odd $p$ lands at some repelling or 
parabolic periodic point of period dividing $p$ 
(Douady and Hubbard 1982). Vice versa, any 
repelling or parabolic point is a landing point of at 
least one rational ray as above (Douady 1990s). 

Let us consider the following equivalence relation 
on the set of rational numbers with odd denominators: 
two such numbers $\theta$ and $\theta'$ are equivalent if 
the corresponding rays $\mathcal{R}_\theta$ and $\mathcal{R}_{\theta'}$ land at the same 
point $z \in J(f)$. Two polynomials $f$ and $\tilde{f}$ with 
connected Julia set are called combinatorially 
equivalent if the corresponding equivalence relations 
coincide. Notice that topologically equivalent polynomials 
are combinatorially equivalent. 


Parameter Phenomena 
Spaces of Rational Functions 


Let $\mathrm{Rat}_d$ stand for the space of rational functions 
of degree $d$. As an open subset of the complex 
projective space $\mathbb{CP}^{2d+1}$, it is endowed with the 
natural topology and complex structure. 


Hyperbolic Maps and Fatou's Conjecture 


Hyperbolic maps form an important and best- 
understood class of rational maps (compare with 
Hyperbolic Dynamical Systems). A rational map f is 
called hyperbolic if one of the following equivalent 
conditions holds: 


* All critical points of $f$ converge to attracting 
cycles; 
* The map is expanding on the Julia set: 


$|Df^n(z)| \geq C\lambda^n, \quad z \in J(f)$ 


where $C > 0$, $\lambda > 1$. 


For instance, the maps $z \mapsto z^2 + \varepsilon$ for small 
$\varepsilon$, $z \mapsto z^2 - 1$, and $z \mapsto z^2 + c$ for $c \in \mathbb{R} \setminus [-2, 1/4]$ 
are hyperbolic. It is easy to see from the first 
definition that hyperbolicity is a stable property, 
that is, the set of hyperbolic maps is open in the 
space $\mathrm{Rat}_d$ of rational maps of degree $d$. One of the 
central open problems in holomorphic dynamics is 
to prove that this set is also dense. This problem is 
known as Fatou’s conjecture. 


Postcritically Finite Maps and Thurston's Theory 


A rational map is called postcritically finite if the 
orbits of all critical points are finite. In this case, any 
critical point c is either a superattracting periodic 
point, or a repelling preperiodic point (i.e., $f^n c$ is a 
repelling periodic point for some $n$). If all critical 
points of $f$ are preperiodic, then $J(f) = \bar{\mathbb{C}}$. 

Important examples of postcritically finite maps with 
$J(f) = \bar{\mathbb{C}}$ come from the theory of elliptic functions. 
Namely, let $\wp_\tau : \mathbb{C}/\Gamma_\tau \to \bar{\mathbb{C}}$ be the Weierstrass 
$\wp$-function, where $\Gamma_\tau$ is the lattice in $\mathbb{C}$ generated by 
1 and $\tau$, $\mathrm{Im}\, \tau > 0$. It satisfies the functional equation 
$\wp_\tau(nz) = f_{\tau,n}(\wp_\tau(z))$, where $f_{\tau,n}$ is a rational function. 
These functions, called Lattès examples, possess the 
desired properties. (For some special lattices, $n$ can be 
selected complex: the corresponding maps are also 
called Lattès.) 

More generally, one can consider postcritically 
finite topological branched coverings $f : S^2 \to S^2$. Two 
such maps, $f$ and $g$, are called Thurston combinatorially 
equivalent if there exist homeomorphisms 
$h, h' : (S^2, O_f) \to (S^2, O_g)$ homotopic rel $O_f$ (and 
hence coinciding on $O_f$) such that $h' \circ f = g \circ h$. 

A combinatorial class is called realizable if it 
contains a rational function. Thurston (1982) gave a 
combinatorial criterion for a combinatorial class to 
be realizable. If it is realizable, then the realization is 
unique, except for Lattès examples (Thurston's 
Rigidity Theorem). 


Structural Stability and Holomorphic Motions 


A map $f \in \mathrm{Rat}_d$ is called $J$-stable if for any maps 
$g \in \mathrm{Rat}_d$ sufficiently close to $f$, the maps $f | J(f)$ and 
$g | J(g)$ are topologically conjugate, and moreover, 
the conjugacy $h_g : J(f) \to J(g)$ is close to id. Thus, the 
Julia set $J(f)$ moves continuously over the set of 
$J$-stable maps. The following result proves a weak 
version of Fatou's conjecture: 


Theorem 4 (Lyubich and Mañé-Sad-Sullivan 
1983). The set of $J$-stable maps is open and dense 
in $\mathrm{Rat}_d$. Moreover, the set of unstable maps is the 
closure of maps that have a parabolic periodic point. 


A map $f \in \mathrm{Rat}_d$ is called structurally stable if for 
any maps $g \in \mathrm{Rat}_d$ sufficiently close to $f$, the maps $f$ 
and $g$ are topologically conjugate on the whole 
sphere, and moreover, the conjugacy $h_g : \bar{\mathbb{C}} \to \bar{\mathbb{C}}$ is 
close to id. The set of structurally stable maps is also 
open and dense in $\mathrm{Rat}_d$ (Mañé-Sad-Sullivan). 

The proofs make use of the theory of holomorphic 
motions, developed for this purpose but having a much 
broader range of applications in dynamics and 
analysis. Let $X$ be a subset of $\bar{\mathbb{C}}$, and let $h_\lambda : X \to \bar{\mathbb{C}}$ 
be a family of injections depending on a parameter 
$\lambda \in \Lambda$ in some complex manifold with a marked 
point $\lambda_0$. Assume that $h_{\lambda_0} = \mathrm{id}$ and that the functions 
$\lambda \mapsto h_\lambda(z)$ are holomorphic in $\lambda$ for any $z \in X$. Such 
a family of injections is called a holomorphic 
motion. 

A holomorphic motion of any set $X$ over $\Lambda$ 
extends to a holomorphic motion of the whole 
sphere $\bar{\mathbb{C}}$ over some smaller manifold $\Lambda' \subset \Lambda$ (Bers-Royden, 
Sullivan-Thurston 1986). If $h_\lambda$ is a holomorphic 
motion of an open subset of the sphere, 
then the maps $h_\lambda$ are quasiconformal (Mañé-Sad-Sullivan). 
These statements are usually referred to as 
the $\lambda$-lemma. 

If $\Lambda = \mathbb{D}$, then the holomorphic motion of a set 
$X \subset \bar{\mathbb{C}}$ extends to a holomorphic motion of $\bar{\mathbb{C}}$ over 
the whole disk $\mathbb{D}$ (Slodkowski 1991). 


Fundamental Conjectures 


The above rigidity and stability results led to the 
following profound conjectures: 


QC Rigidity Conjecture If two rational maps are 
topologically conjugate, then they are quasiconfor- 
mally conjugate. 


Let us consider the real projective tangent bundle 
PT over $\bar{\mathbb{C}}$, with a natural action of the map $f$. 
A measurable invariant line field on the Julia set is 
an invariant measurable section $X \to PT$ over an 
invariant set $X \subset J(f)$ of positive Lebesgue measure. 
In other words, it is a family of tangent lines $L_z \subset T_z \bar{\mathbb{C}}$, 
$z \in X$, such that $Df(L_z) = L_{fz}$. Note that such a 
field can exist only if $J(f)$ has positive Lebesgue 
measure. 


No Invariant Line Fields Conjecture Let us consider 
two rational maps, $f$ and $\tilde{f}$, that are not Lattès 
examples. If they are quasiconformally conjugate 
and the conjugacy is conformal on the Fatou set, 
then they are conjugate by a Möbius transformation. 
Equivalently, if $f$ is not Lattès, then there are no 
measurable invariant line fields on $J(f)$. 


This conjecture would imply Fatou’s conjecture. 


Mandelbrot Set 


Let us consider the quadratic family $f_c : z \mapsto z^2 + c$. 
(Note that any quadratic polynomial is affinely 
conjugate to a unique map $f_c$.) The Mandelbrot set 
classifies parameters $c$ according to the Basic 
Dichotomy of the subsection "More properties of 
the Julia set": 


$M = \{c : J(f_c) \text{ is connected}\} = \{c : f_c^n(0) \not\to \infty\}$ 


Note that $\phi_n(c) = f_c^n(0)$ is a polynomial in $c$ of 
degree $2^{n-1}$, and these polynomials satisfy the 
recursive relation $\phi_{n+1} = \phi_n^2 + c$. Moreover, $M = 
\{c : |\phi_n(c)| \leq 2, n \in \mathbb{Z}_+\}$, which gives an easy way to 
make a computer image of $M$ (see Figure 5). 
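
The recursion $\phi_{n+1} = \phi_n^2 + c$ together with the bound $|\phi_n(c)| \leq 2$ translates directly into the usual escape-time test. The sketch below is one possible rendering; the function names, the iteration cap, and the sampled window of the $c$-plane are illustrative assumptions, and a finite cap only approximates $M$ near its boundary.

```python
def in_mandelbrot(c, max_iter=500):
    """Approximate test for c in M: iterate phi -> phi^2 + c from phi = 0."""
    phi = 0j
    for _ in range(max_iter):
        phi = phi * phi + c
        if abs(phi) > 2.0:
            return False          # |phi_n(c)| > 2 for some n: c is outside M
    return True                   # bound held for max_iter steps


def ascii_mandelbrot(width=60, height=24, max_iter=100):
    """Return rows of text: '#' where the test passes, '.' elsewhere."""
    rows = []
    for j in range(height):
        y = 1.2 - 2.4 * j / (height - 1)
        row = ""
        for i in range(width):
            x = -2.1 + 2.8 * i / (width - 1)
            row += "#" if in_mandelbrot(complex(x, y), max_iter) else "."
        rows.append(row)
    return rows
```

Calling `print("\n".join(ascii_mandelbrot()))` gives a coarse text picture of $M$.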

A distinguished curve seen in the picture of $M$ is 
the main cardioid $C = \{c = e^{2\pi i\theta}/2 - e^{4\pi i\theta}/4\}$, $\theta \in \mathbb{R}/\mathbb{Z}$. 
For such a $c = c(\theta) \in C$, the map $f_c$ has a neutral 
fixed point $\alpha_c$ with rotation number $\theta$. For $c$ inside 
the domain $H_0$ bounded by $C$, $f_c$ has an attracting 
fixed point $\alpha_c$, and the Julia set $J(f_c)$ is a Jordan 
curve (see Figure 1). 
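
The parametrization of the main cardioid can be recovered in two lines from the fixed-point equation and the multiplier. The following worked computation is a sketch; writing $\lambda$ for the multiplier of the fixed point $\alpha$ of $f_c(z) = z^2 + c$ is a notational assumption made here.

```latex
\begin{align*}
f_c(\alpha) = \alpha &\iff c = \alpha - \alpha^2, \\
\lambda = f_c'(\alpha) = 2\alpha &\iff \alpha = \lambda/2, \\
\text{hence}\quad c &= \frac{\lambda}{2} - \frac{\lambda^2}{4},
\qquad
\lambda = e^{2\pi i\theta} \;\Longrightarrow\;
c(\theta) = \frac{e^{2\pi i\theta}}{2} - \frac{e^{4\pi i\theta}}{4}.
\end{align*}
```

Taking $|\lambda| < 1$ in the same formula sweeps out the parameters with an attracting fixed point, and $\theta = 0$ gives the cusp $c = 1/4$.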


Figure 5 The Mandelbrot set. 

At the cusp $c = 1/4 = c(0)$ of the main cardioid, 
the map $f_c$ has a parabolic point with multiplier 1. 
This point is also called the root of $C$. Other 
parabolic points $c = c(q/p)$ on $C$ are bifurcation 
points: if one crosses $C$ transversally at $c$, then the 
fixed point $\alpha_c$ "gives birth" to an attracting cycle of 
period $p$. This cycle preserves its "attractiveness" 
within some component $H_{q/p}$ of int $M$ attached to $C$. 

On the boundary of $H_{q/p}$, the above attracting 
cycle becomes neutral, and similar bifurcations 
happen as one crosses this boundary transversally, 
etc. In this way we obtain cascades of bifurcations 
and associated necklaces of components of int M. 
The most famous one is the cascade of doubling 
bifurcations that occur along the real slice of M. 

Components of int $M$ that occur in these bifurcation 
cascades give examples of hyperbolic components of 
int $M$. More generally, a component $H$ of int $M$ is 
called hyperbolic of period $p$ if the maps $f_c$, $c \in H$, 
have an attracting cycle of period $p$. Many other 
hyperbolic components become visible if one begins 
to zoom in on the Mandelbrot set. Some of them 
are satellite, that is, they are born as above by 
bifurcation from other hyperbolic components. 
Others are primitive. They can be easily distinguished 
geometrically: primitive components have a 
cusp at their root, while satellite components are 
bounded by smooth curves. 

Given a hyperbolic component $H$, let us consider the 
multiplier $\lambda(c)$, $c \in H$, of the corresponding attracting 
cycle, as a function of $c \in H$. The function $\lambda$ univalently 
maps $H$ onto the unit disk $\mathbb{D}$ (Douady and Hubbard 
1982). Thus, there is a single parameter $c_0 \in H$ for 
which $\lambda(c_0) = 0$, so that $f_{c_0}$ has a superattracting cycle. 
This parameter is called the center of $H$. 

Nonhyperbolic components of int M are called 
queer. Conjecturally, there are no queer compo- 
nents. This conjecture is equivalent to Fatou’s 
conjecture for the quadratic family. 

The boundary of $M$ coincides with the set of 
$J$-unstable quadratic maps (see the subsection 
"Structural stability and holomorphic motions"). 


Connectivity and Local Connectivity 


Theorem 5 (Douady and Hubbard 1982). The 
Mandelbrot set is connected. 


The proof provides an explicit uniformization 
$R_M : \mathbb{C} \setminus M \to \mathbb{C} \setminus \bar{\mathbb{D}}$. Namely, let $B_c : \Omega_c \to \mathbb{C} \setminus \bar{\mathbb{D}}_{R_c}$, 
$c \in \mathbb{C} \setminus M$, be the Böttcher coordinate near $\infty$. Then 
$R_M(c) = B_c(c)$. This remarkable formula explains the 
phase-parameter similarity between the Mandelbrot 
set near a parameter $c \in M$ and the corresponding 
Julia set $J(f_c)$ near the critical value $c$. 


The following is the most prominent open 
problem in holomorphic dynamics: 


MLC Conjecture The Mandelbrot set is locally 
connected. 


If this is the case, then the inverse map $R_M^{-1}$ 
extends to the unit circle $\mathbb{T}$, and the Mandelbrot 
set can be represented as the quotient of $\mathbb{T}$ modulo 
a certain equivalence relation that can be explicitly 
described. Thus, we would have an explicit topological 
model for the Mandelbrot set (Douady and 
Hubbard, Thurston). 

The MLC conjecture is equivalent to the follow- 
ing conjecture: 


Combinatorial Rigidity Conjecture If two quadratic 
maps $f_c$ and $f_{c'}$ with all periodic points repelling are 
combinatorially equivalent, then $c = c'$. 


In turn, this conjecture would imply, in the 
quadratic case, the above fundamental conjectures. 
For progress towards the MLC conjecture, see 
Universality in Mathematical Physics. 


Parabolic Implosion 


Parabolic maps $f_{c_0} : z \mapsto z^2 + c_0$ are unstable in a 
dramatic way. In particular, the Julia set $J(f_c)$ does 
not depend continuously on $c$ near $c_0$. Instead, $J(f_c)$ 
tends to fill in a good part of int $K(f_{c_0})$. This 
phenomenon, called parabolic implosion, has been 
explored by Douady, Lavaurs, Shishikura, and many 
others. 


Geometric Aspects 
Area 


One of the basic problems in holomorphic dynamics 
is whether a Julia set that does not coincide with C 
can have positive area. It would give an example of 
“observable chaos” that occurs on a topologically 
small set. It is also related to the No Invariant Line 
Fields Conjecture. 

Maps with strong hyperbolic properties have zero 
area Julia set. A rational map $f$ is called Collet-Eckmann 
if there exist constants $C > 0$ and $\lambda > 1$ 
such that 


$|Df^n(fc)| \geq C\lambda^n, \quad n \in \mathbb{N}$ 


for all critical points $c$. If $f$ is a Collet-Eckmann map 
with $J(f) \neq \bar{\mathbb{C}}$, then area $J(f) = 0$ (Przytycki and 
Rohde 1998) (see Universality and Renormalization 
for more examples). On the other hand, A. Douady 
has set up a compelling program of constructing a 
Cremer quadratic polynomial $f : z \mapsto e^{2\pi i\theta} z + z^2$ 
whose Julia set would have positive area. Buff and 
Cheritat have recently announced that they have 
completed the program, thus constructing the first 
example of a Julia set of positive area. (It makes use 
of a renormalization theorem for parabolic implosion 
recently announced by Shishikura.) 

In the parameter plane, it would be interesting to 
know whether the boundary of the Mandelbrot set 
has zero area. 


Hausdorff Dimension 


Hausdorff dimension (HD) gives us a further 
refinement of fractal sets of zero area. Any Julia 
set has positive HD. If $f$ is a polynomial with 
connected Julia set, then $\mathrm{HD}(J(f)) > 1$ unless $f$ is 
affinely conjugate to $z \mapsto z^d$ or a Chebyshev polynomial 
(Zdunik 1990). If $f$ is a Collet-Eckmann map 
with $J(f) \neq \bar{\mathbb{C}}$, then $\mathrm{HD}(J(f)) < 2$ (Przytycki-Rohde 
1998). On the other hand, in the quadratic case 
$f_c : z \mapsto z^2 + c$, $\mathrm{HD}(J(f_c)) = 2$ for a generic parameter 
$c \in \partial M$. The corresponding parameter result is that 
$\mathrm{HD}(\partial M) = 2$ (Shishikura 1998). It is based on the 
parabolic implosion phenomenon. 


Conformal Measure 


Let $\delta > 0$. A Borel measure $\mu$ on $\bar{\mathbb{C}}$ is called 
$\delta$-conformal if 


$\mu(fX) = \int_X |Df|^\delta \, d\mu$ 


for any measurable set $X$ such that $f | X$ is injective. 


Theorem 6 (Sullivan 1983). Any rational map $f$ 
has a $\delta$-conformal measure with $\delta \in (0, 2]$ supported 
on $J(f)$. 


This is a dynamical measure that captures well 
geometric properties of $J(f)$. For instance, for Collet-Eckmann 
maps, $\delta = \mathrm{HD}(J(f))$, and $\mu$ is equivalent to 
the Hausdorff measure on $J(f)$ in dimension $\delta$. 

The hyperbolic dimension, $\mathrm{HD}_{\mathrm{hyp}}$, of $J(f)$ is the 
supremum of $\mathrm{HD}(X)$ over all compact invariant 
hyperbolic subsets of $J(f)$. Denker and Urbanski 
(1991) proved that $\mathrm{HD}_{\mathrm{hyp}}(J(f))$ is equal to the 
smallest exponent $\delta$ of all $\delta$-conformal measures 
supported on $J(f)$ (see Universality and 
Renormalization). 


Measure of Maximal Entropy 


An $f$-invariant measure $\mu$ is called balanced if 
$\mu(fX) = d \cdot \mu(X)$ for any measurable set $X$ such that 
$f | X$ is injective (where $d = \deg f$). 


Theorem 7 (Brolin 1965, Lyubich 1982). Any 
rational map $f$ has a unique balanced measure $\mu$. 
Moreover, preimages of any point $z$ except at most 
two are equidistributed with respect to $\mu$ (meaning 
that the probability measures $\mu_n$ that assign mass $1/d^n$ 
to every $f^n$-preimage of $z$ converge weakly to $\mu$ as 
$n \to \infty$). 


For polynomials, the balanced measure coincides 
with the harmonic measure on $J(f)$ (Brolin). (The 
latter is the charge distribution on the conductor $J(f)$ 
generated by the unit charge placed at $\infty$.) In 
general, the balanced measure is the unique measure 
of maximal entropy of $f$, and moreover, periodic 
points are equidistributed with respect to $\mu$ 
(Lyubich). 

The measure of maximal entropy is supported on a 
relatively small measurable set: its HD is strictly less 
than $\mathrm{HD}(J(f))$, unless $f$ is conformally equivalent to 
$z \mapsto z^d$, a Chebyshev polynomial, or a Lattès example 
(Zdunik 1990). In the polynomial case, it is 
supported on a set of HD at most 1 (Manning 1984). 

In complex analysis, there has been an extensive 
study of fractal properties of harmonic measures, 
providing insights into the balanced measure $\mu$ and the 
other way around (Carleson, Makarov, Jones, 
Binder, Smirnov, ...). 


See also: Fractal Dimensions in Dynamics; Geometric 
Analysis and General Relativity; Geometric Flows and 
the Penrose Inequality; Geometric Phases; Polygonal 
Billiards; Renormalization: General Theory; 
Renormalization: Statistical Mechanics and Condensed 
Matter; Universality and Renormalization. 

Introduction 


The term, holonomic field, was coined by Sato, 
Miwa, and Jimbo (SMJ) in 1978 and the subject was 
investigated by them in a series of five long papers 
and many shorter notes in the period from 1978-81 
(Sato et al. 1979a, 1979b, 1979c, 1980, Tracy 
and Widom 1994). The term refers to a special class of 
two-dimensional interacting quantum field theories 
whose n point correlations can be expressed in terms 
of the solution to a holonomic system of differential 
equations. A holonomic system is an overdetermined 
system of differential equations with only a finite- 
dimensional family of solutions. There is a sense in 
which these interacting systems with infinitely many 
degrees of freedom have a finite-dimensional substrate 
(at the level of n point functions for fixed n). After 
developing their theory, SMJ realized that such 
quantum fields made an earlier appearance in work 
of Thirring and Federbush. The models considered by 
Thirring and Federbush are self-interacting fermionic 
systems whose nonlinear classical field equations have 
solutions that are an explicit nonlinear transformation 
of solutions to the free field equations. This inspired 
the idea of trying to study these models by "quantizing" 
the nonlinear transformation. Expressions were 
obtained for the correlations and S-matrix but the 
connection with deformation theory was not understood 
until the SMJ work. 

In what follows we will sketch the SMJ theory and 
discuss some of its offshoots. There is one circumstance 
that it might help the reader to be aware of 
even though it will be mostly glossed over. Quantum 
fields in one space and one time dimension have 
correlations which transform under the symmetries of 
spacetime with metric signature (1,1). Since the work 
of Osterwalder and Schrader, it is customary to pass 
back and forth between this Minkowski regime and 
the Schwinger functions obtained by analytically 
continuing the n point functions to pure imaginary 
values for the time variable where they possess the 
rotational symmetries associated with a positive- 
definite metric. The Ising model, which we take up 
next, is naturally considered in the Euclidean domain 
where the correlations have an interpretation in 
statistical mechanics as the expected value of a 
product of random variables. Ultimately, the SMJ 
deformation analysis is done in the Euclidean 
domain. 


The Two-Dimensional Ising Model 


The SMJ theory was inspired by, and provides an 
attractive setting for, an earlier result of Wu, 
McCoy, Tracy, and Baruch (WMTB), concerning 
the spin-spin scaling functions of the two-dimen- 
sional Ising model (Wu et al. 1976). Since the Ising 
model is the example with the most direct signifi- 
cance for physics, we will take some time to explain 
the WMTB result and to sketch the way in which it 
fits into the SMJ theory. 

The Ising model is a statistical model of magnetism 
on a lattice that incorporates ferromagnetic 
interactions of nearest-neighbor spins. In the 1920s, 
Ising solved the model for the one-dimensional 
lattice and showed that there was no phase transition 
in the infinite volume limit. Interest in the two- 
dimensional model intensified dramatically following 
Onsager's calculation of the specific heat in the 
infinite volume limit (see Palmer and Tracy (1981) 
and references within). His formula for the specific 
heat was the first instance of a thermodynamic 
quantity in a nearest-neighbor model which exhibits 
the sort of discontinuity in temperature dependence 
expected at a phase transition. For many years, the 
Ising model served as a testbed for the now accepted 
notion that the infinite volume limit of Gibbsian 
statistical mechanics provides a suitable setting for 
the study of phase transitions. 

A configuration for the Ising model on a finite 
subset, $\Lambda$, of the integer lattice, $\mathbb{Z}^2$, is a map $\sigma$: 
$\Lambda \to \{+1, -1\}$, which assigns to each site on the 
lattice either an up spin (+1) or a down spin (-1). 
The energy function of the Ising model, $E_\Lambda(\sigma)$, is 
defined by 


$E_\Lambda(\sigma) = -J \sum_{\langle i,j \rangle \subset \Lambda} \sigma(i)\sigma(j)$ 


Here $J > 0$, and the energy of a spin configuration $\sigma$ is a sum over 
pairs of nearest-neighbor sites $i, j$ in $\Lambda$ (boundary 
terms require special consideration). This energy 
function tends to favor spin configurations, $\sigma$, in 
which the nearest-neighbor spins are aligned in the 
sense that the Boltzmann weight, $e^{-E_\Lambda(\sigma)/kT}$, is larger 
for such configurations. In the Gibbs ensemble, 
which is expected to describe systems in equilibrium 
at temperature $T$, the configuration $\sigma$ occurs with a 
probability proportional to the Boltzmann weight. 
The factor k which appears is a conversion factor 
between thermal and kinetic energy called the 
Boltzmann constant. It is clear from the formula 
for the Boltzmann weight that small temperatures 
(near 0) tend to accentuate the difference in 


statistical weights assigned to configurations with 
different energies, and large temperatures tend to 
wash out the difference in statistical weights 
associated to configurations with different energies. 
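
To make the energy function and the Boltzmann weight concrete, here is a minimal sketch on a small rectangular block of spins. Free boundary conditions (no bonds leaving the block), the nested-list representation, and the function names are assumptions made for this illustration only.

```python
import math


def ising_energy(spins, J=1.0):
    """E(sigma) = -J * sum of sigma(i)*sigma(j) over nearest-neighbor pairs.

    `spins` is a rectangular list of lists with entries +1 or -1;
    each bond inside the block is counted once (free boundary).
    """
    rows, cols = len(spins), len(spins[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            if r + 1 < rows:
                total += spins[r][c] * spins[r + 1][c]   # vertical bond
            if c + 1 < cols:
                total += spins[r][c] * spins[r][c + 1]   # horizontal bond
    return -J * total


def boltzmann_weight(spins, kT, J=1.0):
    """Unnormalized Gibbs weight exp(-E(sigma) / kT)."""
    return math.exp(-ising_energy(spins, J) / kT)


aligned = [[1, 1], [1, 1]]
checker = [[1, -1], [-1, 1]]
```

Comparing `boltzmann_weight(aligned, kT)` with `boltzmann_weight(checker, kT)` for small and large `kT` shows the two temperature regimes described above: the ratio is large at low temperature and approaches 1 at high temperature.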

Remarkably, there is a sharp critical temperature 
$0 < T_c < \infty$ so that for $T < T_c$ the propensity for 
order built into the energy triumphs in the infinite 
volume limit $\Lambda \uparrow \mathbb{Z}^2$, and for $T > T_c$ the randomness 
or disorder associated with high temperatures 
governs the infinite volume behavior. More specifically, 
if $T < T_c$ and the infinite volume limit is taken 
with plus spins assigned to the boundary of $\Lambda$, the 
system exhibits a residual magnetism (there is a 
positive expected value, $\langle\sigma\rangle$, for the spin per site). 
This infinite volume plus state is the quintessential 
example of symmetry breaking: the spin flip 
symmetry possessed by the bulk energy is broken 
below $T_c$ in the thermodynamic limit. For $T > T_c$, 
the spin per site is 0 no matter what boundary 
conditions are imposed on the infinite volume limit. 

Pure equilibrium states both above and below $T_c$ 
exhibit clustering in the thermodynamic limit 
(uniqueness for the ground state in field theory). 
This is the tendency of spin variables $\sigma(a)$ and $\sigma(b)$ 
at sites $a, b \in \mathbb{Z}^2$ to become statistically independent 
as the distance $|a - b|$ tends to $\infty$. In such a pure 
state the two-point function, which is the expected 
value of the product of spin variables, $\langle\sigma(a)\sigma(b)\rangle$, 
will tend to the square $\langle\sigma\rangle^2$ both below ($\langle\sigma\rangle \neq 0$) 
and above ($\langle\sigma\rangle = 0$) the critical temperature $T_c$ as 
$|a - b| \to \infty$. To leading order, this clustering takes 
place at an exponential rate, $e^{-|a-b|/\xi(T)}$, for a 
function $\xi(T)$ called the correlation length. The 
correlation length $\xi(T) \to \infty$ as $T \to T_c$. The scaling 
limit (from below $T_c$) of the spin-spin correlation is 
the leading-order correction to the clustering behavior 
of the correlations when these correlations are 
examined at the scale of the correlation length. It is 
the limit 


$\langle\sigma(a)\sigma(b)\rangle = \lim_{T \uparrow T_c} \langle\sigma\rangle_T^{-2} \langle\sigma(\xi(T)a)\sigma(\xi(T)b)\rangle_T$ 


where the correlations on the right-hand side are 
thermodynamic correlations on the lattice at temperature 
$T$. Since $\langle\sigma\rangle_T$ tends to 0 as $T \to T_c$, the 
normalization by $\langle\sigma\rangle_T^{-2}$ on the right produces an 
"infinite wave function renormalization" in the limit. 

Equivalently, one may think of this continuum 
limit being achieved by letting the lattice spacing 
shrink to 0 as $T$ approaches $T_c$ so that the 
correlation length stays fixed on the new scale. The 
scaling limit from above $T_c$ turns out to be different 
from the scaling limit from below $T_c$ and since 
$\langle\sigma\rangle_T = 0$ for $T > T_c$, it is defined by a different wave 
function renormalization. The resulting asymptotics 
are expected to capture quite a lot about what is 
interesting in the behavior of the correlations near 
the phase transition. In the late 1970s, the scaling 
behavior in this model was also a prototype for the 
emerging connection between renormalization group 
ideas in quantum field theory and statistical 
mechanics. 

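Wu et al. (1976) showed that the two-point 
scaling function, $\langle\sigma(0)\sigma(x)\rangle$, is a function of $r = |x|$ 
and can be written as 


$\langle\sigma(0)\sigma(x)\rangle = \begin{cases} \cosh(\psi/2) & (T < T_c) \\ \sinh(\psi/2) & (T > T_c) \end{cases} \times \exp\left\{ \frac{1}{4} \int_r^\infty \left[ \sinh^2\psi - \left(\frac{d\psi}{ds}\right)^2 \right] s \, ds \right\}$ 


where $\psi = \psi(r)$ satisfies the differential equation, 

$\frac{1}{r} \frac{d}{dr} \left( r \frac{d\psi}{dr} \right) = \frac{1}{2} \sinh(2\psi)$ 


The substitution $\eta = e^{-\psi}$ transforms this differential 
equation into a Painlevé equation of the third kind. 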
This was used by McCoy, Tracy, and Wu (see 
Palmer and Tracy (1981) and references within) to 
study the short-distance behavior, r— 0, of the 
scaling functions — behavior which is far from 
manifest in the infinite series expansions obtained 
for the scaling functions. 
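
The reduction to Painlevé III can be carried out by hand. The following worked computation is a sketch under the assumption that the differential equation above reads $\frac{1}{r}\frac{d}{dr}\left(r\frac{d\psi}{dr}\right) = \frac{1}{2}\sinh(2\psi)$; the rescaled variable $\theta = r/2$ is introduced here only to reach the usual normal form and is not taken from the text.

```latex
% Assumption: psi'' + psi'/r = (1/2) sinh(2 psi), primes denote d/dr.
\begin{align*}
\psi = -\log\eta
  &\;\Longrightarrow\;
  \psi' = -\frac{\eta'}{\eta}, \qquad
  \psi'' = -\frac{\eta''}{\eta} + \frac{(\eta')^2}{\eta^2}, \qquad
  \tfrac12 \sinh(2\psi) = \tfrac14\left(\eta^{-2} - \eta^{2}\right), \\
\psi'' + \frac{\psi'}{r} = \tfrac12 \sinh(2\psi)
  &\;\Longrightarrow\;
  \eta'' = \frac{(\eta')^2}{\eta} - \frac{\eta'}{r}
           + \frac14\left(\eta^{3} - \frac{1}{\eta}\right), \\
\theta = r/2
  &\;\Longrightarrow\;
  \frac{d^2\eta}{d\theta^2}
    = \frac{1}{\eta}\left(\frac{d\eta}{d\theta}\right)^2
      - \frac{1}{\theta}\frac{d\eta}{d\theta}
      + \eta^{3} - \frac{1}{\eta}.
\end{align*}
```

The last line is the third Painlevé equation $w'' = (w')^2/w - w'/z + (\alpha w^2 + \beta)/z + \gamma w^3 + \delta/w$ with $\alpha = \beta = 0$, $\gamma = 1$, $\delta = -1$.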


Deformation Theory 


Sato, Miwa, and Jimbo showed that there was a 
class of quantum field theories that included the 
scaling limits of the Ising model which have the 
property that the n-point correlations are "tau 
functions" for monodromy-preserving deformations 
of the Dirac equation in two dimensions. The two-dimensional 
(Euclidean) Dirac operator is 


$D = \begin{pmatrix} m & -2\partial \\ -2\bar{\partial} & m \end{pmatrix}$ 


with $\partial$ and $\bar{\partial}$ 
the usual complex derivatives acting on smooth 
functions on $\mathbb{R}^2$. The monodromy-preserving deformations 
mentioned above are families of (multivalued) 
solutions $w(x)$ to 


$Dw = 0$   [2] 


which are branched at points $a_j \in \mathbb{R}^2$ ($j = 1, 2, \ldots, n$) 
and change by a factor $e^{2\pi i\lambda_j}$ as $x$ makes a small 
circuit about $a_j$. SMJ (1979b) show that the $L^2(\mathbb{R}^2)$ 
(square-integrable) solutions $w(x)$ of the Dirac 
equation with this prescribed branching behavior 
comprise an $n$-dimensional subspace of $L^2(\mathbb{R}^2)$. The 
constants $e^{2\pi i\lambda_j}$ are called the monodromy of the 
solutions and the term "deformation" in the description 
refers to the fact that the monodromy constants 
do not change as the branch points $a_j$ are varied. 
SMJ show that it is possible to choose a basis 
$\{w_j(x) = w_j(x, a),\ j = 1, \ldots, n\}$ so that the 
vector 


$W(x, a) := [w_1(x, a), w_2(x, a), \ldots, w_n(x, a)]$ 


becomes a section for a flat (Dirac compatible) 
connection in the $(x, a)$ variables. That is, 


$d_{x,a} W = \Omega(x, a) W$ 


where $d_{x,a}$ is the exterior derivative in the $x$ and $a$ 
variables and $\Omega$ is a matrix-valued 1-form that 
satisfies the zero curvature condition, 


$d\Omega = \Omega \wedge \Omega$ 

They also introduced the notion of a tau function, 
$\tau(a)$, for such deformations. The logarithmic derivative 
$d_a \log \tau(a) = \omega$, where $\omega$ is a 1-form on 
$\mathbb{R}^2 \setminus \{a_1, a_2, \ldots, a_n\}$ expressed in terms of the matrix 
elements of $\Omega$. The 1-form $\omega$ introduced by SMJ is 
shown to be closed when $\Omega$ satisfies the zero 
curvature condition above. The scaling limit of 
the Ising model is related to the situation for 
monodromy multipliers $-1$ and when the scaling 
limits of correlations are identified as suitable 
$\tau$-functions in this case, the WMTB result emerges 
when the nonlinear zero curvature condition is 
identified with a Painlevé equation. 

The connection between the deformation theory 
and quantum field theory is developed in the 
computationally intensive paper SMJ (1980). Exten- 
sive use is made of local operator product expan- 
sions, analytic continuation, and formal series 
expansions that are infinite-dimensional analogs of 
Wick-type theorems for finite-dimensional spin repre- 
sentations (developed by SMJ (1978)). One can get 
a feeling for the source of the connection by recalling 
that in one of the "exact solutions" of the two-dimensional 
Ising model the spin operators, $\sigma(a)$, 
are identified as elements of an infinite spin 
representation of the orthogonal group and are 
characterized by their linear action on Fermionic 
variables (Palmer and Tracy 1981). In the physics 
literature, the $\sigma(a)$ are referred to as Bogoliubov 
transformations. In the scaling limit the associated 
representation space is the home to a free Fermi field 
$\psi(x)$, an operator-valued solution to the Dirac 
equation. Of course, $\psi(x)$ has components, but 
for simplicity we will suppress such details in the 
mostly schematic discussion that follows. For coincident 
second coordinates $x_2 = a_2$ the Fermi field $\psi(x)$ 
and $\sigma(a)$ satisfy the commutation relation 


$\sigma(a)\psi(x) = -\mathrm{sgn}(x_1 - a_1)\psi(x)\sigma(a)$   [3] 


which is a surviving remnant of the linear action of
$\sigma(a)$ on lattice fermions. In the transfer matrix
formalism, which is natural for statistical mechanics,
translation in the “space” variable $x_1$ is unitary, but
translation in the “time” variable, $x_2$, is governed by
the transfer matrix, the generator of a contractive
semigroup. Because of this, the quantities that are
well behaved in this formalism are “time-ordered
vacuum expectations”; these involve only “positive”
powers of the transfer matrix. Let $\mathcal{T}$ denote the
“time”-ordering operator; a sequence of operators
depending on coordinates in $\mathbb{R}^2$ is reordered following
$\mathcal{T}$ so that the second coordinates appear in
increasing order from left to right. Sign changes are
incorporated whenever it is necessary to exchange
Fermi type operators like $\psi(x)$ and $\psi(y)$ to put them
in the correct order. In the Euclidean setting (pure
imaginary time) it is well known that

$$G(x, y) = \langle \mathcal{T}\psi(x)\psi(y) \rangle$$

is a Green function for the Dirac operator D (the
distribution kernel for $D^{-1}$).

This observation and [3] suggest that the hybrid
vacuum expectation

$$\frac{\langle \mathcal{T}\psi(x)\psi(y)\sigma(a_1)\cdots\sigma(a_n) \rangle}{\langle \mathcal{T}\sigma(a_1)\cdots\sigma(a_n) \rangle}$$

should be the Green function for a Dirac operator
with a domain containing “functions” branched at
the points $a_j$ having “monodromy” $-1$ there. It is
possible to recast the SMJ analysis so that a Dirac
operator, D(a), on a suitable vector bundle with base
$\mathbb{R}^2 \setminus \{a_1, \ldots, a_n\}$ becomes the central player (see
Palmer et al. (1994) and references therein). The
data for the vector bundle includes the factors $e^{2\pi i\lambda_j}$
incorporated in transition functions for the bundle.
The $\tau$-function becomes an infinite determinant (or
Pfaffian in the Ising case)


$$\tau(a) = \det D(a) \qquad [4]$$

in the Segal-Wilson sense (see Palmer et al. (1994)
and references therein). The Green function
G(x, y; a) has a finite-rank derivative,

$$d_a G(x, y; a) = \sum_j \left[ r_j(x, a)\, s_j(y, a)\, da_j + u_j(x, a)\, v_j(y, a)\, d\bar{a}_j \right] \qquad [5]$$


which is the key result in this version of the SMJ
analysis (this observation appears in SMJ (1980) but
does not have a central role there). The “wave
functions” r, s, u, and v are closely related to the
$L^2$ wave functions $w_j$ described above. Equation [5]
is both the source of the deformation equations for
r, s, u, and v, which arise from $d^2 G = 0$ coupled with
the rotational and translational symmetries of the
Green function, and also of the expression for
$d_a \log \tau(a)$ in terms of data associated with the
deformation theory. A “transfer matrix” calculation
of the determinant allows one to make the connection
with the scaling limits of lattice fields including the
Ising model (see Palmer et al. (1994) and references
therein).

The short-distance behavior of the two-point 
function for the Ising model scaling functions has 
been rigorously calculated by Tracy and later by 
Tracy and Widom (see Harnad and Its (2002) and 
references therein). A less detailed analysis of the 
short-distance behavior of the n-point functions that
uses the deformation analysis of the correlations in a 
crucial way can be found in Palmer (2000). 


The Riemann-Hilbert Problem 


In SMJ (1979b), a “massless” version of holonomic
fields is developed. This concerns monodromy-preserving
deformations of the Cauchy-Riemann
operator $\bar\partial$. The techniques used to study this lead
back to the Riemann-Hilbert problem, the problem
of determining a linear differential equation in the
complex plane with rational coefficients and prescribed
monodromy at the poles of the coefficients.
More specifically, suppose one is given n distinct
points $\{a_1, \ldots, a_n\}$ in $P^1$, the Riemann sphere, and
a base point $a_0$ distinct from the $a_j$, $j \neq 0$. Let $\gamma_j$
denote a simple closed curve based at $a_0$ which
winds counterclockwise once around $a_j$ but has
winding number 0 for the other points $a_k$, $k \neq j$.
Choose n invertible p × p matrices $M_j$ which satisfy
the single condition $M_1 M_2 \cdots M_n = I$. Then, the
homotopy classes of the curves $\gamma_j$ are the generators
for the fundamental group of the punctured
sphere $P^1 \setminus \{a_1, \ldots, a_n\}$ with base point $a_0$ and the
map which sends $\gamma_j \to M_j$ determines a representation
of the fundamental group. One version of the
Riemann-Hilbert problem is to find p × p complex
matrices $A_j$ for $j = 1, \ldots, n$ so that the linear
differential equation

$$\frac{dY}{dz} = \sum_j \frac{A_j}{z - a_j}\, Y \qquad [6]$$

has monodromy representation given by $\gamma_j \to M_j$.
This means that the fundamental solution Y(z)
defined in a neighborhood of $z = a_0$ and normalized
so that $Y(a_0) = I$ (the identity) will become the
fundamental solution $Y(z) M_j^{-1}$ after analytic continuation
around the curve $\gamma_j$. This form of the
problem does not always have a solution but when
it does, it is interesting to consider deformations
$a \to A_j(a)$ that preserve the monodromy multipliers
$M_j$. Such monodromy-preserving deformations
were first considered by Schlesinger in 1912 (see
Palmer and Tracy (1981) and references therein)
and he discovered that the coefficients $A_j$ must
satisfy nonlinear differential equations that, for
$a_n = \infty$, can be written as

$$d_a A_j = -\sum_{k \neq j} [A_j, A_k]\, \frac{da_j - da_k}{a_j - a_k}$$

(the displayed equation is missing from the source at this point; this is the standard form of the Schlesinger equations, with the sum running over the finite points $a_k$).


SMJ introduced $\tau$-functions associated with these
deformations and they gave these $\tau$-functions a
quantum field theory interpretation as n-point
functions. Eventually this theory was extended to
include the Birkhoff generalization of the Riemann-Hilbert
problem, a generalization which incorporates
the additional information needed to fix local
holomorphic equivalence at higher-order poles (formal
asymptotics and Stokes’ multipliers) (Jimbo and
Miwa 1981, Sato et al. 1978). Roughly speaking, the
problem is to reconstruct a global connection with
specified singularities from its local holomorphic
equivalence data and its global monodromy representation.
Thinking of the differential equation [6] as a
holomorphic connection proved very helpful in a
geometric reworking of the SMJ analysis given
by Malgrange (1983a, 1983b), who showed that the
zeros of the $\tau$-function occur at points where a
suitably defined Riemann-Hilbert problem fails to
have a solution (see also Palmer (1999) and references
within). The mathematical significance of massless
holonomic quantum fields as (quantized) singular
elements of a gauge group is apparent from the SMJ
work and later work of Miwa, but the possibility of
interesting physics in these models does not seem to
have been much investigated at this time. These
quantum fields are also conformal fields; however, a
comprehensive integration into the highly developed
formalism of conformal field theory on compact
Riemann surfaces has not currently been developed
(an analog of [5] should survive on compact Riemann
surfaces but the deformation analysis of the correlations
is likely limited to symmetric spaces).


Further Developments 


This work on massless holonomic fields and the 
connection with the Riemann-Hilbert problem is 
doubtless the aspect of holonomic fields with the 
most “spin offs” in the mathematics and physics 
literature. These include an analysis of the delta- 
function gas done by Jimbo, Miwa, Mori, and Sato 
in 1981, random matrix models first looked at by 
Jimbo, Miwa, Mori, and Sato and later system- 
atically investigated by Tracy and Widom (1994), 
the deformations of line bundles on Riemann 
surfaces that led to KdV in the work of Segal and 
Wilson (1985), which emerged from work of Sato, 
Miwa, Jimbo and collaborators, the analysis of 
Painlevé equations starting with work of McCoy, 
Tracy and Wu (see Palmer and Tracy (1981) and 
references within) and more systematically devel- 
oped by Its and Novokshenov (1986), and the 
revival of interest in monodromy-preserving defor- 
mations (Harnad and Its 2002). 

Holonomic fields are related to free fields in a 
well-understood way and it is natural to study them 
in situations where free fields make sense. In 
particular, they are an interesting testbed for the 
nonperturbative investigation of the influence of 
geometry (or curvature) on quantum fields. In Palmer
et al. (1994), the deformation analysis of $\tau$-functions
for holonomic fields is carried out for the Poincaré
disk. The two-point functions are shown to be
expressible in terms of solutions to the family of
Painlevé VI equations. A quantum field theory
interpretation of these $\tau$-functions is given by
Doyon and there are natural analogs of the scaling
limit of the Ising model on the Poincaré disk as
well. The role of “spacetime” symmetries in the 
deformation theory suggests that such analysis will 
be limited to symmetric spaces. In addition to the 
plane and the Poincaré disk, the cylinder, the 
sphere, and the torus round out the possibilities in 
two dimensions. Lisovyy has recently worked out 
the analysis for the cylinder, which is important for 
the study of thermodynamic correlations. It should 
be possible to recast the analysis of the continuum 
Ising model on the torus (Zuber and Itzykson 1977) 
in deformation theoretic terms. It does not appear 
that the holonomic fields associated with the Dirac 
operator for the constant curvature metric on the 
2-sphere have been studied yet. 


See also: Deformation Theory; Integrable Systems:
Overview; Isomonodromic Deformations;
Riemann-Hilbert Problem; Two-Dimensional Ising
Model.

Introduction


In this article we consider the following question: 
which homeomorphisms of the circle transport one 
given class of continuous functions into another? 
The allowed classes of functions are Banach spaces 
contained in C(T), the space of continuous functions 
on the unit circle T, and will be defined by the 
properties of the Fourier series of the functions. 
Next, we will develop the Poincaré-Denjoy theory,
which describes some basic geometric
properties of diffeomorphisms of the circle, such
as the existence and properties of the rotation number,
the classification of possible orbits of diffeomorphisms,
and the Denjoy counterexample.

A homeomorphism of the circle is regarded here as 
a change of variables for periodic functions. So, it will 
be our major concern to describe the changes of 
variables that do not affect “too much" the behavior 
of the Fourier series of the functions in the given class. 

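We say that a function $h : \mathbb{R} \to \mathbb{R}$ is a homeomorphism
of the circle $\mathbb{T} = \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 = 1\}$
if h itself is a homeomorphism such that $h(t + 2\pi) = h(t) + 2\pi$
for all $t \in \mathbb{R}$. It is clear that such a function h
induces a unique homeomorphism $\tilde h : \mathbb{T} \to \mathbb{T}$ that
makes the following diagram commutative:

$$\begin{array}{ccc}
\mathbb{T} & \xrightarrow{\ \tilde h\ } & \mathbb{T} \\
\uparrow {\scriptstyle e^{it}} & & \uparrow {\scriptstyle e^{it}} \\
\mathbb{R} & \xrightarrow{\ h\ } & \mathbb{R}
\end{array}
\qquad \text{i.e., } \tilde h(e^{it}) = e^{ih(t)}$$

In the same way, we identify functions $\varphi : \mathbb{T} \to \mathbb{C}$
with $2\pi$-periodic functions $\varphi : \mathbb{R} \to \mathbb{C}$.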

Let U(T) be the space of all continuous functions
on T that have uniformly convergent Fourier series,
and let A(T) be the space of all continuous functions
on T with absolutely convergent Fourier series.

In 1953, Beurling and Helson proved an important
result about the homeomorphisms that preserve the
space A(T): they are rotations and symmetries, that
is, if $f \circ h \in A(\mathbb{T})$ for all $f \in A(\mathbb{T})$, then the
homeomorphism h must have the form $h(t) = t + a$ or
$h(t) = -t + a$. It is quite obvious that rotations and
symmetries preserve A(T), since the Fourier coefficients
of $f \circ h$ and f have the same moduli (up to the
reindexing $n \to -n$ in the case of a symmetry), but to
prove the converse is very hard. So, the homeomorphisms
that preserve A(T) form a very restricted class.
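
The easy direction of this statement can be checked numerically. The following is only a sketch under stated assumptions: the test function `f` is a hypothetical trigonometric polynomial chosen for illustration, and the helper names `fourier_coefficient` and `a_norm` are not from the source. For a trigonometric polynomial, an equally spaced Riemann sum reproduces the Fourier coefficients up to rounding, so the sum of their moduli can be compared before and after a rotation or a symmetry.

```python
import cmath
import math


def fourier_coefficient(g, n, samples=256):
    """Approximate (1/2pi) * integral over [-pi, pi] of g(t) e^{-int} dt
    by an equally spaced Riemann sum (exact for low-degree trig polynomials)."""
    total = 0j
    for j in range(samples):
        t = -math.pi + 2.0 * math.pi * j / samples
        total += g(t) * cmath.exp(-1j * n * t)
    return total / samples


def f(t):
    # Hypothetical test function with coefficients 1, 0.5, 0.25 at n = 0, 1, -3.
    return 1.0 + 0.5 * cmath.exp(1j * t) + 0.25 * cmath.exp(-3j * t)


def a_norm(g, max_freq=8):
    """Sum of |g_hat(n)| for |n| <= max_freq (the A(T) norm of a trig polynomial)."""
    return sum(abs(fourier_coefficient(g, n)) for n in range(-max_freq, max_freq + 1))


a = 0.7  # arbitrary shift, chosen for illustration
norm_f = a_norm(f)
norm_rotation = a_norm(lambda t: f(t + a))    # h(t) = t + a
norm_symmetry = a_norm(lambda t: f(-t + a))   # h(t) = -t + a
```

All three values agree up to rounding, since a rotation multiplies each coefficient by a unimodular factor and a symmetry additionally swaps the indices n and −n.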


A wider class is obtained when we transport A(T)
into U(T), that is, $f \circ h \in U(\mathbb{T})$ for all $f \in A(\mathbb{T})$. The
major object of this article is to study such changes
of variables.

We say that a homeomorphism of the circle h is of
finite type if there is an integer $\nu$, satisfying $3 \le \nu < \infty$,
such that h is of class $C^\nu$ and

$$|h^{(2)}(t)| + |h^{(3)}(t)| + \cdots + |h^{(\nu)}(t)| > 0 \quad \text{for all } t \in \mathbb{R}$$

In the realm of Fourier analysis, the most
important and general result about homeomorphisms
of the circle is due to R Kaufman, who showed
in 1974 that a finite-type homeomorphism h
transports A(T) into U(T). We shall analyze this
seminal result in detail.


Homeomorphism of the Circle 
of Finite Type 


In this section we prove the theorem of R Kaufman 
mentioned before, which means that it is sufficient 
for a homeomorphism of the circle h to have a 
certain amount of curvature in order to transport 
A(T) into U(T). We present a simple proof of this 
fact, based on a result due to Stein and Wainger. 

If $f : \mathbb{T} \to \mathbb{C}$ is a continuous function and if

$$\hat f_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t)\, e^{-int}\, dt, \qquad n \in \mathbb{Z}$$

denote the Fourier coefficients of f, then $f \in A(\mathbb{T})$ if
and only if

$$\sum_{n \in \mathbb{Z}} |\hat f_n| = \lim_{N \to \infty} \sum_{|n| \le N} |\hat f_n| < \infty$$

Of course, A(T) is a Banach space with the norm

$$\|f\|_{A(\mathbb{T})} = \sum_{n \in \mathbb{Z}} |\hat f_n|$$

The space U(T) is defined as the space of all
continuous functions $f : \mathbb{T} \to \mathbb{C}$ such that

$$\sum_{n=-N}^{N} \hat f_n e^{int} \to f(t), \quad \text{when } N \to \infty, \text{ for all } t \in [-\pi, \pi]$$

uniformly on T, that is, U(T) is the space of
continuous functions from T to C that are the
uniform limit of their Fourier partial sums

$$S_N(f, t) = \sum_{n=-N}^{N} \hat f_n e^{int}$$

Hence, under the natural norm given by

$$\|f\|_{U(\mathbb{T})} = \sup\{ |S_N(f, t)| : N \in \mathbb{N} = \{0, 1, 2, \ldots\} \text{ and } t \in [-\pi, \pi] \}$$

the space U(T) is a Banach space.


We shall prove: 


Theorem 1 (Kaufman 1974). Let h be a homeomorphism
of the circle of class $C^\nu$, with $\nu \ge 3$.
Suppose that

$$|h^{(2)}(t)| + |h^{(3)}(t)| + \cdots + |h^{(\nu)}(t)| > 0 \quad \text{for all } t \in \mathbb{R}$$

Then, h transports A(T) into U(T), that is, $f \circ h \in U(\mathbb{T})$
whenever $f \in A(\mathbb{T})$.


It follows from the theorem that an analytic
homeomorphism of the circle transports A(T) into
U(T). To see this, suppose that h is not of finite type.
Then, for each $n \ge 3$, there exists $t_n \in [-\pi, \pi]$ such
that $h^{(j)}(t_n) = 0$ for all $j \in \{2, \ldots, n\}$. Since $\{t_n\}$ has a
convergent subsequence, there exists $t \in [-\pi, \pi]$
such that $h^{(j)}(t) = 0$ for all $j \ge 2$. Since h is analytic,
this implies that $h''$ vanishes identically, so $h'$ must be
a constant function and, therefore,
$h(t) = \pm t + a$. Since we know that this kind of
homeomorphism preserves A(T), we are done.

One can ask why we demand $\nu \ge 3$. The answer is
easy. Since $h(t + 2\pi) = h(t) + 2\pi$ for all $t \in \mathbb{R}$, it
follows that $h'(t + 2\pi) = h'(t)$ for all $t \in \mathbb{R}$, that is, $h'$
is a periodic function of period $2\pi$. So, there will always
exist a point $t \in (-\pi, \pi]$ such that $h''(t) = 0$.

We can also infer from the theorem that a $C^\infty$
homeomorphism of the circle that has no flat point,
that is, no point t such that $h^{(j)}(t) = 0$ for all $j \ge 2$,
transports A(T) into U(T). This is obvious, because
the negation of being of finite type implies the
existence of a flat point. It is not true, however, that
every $C^\infty$ homeomorphism of the circle transports
A(T) into U(T).

The proof of the theorem is based on the two
lemmas that follow. The first lemma was obtained
by Stein and Wainger, who proved it in a more 
general setting in 1965, although that proof was 
only published five years later. The second lemma 
was proved by R Kaufman in 1974. 


Lemma 2 (Stein and Wainger 1970). Let p(t) be a
real polynomial of degree d. Then

$$\left| \int_{-r}^{r} e^{ip(t)}\, \frac{dt}{t} \right| \le 6(2^{d+1}) - 2d - 10$$

for all $r > 0$.


Lemma 3 (Kaufman 1974). Let f be a real function
of class $C^k$ on the interval $[-r, r]$, with $k \ge 2$.
Suppose that $1 \le |f^{(k)}(t)| \le b$ for all $t \in [-r, r]$.
Then

$$\left| \int_{-r}^{r} e^{if(t)}\, \frac{dt}{t} \right| \le C(k, b)$$

where C(k, b) is a constant that depends only on k
and b.


We shall see that Lemma 3 can be proved from 
Lemma 2 in a quite simple way. The proof given 
by R Kaufman for Lemma 3 does not make use of 
Lemma 2 at all. Also, it is not difficult to see that 
Lemma 2 follows from Lemma 3, if we consider
$d \ge 2$. So, they are indeed equivalent results.

Before getting into the proof of these two lemmas, 
let us state a result which is the primary tool in 
dealing with oscillatory integrals as those in the 
lemmas. 


Lemma 4 (Van der Corput lemma). Let f be real
valued and smooth in [a, b], with $0 < a < b$.
Suppose that $|f^{(k)}(t)| \ge \lambda > 0$ for all $t \in [a, b]$.
Then

$$\left| \int_a^b e^{if(t)}\, \frac{dt}{t} \right| \le [3(2^k) - 2]\, \frac{\lambda^{-1/k}}{a}$$

holds if

(i) $k \ge 2$, or
(ii) $k = 1$ and $f'(t)$ is monotonic.

Now, let us prove the two lemmas and Theorem 1.


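Proof of Lemma 2 The proof is by induction on
the degree of the polynomial. Suppose that p(t) is a
polynomial of degree 0, that is, p(t) is a constant
function. In this case the result is trivial, since the
integral is equal to zero.

By induction, assume that the statement is true
for polynomials of degree less than or equal to d.
Let

$$p(t) = a_{d+1} t^{d+1} + a_d t^d + \cdots + a_1 t + a_0, \qquad a_{d+1} \neq 0$$

Make the change of variables $t = |a_{d+1}|^{-1/(d+1)} s$.
Then we have

$$\int_{-r}^{r} e^{ip(t)}\, \frac{dt}{t} = \int_{-\rho}^{\rho} e^{i(q(t) \pm t^{d+1})}\, \frac{dt}{t}$$

where $\rho = |a_{d+1}|^{1/(d+1)} r$ and q(t) is a polynomial
of degree at most equal to d. Suppose $\rho > 1$. Then

$$\left| \int_{-\rho}^{\rho} e^{i(q(t) \pm t^{d+1})}\, \frac{dt}{t} \right|
\le \left| \int_{-\rho}^{-1} e^{i(q(t) \pm t^{d+1})}\, \frac{dt}{t} \right|
+ \left| \int_{1}^{\rho} e^{i(q(t) \pm t^{d+1})}\, \frac{dt}{t} \right|
+ \left| \int_{-1}^{1} e^{i(q(t) \pm t^{d+1})}\, \frac{dt}{t} \right|
= I + II + III$$

By Van der Corput lemma, $I \le [3(2^{d+1}) - 2]$ and
$II \le [3(2^{d+1}) - 2]$, so $I + II \le 6(2^{d+1}) - 4$. Now

$$III \le \int_{-1}^{1} \left| e^{i(q(t) \pm t^{d+1})} - e^{iq(t)} \right| \frac{dt}{|t|} + \left| \int_{-1}^{1} e^{iq(t)}\, \frac{dt}{t} \right|$$

$$\le \int_{-1}^{1} |t|^d\, dt + 6(2^{d+1}) - 2d - 10 \le 2 + 6(2^{d+1}) - 2d - 10$$

since the degree of q(t) is at most equal to d. So

$$I + II + III \le 6(2^{d+1}) - 4 + 2 + 6(2^{d+1}) - 2d - 10 = 6(2^{d+2}) - 2(d + 1) - 10$$

On the other hand, if $\rho \le 1$, then

$$\left| \int_{-\rho}^{\rho} e^{i(q(t) \pm t^{d+1})}\, \frac{dt}{t} \right|
\le \int_{-\rho}^{\rho} \left| e^{i(q(t) \pm t^{d+1})} - e^{iq(t)} \right| \frac{dt}{|t|} + \left| \int_{-\rho}^{\rho} e^{iq(t)}\, \frac{dt}{t} \right|$$

$$\le 2 + 6(2^{d+1}) - 2d - 10 \le 2 + 6(2^{d+1}) - 2d - 10 + 6(2^{d+1}) - 4 = 6(2^{d+2}) - 2(d + 1) - 10$$

and the proof is completed. □

Proof of Lemma 3 Assume first that $r > 1$. Then

$$\left| \int_{-r}^{r} e^{if(t)}\, \frac{dt}{t} \right|
\le \left| \int_{-r}^{-1} e^{if(t)}\, \frac{dt}{t} \right|
+ \left| \int_{1}^{r} e^{if(t)}\, \frac{dt}{t} \right|
+ \left| \int_{-1}^{1} e^{if(t)}\, \frac{dt}{t} \right|
= I + II + III$$

Since $|f^{(k)}(t)| \ge 1$ and $k \ge 2$, then by Van der
Corput lemma, $I \le [3(2^k) - 2]$ and $II \le [3(2^k) - 2]$.
(Note that we have to assume $k \ge 2$ in order to
apply Van der Corput lemma, since we know
nothing about the monotonicity of $f'(t)$.)

To evaluate III, we proceed as follows:

$$f(t) = f(0) + f'(0)\, t + \cdots + f^{(k-1)}(0)\, \frac{t^{k-1}}{(k-1)!} + f^{(k)}(\theta_t t)\, \frac{t^k}{k!}
= p(t) + f^{(k)}(\theta_t t)\, \frac{t^k}{k!}$$

where the number $\theta_t$ depends on t and $0 < \theta_t < 1$. So

$$\left| \int_{-1}^{1} e^{if(t)}\, \frac{dt}{t} \right|
\le \int_{-1}^{1} \left| e^{i f^{(k)}(\theta_t t) t^k / k!} - 1 \right| \frac{dt}{|t|} + \left| \int_{-1}^{1} e^{ip(t)}\, \frac{dt}{t} \right|
\le \frac{2b}{k!} + 6(2^k) - 2(k - 1) - 10$$

by Lemma 2, since p(t) is a polynomial of degree at
most equal to $k - 1$.

On the other hand, if $r \le 1$, it also follows from
Lemma 2 that

$$\left| \int_{-r}^{r} e^{if(t)}\, \frac{dt}{t} \right|
\le \int_{-r}^{r} \left| e^{i f^{(k)}(\theta_t t) t^k / k!} - 1 \right| \frac{dt}{|t|} + \left| \int_{-r}^{r} e^{ip(t)}\, \frac{dt}{t} \right|
\le \frac{2b}{k!} + 6(2^k) - 2(k - 1) - 10$$

Hence

$$\left| \int_{-r}^{r} e^{if(t)}\, \frac{dt}{t} \right| \le C(k, b)$$

and we are done. □

Proof of Theorem 1 Let h be a homeomorphism
of the circle satisfying the hypotheses of the
theorem.

We claim: there exists $\delta > 0$ such that, for all $x \in [-\pi, \pi]$,
there is k depending on x, with $2 \le k \le \nu$,
such that $|h^{(k)}(t + x)| \ge \delta$ for all t satisfying $|t| \le \delta$.

The proof of the claim is simple: suppose that
there is no such $\delta$. Then, for each $n \in \mathbb{N}$ and each k
with $2 \le k \le \nu$, there exist $x_n \in [-\pi, \pi]$ and $t_{k,n}$ such
that $|t_{k,n}| \le 1/n$ and $|h^{(k)}(t_{k,n} + x_n)| < 1/n$. Taking a
subsequence if necessary, we have $x_n \to x \in [-\pi, \pi]$.
Also, $t_{k,n} \to 0$ when $n \to \infty$ for all such k. So,
$h^{(k)}(t_{k,n} + x_n) \to h^{(k)}(x)$ when $n \to \infty$. Since
$|h^{(k)}(t_{k,n} + x_n)| < 1/n$, we conclude that $h^{(k)}(x) = 0$ for all k
with $2 \le k \le \nu$, thus reaching a contradiction.

Now, let $f \in A(\mathbb{T})$. So, $\sum_{-\infty}^{\infty} |\hat f_n| < \infty$, thus
implying that

$$f(t) = \sum_{n=-\infty}^{\infty} \hat f_n e^{int} = \lim_{N \to \infty} \sum_{n=-N}^{N} \hat f_n e^{int}$$

Hence

$$f(h(t)) = \sum_{n=-\infty}^{\infty} \hat f_n e^{inh(t)} = \lim_{N \to \infty} \sum_{n=-N}^{N} \hat f_n e^{inh(t)}$$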

Put $g_N(t) = \sum_{n=-N}^{N} \hat f_n e^{inh(t)}$. Since $g_N$ is smooth,
we have $g_N \in U(\mathbb{T})$ for all $N \in \mathbb{N}$. If g(t) stands for
f(h(t)), then $g_N \to g$ uniformly, since $f \in A(\mathbb{T})$. Thus,
it suffices to prove that $g \in U(\mathbb{T})$. This happens if
and only if $S_m(g, x) = \sum_{n=-m}^{m} \hat g_n e^{inx}$ converges uniformly
to g as $m \to \infty$, that is, given $\varepsilon > 0$, there
exists $m_0 \in \mathbb{N}$ such that $|S_m(g, x) - g(x)| < \varepsilon$ for all
$m \ge m_0$ and $x \in [-\pi, \pi]$.


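We have

$$|S_m(g, x) - g(x)| \le |g_N(x) - g(x)| + |S_m(g_N, x) - g_N(x)| + |S_m(g_N, x) - S_m(g, x)|$$

for all $m, N \in \mathbb{N}$. Since $g_N \to g$ uniformly and $g_N \in U(\mathbb{T})$,
the last inequality shows that we need to
demonstrate that, for each $\varepsilon > 0$, there exists
$N_0 \in \mathbb{N}$ such that

$$|S_m(g_N, x) - S_m(g, x)| < \varepsilon \qquad \forall N \ge N_0,\ x \in [-\pi, \pi] \text{ and } m \in \mathbb{N}$$

thus proving that $S_m(g_N, x) \to S_m(g, x)$ uniformly in x
and m when $N \to \infty$.

But, if $K \ge N \in \mathbb{N}$, we have

$$|S_m(g_N, x) - S_m(g_K, x)| = \frac{1}{2\pi} \left| \int_{-\pi}^{\pi} \left( g_N(t + x) - g_K(t + x) \right) D_m(t)\, dt \right|$$

$$= \frac{1}{2\pi} \left| \int_{-\pi}^{\pi} \left( \sum_{n=-N}^{N} \hat f_n e^{inh(t+x)} - \sum_{n=-K}^{K} \hat f_n e^{inh(t+x)} \right) D_m(t)\, dt \right|$$

$$= \frac{1}{2\pi} \left| \int_{-\pi}^{\pi} \left( \sum_{K \ge |n| > N} \hat f_n e^{inh(t+x)} \right) D_m(t)\, dt \right|
\le \frac{1}{2\pi} \sum_{K \ge |n| > N} |\hat f_n| \left| \int_{-\pi}^{\pi} e^{inh(t+x)} D_m(t)\, dt \right|$$

where

$$D_m(t) = \sum_{k=-m}^{m} e^{ikt} = \frac{\sin\left( (m + 1/2)\, t \right)}{\sin(t/2)}$$

is the Dirichlet kernel.

Hence, we are done if we show that

$$\left| \int_{-\pi}^{\pi} e^{inh(t+x)} D_m(t)\, dt \right| \le C \qquad [1]$$

where C is a constant that does not depend on m, n,
and x.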

To prove that the oscillatory integral above is
bounded, we make use of Lemma 3. We have that

$$D_m(t) = \frac{2 \sin(mt)}{t} + O(1)$$

on any compact subset of $(-2\pi, 2\pi)$, that is,

$$\left| \frac{\sin\left( (m + 1/2)\, t \right)}{\sin(t/2)} - \frac{2 \sin(mt)}{t} \right|
\le \left| \frac{t \cos(t/2) - 2 \sin(t/2)}{t \sin(t/2)} \right| + 1 \le C^*$$

where the constant $C^*$ does not depend on m, on
any compact subset of $(-2\pi, 2\pi)$.
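
Both facts about the Dirichlet kernel used here can be checked numerically. The sketch below is illustrative only: the function names and the sampling grid are assumptions made for this example, not part of the source. It compares the exponential sum with the closed form and evaluates the remainder $D_m(t) - 2\sin(mt)/t$ on $[-\pi, \pi]$, where the estimate above gives the explicit bound $1 + 2/\pi$.

```python
import cmath
import math


def dirichlet_sum(m, t):
    """D_m(t) computed as the finite sum of e^{ikt} for k = -m..m."""
    return sum(cmath.exp(1j * k * t) for k in range(-m, m + 1)).real


def dirichlet_closed(m, t):
    """Closed form sin((m + 1/2) t) / sin(t / 2); t must not be a multiple of 2*pi."""
    return math.sin((m + 0.5) * t) / math.sin(t / 2.0)


def remainder(m, t):
    """The O(1) term D_m(t) - 2 sin(mt)/t."""
    return dirichlet_closed(m, t) - 2.0 * math.sin(m * t) / t


# Sample points in (-pi, pi) that avoid t = 0 (midpoints of 400 equal cells).
grid = [-math.pi + 2.0 * math.pi * (j + 0.5) / 400 for j in range(400)]

max_identity_error = max(
    abs(dirichlet_sum(m, t) - dirichlet_closed(m, t))
    for m in (1, 5, 20)
    for t in grid
)

# The remainder stays bounded as m grows; on [-pi, pi] it is at most 1 + 2/pi.
max_remainder = max(abs(remainder(m, t)) for m in (1, 10, 100, 1000) for t in grid)
```

The remainder does not grow with m, which is why the bound [1] can be reduced to an estimate for the kernel $\sin(mt)/t$.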

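In order to prove [1], consider $x \in [-\pi, \pi]$. We
have already proved that there exists k (depending
on x), with $2 \le k \le \nu$, such that $|h^{(k)}(t + x)| \ge \delta > 0$
for all t such that $|t| \le \delta$. Therefore,

$$\left| \int_{-\pi}^{\pi} e^{inh(t+x)}\, \frac{\sin(mt)}{t}\, dt \right|
\le \left| \int_{-\delta}^{\delta} e^{inh(t+x)}\, \frac{\sin(mt)}{t}\, dt \right| + 2 \log\left( \frac{\pi}{\delta} \right)$$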


We can assume that n is a positive integer: if n is
negative, we take the complex conjugate; and if $n = 0$,
the integral is trivially bounded, as we see by
integration by parts or by Van der Corput lemma.
(Indeed, we do not need to worry about $n = 0$, since
it is necessary to bound the integral only for large n.)

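So, assuming that n is a positive integer, we
change variables: define $t = rs$, where $r = n^{-1/k} \delta^{-1/k}$.
Since $\sin(mt) = (e^{imt} - e^{-imt})/(2i)$, we have

$$\left| \int_{-\delta}^{\delta} e^{inh(t+x)}\, \frac{\sin(mt)}{t}\, dt \right|
= \left| \frac{1}{2i} \int_{-\delta}^{\delta} e^{i[nh(t+x) + mt]}\, \frac{dt}{t} - \frac{1}{2i} \int_{-\delta}^{\delta} e^{i[nh(t+x) - mt]}\, \frac{dt}{t} \right|$$

$$\le \frac{1}{2} \left| \int_{-\delta/r}^{\delta/r} e^{i[nh(rs+x) + mrs]}\, \frac{ds}{s} \right|
+ \frac{1}{2} \left| \int_{-\delta/r}^{\delta/r} e^{i[nh(rs+x) - mrs]}\, \frac{ds}{s} \right|$$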
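Put $\phi(t) = nh(rt + x) + mrt$ and $\psi(t) = nh(rt + x) - mrt$.
We have $\phi^{(k)}(t) = n r^k h^{(k)}(rt + x)$ and
$\psi^{(k)}(t) = n r^k h^{(k)}(rt + x)$. But, since $n r^k = 1/\delta$, we conclude that

$$|\phi^{(k)}(t)| = |\psi^{(k)}(t)| = \frac{1}{\delta}\, |h^{(k)}(rt + x)| \ge \frac{1}{\delta}\, \delta = 1$$

Also,

$$|\phi^{(k)}(t)| = |\psi^{(k)}(t)| \le b_k, \qquad
b_k = \frac{1}{\delta} \max\{ |h^{(k)}(s)| : -2\pi \le s \le 2\pi \}$$

for all $t \in [-\delta/r, \delta/r]$. Therefore, by Lemma 3, we get

$$\left| \int_{-\delta/r}^{\delta/r} e^{i\phi(t)}\, \frac{dt}{t} \right| \le C(k, b_k) \le \max\{ C(j, b_j) : 2 \le j \le \nu \}$$

and

$$\left| \int_{-\delta/r}^{\delta/r} e^{i\psi(t)}\, \frac{dt}{t} \right| \le C(k, b_k) \le \max\{ C(j, b_j) : 2 \le j \le \nu \}$$

This concludes the proof. □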


Diffeomorphisms of the Circle 


In this section we study the circle diffeomorph- 
isms. This theory goes back to Poincaré (1885), 
who studied circle diffeomorphisms to decide 
when differential equations on the torus have 
periodic orbits of a specified type. For this he 
introduced the rotation number as an important 
dynamical invariant, which later turned out to be 
very fruitful in the theory of dynamical systems, 
and proved that a diffeomorphism with an 
irrational rotation number is combinatorially 
equivalent to a rotation with the same rotation 
number. 

Denjoy (1932) constructed examples of diffeomorphisms
of class $C^1$ with irrational rotation
number having wandering intervals, in opposition
to early ideas of Poincaré. It was necessary to
assume that a diffeomorphism without periodic
points is smoother, in fact $C^2$, to prove that it
is topologically conjugate to the rotation.


The Poincaré Rotation Number 


Let $h : \mathbb{T} \to \mathbb{T}$ be an orientation-preserving homeomorphism.
Given such a map, there is a (nonunique)
map $\tilde h : \mathbb{R} \to \mathbb{R}$, which is called a lift of h, such
that $h \circ p = p \circ \tilde h$, where $p : \mathbb{R} \to \mathbb{T}$ is the covering map
$p(t) = e^{2\pi i t}$.

A lift, $\tilde h$, of h satisfies:

1. $\tilde h$ is monotonically increasing, that is, $\tilde h(t_1) < \tilde h(t_2)$ if $t_1 < t_2$.
2. $\tilde h(t + 1) = \tilde h(t) + 1$ for all $t \in \mathbb{R}$, so $(\tilde h - \mathrm{id})$ has period 1.
3. If $\tilde h_1, \tilde h_2$ are two lifts of h, then there is an integer
k such that $\tilde h_2(t) = \tilde h_1(t) + k$ for all $t \in \mathbb{R}$.

These conditions immediately yield the following:
the transformation $\tilde h^k := \tilde h \circ \cdots \circ \tilde h$ is monotonically
increasing and $\tilde h^k(t + r) = \tilde h^k(t) + r$, $t \in \mathbb{R}$, $k \in \mathbb{N}$,
$r \in \mathbb{Z}$.

The rotation number gives an asymptotic indication
(i.e., in the limit) of the average amount of
rotation of a point along an orbit. We start by
defining, for a lift $\tilde h$ of h, the number

$$\rho_0(\tilde h, t) = \lim_{n \to \infty} \frac{\tilde h^n(t) - t}{n}$$

This limit exists and does not depend on the
choice of the point $t \in \mathbb{R}$; so, we denote it by
$\rho_0(\tilde h)$. If $\tilde h_1, \tilde h_2$ are two lifts of h, then
$\rho_0(\tilde h_1, t) - \rho_0(\tilde h_2, t)$ is an integer, so

$$\rho(h) := \rho_0(\tilde h, t) \bmod 1$$

is well defined. The number $\rho(h) \in [0, 1)$ is called the
rotation number of h, and depends continuously on
h. For a detailed proof, see Katok and Hasselblatt
(1995) or Robinson (1999).


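Theorem 5 The rotation number $\rho(h)$ is rational if
and only if h has a periodic point, that is, there exist
$z_0 \in \mathbb{T}$ and $k \in \mathbb{N}$ such that $h^k(z_0) = z_0$.

Proof Take a lift $\tilde h$ of h such that $\tilde h(0) \in [0, 1)$.
Suppose that $\rho_0(\tilde h) = q/m$.

Suppose that $h^m$ has no fixed point. Then $\tilde h^m(t) - t \in \mathbb{R} \setminus \mathbb{Z}$ for
all $t \in \mathbb{R}$, since $\tilde h^m(t) - t \in \mathbb{Z}$ implies that p(t) is a
fixed point for $h^m$. In particular, $\tilde h^m(t) - t \neq q$ for all
$t \in \mathbb{R}$. Since $\tilde h^m - \mathrm{id}$ is continuous and periodic, there
exists a real number $a > 0$ such that $\tilde h^m(t) - t < q - a$ for
all $t \in \mathbb{R}$ (the case $\tilde h^m(t) - t > q + a$ is treated in the same way). Then

$$\tilde h^m\left( \tilde h^{jm}(t) \right) - \tilde h^{jm}(t) < q - a, \qquad \forall j \in \mathbb{N}$$

$$\tilde h^{km}(t) - t = \left\{ \tilde h^m[\tilde h^{(k-1)m}(t)] - \tilde h^{(k-1)m}(t) \right\}
+ \left\{ \tilde h^m[\tilde h^{(k-2)m}(t)] - \tilde h^{(k-2)m}(t) \right\}
+ \cdots + \left\{ \tilde h^m(t) - t \right\} < k(q - a)$$

So

$$\rho_0(\tilde h) = \lim_{k \to \infty} \frac{\tilde h^{km}(t) - t}{km} \le \frac{q - a}{m} < \frac{q}{m}$$

proving the claim by contraposition.

To see the converse, assume that there exists a
periodic point, that is, there are $t_0 \in \mathbb{R}$ and $m, q \in \mathbb{Z}$
such that $\tilde h^m(t_0) = t_0 + q$; then

$$\tilde h^{mk}(t_0) = t_0 + kq, \qquad \frac{\tilde h^{mk}(t_0) - t_0}{mk} = \frac{q}{m}$$

and therefore

$$\rho_0(\tilde h) = \lim_{k \to \infty} \frac{\tilde h^{mk}(t_0) - t_0}{mk} = \frac{q}{m}$$

□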
Corollary 6 A homeomorphism $h : \mathbb{T} \to \mathbb{T}$ does not
have periodic points if and only if the rotation
number $\rho(h)$ is irrational.

Let $R_\lambda$ be defined on T by $R_\lambda(e^{2\pi i t}) = e^{2\pi i (t + \lambda)}$.
This map is called a rigid rotation of angle $\lambda$ and it
is easy to see that $\tilde h_\lambda(t) = t + \lambda$ is a lift of $R_\lambda$ and that
$\rho(R_\lambda) = \rho_0(\tilde h_\lambda) \bmod 1 = \lambda \bmod 1$.

In this example we can see the connection
between the rationality of the rotation number and
the existence of a periodic orbit. Assume $\lambda = m/q$ is
rational. Then $\tilde h_\lambda^q(t) = t + q\lambda = t + m$. Therefore,
every point is periodic with period q. Now, assume
that $\lambda$ is irrational. Since $\tilde h_\lambda^n(t) = t + n\lambda$ for all n,
$R_\lambda$ has no periodic points. In this case, one can show
that every point in T has a dense orbit.
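
The limit defining the rotation number can be approximated by iterating a lift a finite number of times. The following is a minimal sketch, not a rigorous computation: the function names, the iteration count, and the perturbed lift (an Arnold-type family, which is not discussed in the source) are assumptions introduced only for illustration.

```python
import math


def rotation_number_estimate(lift, t=0.0, iterations=10000):
    """Approximate lim (lift^n(t) - t) / n by a single finite n = iterations."""
    x = t
    for _ in range(iterations):
        x = lift(x)
    return (x - t) / iterations


def rigid_lift(lam):
    """Lift t -> t + lam of the rigid rotation of angle lam."""
    return lambda t: t + lam


def perturbed_lift(lam, eps):
    """Hypothetical example: t -> t + lam + (eps / 2pi) sin(2 pi t).
    For |eps| < 1 it is increasing and commutes with t -> t + 1, so it is a lift."""
    return lambda t: t + lam + (eps / (2.0 * math.pi)) * math.sin(2.0 * math.pi * t)


golden = (math.sqrt(5.0) - 1.0) / 2.0

rho_rational = rotation_number_estimate(rigid_lift(0.25))
rho_irrational = rotation_number_estimate(rigid_lift(golden))
# With lam = 0 the perturbed map fixes t = 0 and t = 1/2, so the estimate tends to 0.
rho_with_fixed_point = rotation_number_estimate(perturbed_lift(0.0, 0.5), t=0.3)
```

For the rigid rotations the estimate reproduces the angle, and for the perturbed map with a fixed point it is close to 0, in line with Theorem 5.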

Now, again let $h : \mathbb{T} \to \mathbb{T}$ be any orientation-preserving
homeomorphism.


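Lemma 7 If the rotation number of h is rational,
then all periodic orbits have the same period.

Proof If $\rho(h) = m/q$ with $m, q \in \mathbb{Z}$ relatively prime,
then we need to show that for any periodic point
$z_0 = p(t)$ (where $p(t) = e^{2\pi i t}$ is the covering space
projection of T) there is a lift $\tilde h$ of h such that $\tilde h(0) \in [0, 1)$
for which $\tilde h^q(t) = t + m$. If $z_0$ is a periodic point,
then $\tilde h^r(t) = t + s$ for some $r, s \in \mathbb{Z}$ and

$$\frac{m}{q} = \rho_0(\tilde h) = \lim_{n \to \infty} \frac{\tilde h^{nr}(t) - t}{nr} = \lim_{n \to \infty} \frac{ns}{nr} = \frac{s}{r}$$

so that $s = km$ and $r = kq$ for some integer k. Then by monotonicity of
$\tilde h$, we have that $\tilde h^q(t) = t + m$ as claimed: if $\tilde h^q(t) - m$ were
greater (or smaller) than t, then iterating k times would give
$\tilde h^{kq}(t) - km > t$ (or $< t$), contradicting $\tilde h^r(t) = t + s$. □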


The Poincaré-Denjoy Theory


A homeomorphism of the circle with rational
rotation number has all its orbits asymptotic to
periodic ones and this, together with Theorem 5,
yields a complete classification of the possible
asymptotic behavior when the rotation number is
rational. This motivates the study of the asymptotic
behavior of orbits of homeomorphisms with irrational
rotation number.

The $\omega$-limit set of a point $z_0 \in \mathbb{T}$ with respect to h
is the set $\omega(z_0) = \{ z \in \mathbb{T} : h^{n_k}(z_0) \to z \text{ as } n_k \to \infty, \text{ for some sequence } \{n_k\}_{k=1}^{\infty} \}$.
The $\alpha$-limit set $\alpha(z_0)$ of an
arbitrary point $z_0 \in \mathbb{T}$ is defined similarly (with
$n_k \to -\infty$ instead of $n_k \to +\infty$).

Any orbit of a rotation $R_\lambda$ with irrational $\lambda$ is
dense in T, that is, $\omega(z_0) = \alpha(z_0) = \mathbb{T}$ for all $z_0 \in \mathbb{T}$.


Theorem 8 (Poincaré 1885). Let $h : \mathbb{T} \to \mathbb{T}$ be an
orientation-preserving homeomorphism with irrational
rotation number. Then the $\omega$-limit set $\omega(x)$ is
independent of x and is either T or perfect and
nowhere dense.


The preceding theorem says that maps with
irrational rotation number have either all orbits
dense or all orbits asymptotic to a Cantor set.

We say that two maps $f, g : \mathbb{T} \to \mathbb{T}$ are topologically
conjugate if there exists a homeomorphism
$h : \mathbb{T} \to \mathbb{T}$ such that $h \circ f = g \circ h$. This implies that
$h \circ f^n = g^n \circ h$ for every integer n. Hence, the
conjugacy h maps orbits of f into orbits of g. If a
monotone map $h : \mathbb{T} \to \mathbb{T}$ satisfies $h \circ f = g \circ h$ but is
not necessarily a homeomorphism, we only have
that the inverse image of each point is either a point or a
closed interval. We say that h is a semiconjugacy
between f and g; in this case h maps orbits or packs of
orbits of f into orbits of g.


Theorem 9 (Denjoy 1932). Let $f : \mathbb{T} \to \mathbb{T}$ be an
orientation-preserving diffeomorphism of class $C^2$,
with irrational rotation number ($\rho(f) = \lambda$). Then f is
topologically conjugate to the rigid rotation $R_\lambda$.


Note that in spite of the hypothesis of f being $C^2$,
we obtain only a continuous conjugacy. It took
almost 50 years until Michael Herman (1979) was
able to solve the more difficult problem of obtaining
a smooth conjugacy for rotation numbers satisfying
extra arithmetic conditions.

If f is a circle homeomorphism which does not
have periodic points, then there exists a semiconjugacy
h between f and a rotation $R_\lambda$. If h is not a
conjugacy, then there exists a point x of the circle
whose inverse image by h is an interval J. Since
$h \circ f = R_\lambda \circ h$, we have that $h(f^n(J)) = R_\lambda^n(x)$. It
follows that the intervals of the family
$\{J, f(J), f^2(J), \ldots\}$ are pairwise disjoint, and the
$\omega$-limit set of J does not reduce to a periodic orbit.
We say that J is a wandering interval of the map f.
Thus, $C^2$-differentiability implies that f does not
have a wandering interval. For details of the proof
of Theorem 9, see Melo and Strien (1993).


The Denjoy Example 


Denjoy also proved the following result, which
shows that the hypothesis of class $C^2$ is essential.


Theorem 10 (Denjoy 1932). For any irrational
number $\lambda \in (0, 1)$, there exists a $C^1$ circle diffeomorphism
f which has a wandering interval, and
rotation number equal to $\lambda$.

Proof The construction of a diffeomorphism with
a wandering interval will be done in the following
manner. Given an irrational rotation $R_\lambda(e^{2\pi i t}) = e^{2\pi i (t + \lambda)}$,
cut the circle T at all the points of an orbit
$\{z_n = R_\lambda^n(e^{2\pi i t});\ n \in \mathbb{Z}\}$ of $R_\lambda$. In each cut insert a
segment $J_n$ of length $l_n$, where $\sum_{n=-\infty}^{\infty} l_n = 1$. We
obtain in this manner a new circle longer than the
first. The open intervals correspond to the gaps of
the Cantor set.

In order to construct f formally, let $l_n$, $n \in \mathbb{Z}$, be a
sequence of positive real numbers satisfying

(i) $\lim_{n \to \pm\infty} (l_{n+1}/l_n) = 1$
(ii) $\sum_{n=-\infty}^{\infty} l_n = 1$
(iii) $l_n > l_{n+1}$ for $n \ge 0$
(iv) $l_n < l_{n+1}$ for $n < 0$ and
(v) $3 l_{n+1} - l_n > 0$ for $n \ge 0$

For example

$$l_n = T\, (|n| + 2)^{-1} (|n| + 3)^{-1}$$

where

$$T^{-1} = \sum_{n=-\infty}^{\infty} (|n| + 2)^{-1} (|n| + 3)^{-1}$$
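
The example sequence can be checked with exact rational arithmetic. The sketch below is illustrative; the function names and the cutoffs are assumptions made for this example. It uses the fact that the one-sided sum telescopes to 1/2, so the two-sided sum is 5/6 and the normalizing constant is T = 6/5.

```python
from fractions import Fraction


def raw_length(n):
    """Unnormalized length 1 / ((|n| + 2)(|n| + 3))."""
    k = abs(n)
    return Fraction(1, (k + 2) * (k + 3))


# Sum over n >= 0 telescopes to 1/2; the two-sided sum is 2 * (1/2) - 1/6 = 5/6.
T = Fraction(6, 5)


def length(n):
    """The example length l_n = T / ((|n| + 2)(|n| + 3))."""
    return T * raw_length(n)


def partial_sum(cutoff):
    """Sum of l_n for |n| <= cutoff; equals 1 - 12 / (5 (cutoff + 3))."""
    return sum(length(n) for n in range(-cutoff, cutoff + 1))


def monotonicity_conditions_hold(cutoff):
    """Check conditions (iii), (iv), (v) for |n| <= cutoff."""
    for n in range(0, cutoff + 1):
        if not (length(n) > length(n + 1) and 3 * length(n + 1) - length(n) > 0):
            return False
    for n in range(-cutoff, 0):
        if not length(n) < length(n + 1):
            return False
    return True


ratio_far_out = length(1001) / length(1000)  # close to 1, as in condition (i)
```

The partial sums increase to 1 as the cutoff grows, which is condition (ii), and the ratio of consecutive lengths equals $(n+2)/(n+4)$ for $n \ge 0$, which tends to 1.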

Let $J_n$ be a closed interval of length $l_n$. We place
these intervals on the circle in the same order as the
order of the orbit $R_\lambda^n(0)$. So to place an interval $J_n$,
consider the sum of the lengths of the intervals $J_k$
where $R_\lambda^k(0)$ is between 0 and $R_\lambda^n(0)$. This determines
the placement of $J_n$.

The next step is to define $f$ on the union of the $J_n$.
It is necessary and sufficient that $f'(t) = 1$ at the
endpoints in order for the map to have a continuous
derivative when it is extended to the closure.
Assume $J_n = [a_n, b_n]$, so $l_n = b_n - a_n$. The integral

$$\int_{a_n}^{b_n} (b_n - t)(t - a_n)\,dt = \frac{l_n^3}{6},$$

so

$$\int_{a_n}^{b_n} \frac{6(l_{n+1} - l_n)}{l_n^3}\,(b_n - t)(t - a_n)\,dt = l_{n+1} - l_n.$$

Therefore, if we define $f$ for $x \in J_n$ by

$$f(x) = a_{n+1} + \int_{a_n}^{x} \left[ 1 + \frac{6(l_{n+1} - l_n)}{l_n^3}\,(b_n - t)(t - a_n) \right] dt,$$

then $f(b_n) = a_{n+1} + l_n + l_{n+1} - l_n = b_{n+1}$. Also, $f$ is
differentiable on $J_n$, with

$$f'(x) = 1 + \frac{6(l_{n+1} - l_n)}{l_n^3}\,(b_n - x)(x - a_n).$$

Thus, $f'(a_n) = 1 = f'(b_n)$. Notice that for $n < 0$,
$l_{n+1} - l_n > 0$, so that

$$1 \le f'(x) \le 1 + \frac{6(l_{n+1} - l_n)}{l_n^3}\left(\frac{l_n}{2}\right)^2 = \frac{3 l_{n+1} - l_n}{2 l_n},$$

and $(3 l_{n+1} - l_n)/(2 l_n)$ goes to 1 as $n \to -\infty$.
Similarly, for $n \ge 0$ and $x \in J_n$,

$$1 \ge f'(x) \ge \frac{3 l_{n+1} - l_n}{2 l_n} > 0,$$

so $f'(x)$ goes to 1 as $n \to +\infty$ uniformly for $x \in J_n$.
From these facts, it follows that $f$ is uniformly $C^1$ on
the union of the interiors of the $J_n$ and has a $C^1$
extension to all of $T$.
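The length bookkeeping in this construction can be checked with exact rational arithmetic. The following is a minimal sketch, assuming the example sequence $l_n = T(|n|+2)^{-1}(|n|+3)^{-1}$ and the bump coefficient $6(l_{n+1} - l_n)/l_n^3$; the function names are illustrative only, and the value $T = 6/5$ comes from telescoping the normalising sum.

```python
from fractions import Fraction

# The sum over all integers n of 1/((|n|+2)(|n|+3)) telescopes to 5/6, so T = 6/5.
T = Fraction(6, 5)

def length(n):
    """l_n = T / ((|n| + 2)(|n| + 3)), the example sequence of interval lengths."""
    return T / ((abs(n) + 2) * (abs(n) + 3))

def bump_coefficient(n):
    """Coefficient 6 (l_{n+1} - l_n) / l_n^3 multiplying (b_n - t)(t - a_n) in f'."""
    return 6 * (length(n + 1) - length(n)) / length(n) ** 3

def image_length(n):
    """Length of f(J_n): the integral over J_n of f', using the closed form
    l_n^3 / 6 for the integral of (b_n - t)(t - a_n)."""
    return length(n) + bump_coefficient(n) * length(n) ** 3 / 6

def midpoint_derivative(n):
    """f' at the midpoint of J_n, where (b_n - x)(x - a_n) equals (l_n / 2)^2."""
    return 1 + bump_coefficient(n) * (length(n) / 2) ** 2

if __name__ == "__main__":
    for n in range(-3, 4):
        print(n, length(n), image_length(n) == length(n + 1),
              float(midpoint_derivative(n)))
```

Under these assumptions `image_length(n)` should agree exactly with `length(n + 1)`, which is the statement $f(J_n) = J_{n+1}$, and `midpoint_derivative(n)` is the extreme value $(3 l_{n+1} - l_n)/(2 l_n)$ of $f'$ on $J_n$.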

Let $\Lambda = T \setminus \bigcup_{n \in \mathbb{Z}} \mathrm{int}(J_n)$. This is a Cantor set. The
orbit of a point $x \in \Lambda$ is dense in $\Lambda$ since it is like the
orbit of 0 for $R_\lambda$. Thus, $\omega(x) = \Lambda$. If $x \in \mathrm{int}(J_n)$, then
there is a smaller interval $I$ whose closure is
contained in $\mathrm{int}(J_n)$. Since the interval $J_n$ never
returns to $J_n$ but wanders among the other $J_k$,
$J_n$ is a wandering interval. □




Further Results 


In this section we shall state some additional results 
about homeomorphisms of the circle in the area of 
Fourier analysis. 

The first result is a theorem of Pál (1914) and
Bohr (1935): let $f: T \to \mathbb{R}$ be a real continuous
function; then, there exists a homeomorphism $h$ of the
circle such that $f \circ h \in U(T)$, the space of continuous
functions whose Fourier series converge uniformly. The
best proof of this theorem is due to Salem (1945). In
1978, Kahane and Katznelson showed that the result is
still valid for $f: T \to \mathbb{C}$ continuous.

A similar question was posed by Lusin: given a
continuous function $f: T \to \mathbb{R}$, is there a
homeomorphism $h$ of the circle such that $f \circ h \in A(T)$,
the space of functions with absolutely convergent
Fourier series? The problem remained open until 1981,
when Olevskii, Kahane, and Katznelson answered the
question negatively: there exists a real (or complex)
continuous function $f$ on the circle such that, for every
homeomorphism $h$ of the circle, $f \circ h \notin A(T)$.

It was proved by the author that there are C* 
homeomorphisms of the circle, not necessarily of 
finite type, that transport A(T) into U(T). It is a very
technical work, published in 1998, and it gives a 
necessary and sufficient condition for a homeo- 
morphism of the circle with a flat point to transport 
A(T) into U(T). 

Finally, the Denjoy theorem (Theorem 9) is rather
close to being optimal. The example constructed here
can be improved by obtaining a circle diffeomorphism
whose first derivative is Hölder with exponent arbitrarily
close to 1 (see Katok and Hasselblatt (1995)). Recent
work has dealt with the existence of a differentiable
conjugacy between a diffeomorphism $f$ with irrational
rotation number $\lambda$ and $R_\lambda$. Arnol'd, Moser, and Herman
have obtained results (see Melo and Strien (1993) for a
discussion of these results and references).
Roughly speaking, a homoclinic orbit is an orbit
of a mapping or differential equation which is both 
forward and backward asymptotic to a periodic 
orbit which satisfies a certain nondegeneracy condi- 
tion called “hyperbolicity.” On its own, such an 
orbit is only of mild interest. However, these orbits 
induce quite interesting structures among nearby 
orbits, and this latter fact is responsible for the main 
importance of homoclinic orbits. In addition, when 
homoclinic orbits are created in a parametrized 
system, many interesting and unexpected phenom- 
ena arise. 

In this article, we first describe the history and 
basic properties of homoclinic orbits. Next, we 
consider some simple polynomial diffeomorphisms 
of the plane (the so-called Hénon family) which 
exhibit homoclinic orbits. Subsequently, we discuss 
a general theorem due to Katok which gives 
sufficient conditions for the existence of such 
orbits. Finally, we briefly consider issues related to 
homoclinic bifurcations and some of their 
consequences. 


Homoclinic Orbits in Diffeomorphisms 


Consider a discrete dynamical system given by a $C^r$
diffeomorphism $f: M \to M$ where $M$ is a $C^\infty$ manifold
and $r$ is a positive integer. That is, $f$ is bijective
and both $f$ and $f^{-1}$ are $r$-times continuously
differentiable. Given a point $x \in M$, set $x_0 = x$. For
non-negative integers $n$ we inductively define
$x_{n+1} = f(x_n)$ and $x_{-n-1} = f^{-1}(x_{-n})$. We also write
$f^n(x) = x_n$ for $n$ in the set $\mathbb{Z}$ of all integers. The
“orbit” of $x$ is the set $O(x) = \{f^n(x) : n \in \mathbb{Z}\}$.

A “periodic point” $p$ of $f$ is a point such that there
is a positive integer $N > 0$ such that $f^N(p) = p$. The
least such number $\tau(p)$ is called the “period” of $p$. If
$\tau(p) = 1$, we call $p$ a “fixed point.” The periodic
point $p$ with period $\tau$ is called “hyperbolic” if
all eigenvalues of the derivative $Df^\tau(p)$ at $p$ have
absolute value different from 1. For convenience, we
refer to the eigenvalues of $Df^\tau(p)$ as eigenvalues
associated to $p$. If $p$ is a hyperbolic periodic point all
of whose associated eigenvalues have norm less than
one, we call $p$ a “sink” or “attracting periodic
point.” The opposite case in which all associated
eigenvalues have norm larger than one is called a
“source.” A hyperbolic periodic point $p$ which is
neither a source nor a sink is called a “saddle” or
“hyperbolic saddle.”

Given a saddle $p$ of period $\tau$, we consider the set
$W^s(p) = W^s(p,f)$ of points $y \in M$ which are forward
asymptotic to $p$ under the iterates $f^{n\tau}$. That is, the
points $y \in M$ such that $f^{n\tau}(y) \to p$ as $n \to \infty$. This is
called the “stable set” of $p$. Similarly, we consider
the “unstable set” of $p$ which we may define as
$W^u(p) = W^u(p,f) = W^s(p,f^{-1})$. The stable manifold
theorem guarantees that $W^s(p)$ and $W^u(p)$ are
injectively immersed submanifolds of $M$ whose
dimensions add up to $\dim M$. In these cases, they
are called the stable and unstable manifolds of $p$,
respectively. A point $q \in (W^s(p) \cap W^u(p)) \setminus \{p\}$ is called
a “homoclinic point” of $p$ (or of the pair $(f,p)$). If the
submanifolds $W^s(p)$ and $W^u(p)$ meet transversely at $q$,
then $q$ is called a “transverse homoclinic point.”
Otherwise, $q$ is called a “homoclinic tangency.”

In the special case when $M$ is a two-dimensional
manifold, the stable and unstable manifolds of a
saddle periodic point $p$ are injectively immersed
curves in $M$. A transverse homoclinic point $q$ of $p$ is
a point of intersection away from $p$ where the curves
are not tangent to each other. This is depicted in Figure 1
for the case of a saddle fixed point for the map
$H(x,y) = (7 - x^2 - y, x)$, a member of the so-called
Hénon family, which we will discuss later. The
figure was made using the numerical package
“Dynamics” which comes with the book by Nusse
and Yorke (1998).
Figure 1 Stable and unstable manifolds $W^s(p)$ and $W^u(p)$ in the map
$H(x,y) = (7 - x^2 - y, x)$ for the fixed point $p \approx (-3.83, -3.83)$.
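The fixed point quoted in the caption, and the fact that it is a saddle, can be checked numerically. The following is a minimal sketch, assuming the map has the form $H(x,y) = (a - x^2 - by, x)$ with $a = 7$, $b = 1$; the helper names are illustrative and are not taken from the “Dynamics” package.

```python
import math

def henon(a, b, x, y):
    """One step of the map H(x, y) = (a - x^2 - b*y, x)."""
    return a - x * x - b * y, x

def fixed_points(a, b):
    """Fixed points satisfy y = x and x^2 + (1 + b) x - a = 0."""
    disc = (1 + b) ** 2 + 4 * a
    if disc < 0:
        return []
    root = math.sqrt(disc)
    x_minus = (-(1 + b) - root) / 2
    x_plus = (-(1 + b) + root) / 2
    return [(x_minus, x_minus), (x_plus, x_plus)]

def eigenvalues(b, x):
    """Eigenvalues of DH = [[-2x, -b], [1, 0]], the roots of mu^2 + 2x mu + b = 0."""
    disc = x * x - b
    if disc < 0:
        raise ValueError("complex eigenvalues are not handled in this sketch")
    root = math.sqrt(disc)
    return -x - root, -x + root

def is_saddle(b, x):
    """True if one eigenvalue lies inside and one outside the unit circle."""
    moduli = sorted(abs(mu) for mu in eigenvalues(b, x))
    return moduli[0] < 1 < moduli[1]

if __name__ == "__main__":
    for p in fixed_points(7, 1):
        print(p, eigenvalues(1, p[0]), is_saddle(1, p[0]))
```

The product of the two eigenvalues equals $b$, the constant Jacobian determinant, so for $b = 1$ a real hyperbolic fixed point is automatically a saddle.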


One easily sees that every point in the orbit of
a transverse homoclinic point $q$ of a hyperbolic
saddle fixed point $p$ is again a transverse homoclinic
point of $p$. Also, the curves $W^u(p)$ and $W^s(p)$ are
invariant; that is, $f(W^u(p)) = W^u(p)$ and
$f(W^s(p)) = W^s(p)$. This implies that the curves $W^u(p)$
and $W^s(p)$ extend, wind around, and accumulate on each
other, forming a complicated web.

Upon seeing this complicated structure in the 
restricted three-body problem, Poincaré very poeti- 
cally wrote (p. 389, Poincaré 1987) 


Que l'on cherche à se représenter la figure formée par
ces deux courbes et leurs intersections en nombre infini
dont chacune correspond à une solution doublement
asymptotique, ces intersections forment une sorte de
treillis, de tissu, de réseau à mailles infiniment serrées;
chacune des deux courbes ne doit jamais se recouper
elle-même, mais elle doit se replier sur elle-même d'une
manière très complexe pour venir recouper une infinité
de fois toutes les mailles du réseau.

On sera frappé de la complexité de cette figure, que je
ne cherche même pas à tracer. Rien n'est plus propre à
nous donner une idée de la complication du problème
des trois corps et en général de tous les problèmes de
Dynamique où il n'y a pas d'intégrale uniforme ...


The next major advance concerning homoclinic 
orbits was made by Birkhoff (1960), who proved 
that in every neighborhood of a transverse 
homoclinic point of a surface diffeomorphism, 
one can find infinitely many distinct periodic 
points. Birkhoff also presented a symbolic 
description of the nearby orbits and noticed the 
analogy with Hadamard's description of geodesics 
on a surface. Birkhoff's analysis was generalized 
by Smale to arbitrary dimension, and, in addition, 
Smale gave a simpler analysis of the associated 
nearby orbits in terms of compact zero-dimensional 
symbolic spaces which we now call “shift spaces”
or “topological Markov chains.”

Once one knows that a diffeomorphism f has a 
transverse homoclinic point for a saddle periodic 
point p, it is interesting to consider the closure of the 
orbits of all such homoclinic points. This turns out 
to be a closed invariant set containing a dense orbit 
and a countable dense set of periodic saddle points 
(Newhouse 1980). It is usually called a “homoclinic
closure” or h-closure. These sets form the basis of
chaotic or irregular motions in nonlinear systems. 


The Smale Horseshoe Map and 
Associated Symbolic System 


To understand the geometric picture discovered by 
Smale, it is best to start with a concrete example of a 
diffeomorphism of the plane known as the “Smale
horseshoe diffeomorphism.”

Given any homeomorphism $f: X \to X$ on a space
$X$ and a subset $U \subset X$, let us define $I(f,U)$ to be the
set of points $x \in X$ such that $f^n(x) \in U$ for every
integer $n$. Thus, we have

$$I(f,U) = \bigcap_{n \in \mathbb{Z}} f^n(U).$$

We call $I(f,U)$ the invariant set of $f$ in $U$, or,
alternatively, the invariant set of the pair $(f,U)$.

We now construct a special diffeomorphism $f$ of
the Euclidean plane to itself in which $U = Q$ is the
unit square and for which $I(f,U)$ has a very
interesting structure. It is this map which is usually
known as the Smale horseshoe map.

Let $Q = [0,1] \times [0,1]$ be the unit square in the
plane $\mathbb{R}^2$. Let $0 < \alpha < 1/2$, and consider a
diffeomorphism $f: \mathbb{R}^2 \to \mathbb{R}^2$ which is a composition of two
diffeomorphisms $f = T_2 \circ T_1$ as follows. The map
$T_1(x,y) = (\alpha^{-1}x, \alpha y)$ contracts vertically, expands
horizontally, and maps $Q$ to the thin rectangle
$Q_1 = \{(x,y) : 0 \le x \le \alpha^{-1},\ 0 \le y \le \alpha\}$ which is short
and wide. The map $T_2$ bends the right side of $Q_1$ up
and around so that $T_2(Q_1) = f(Q)$ has the shape of a
“horseshoe” or “rotated arch.” We arrange for $T_2$ to
take the lower-right corner of $Q_1$ up to the upper-left
corner of $Q$ in such a way that $f(Q)$ meets $Q$ in two
full-width subrectangles which we call $R_1$ and $R_2$.
This can be done in such a way that the preimages
$R_1^{-1} = T_1^{-1}(R_1)$ and $R_2^{-1} = T_1^{-1}(T_2^{-1}(R_2))$ are both
full-height subrectangles of $Q$, and the restricted maps
$f_1 := f \mid R_1^{-1}$ and $f_2 := f \mid R_2^{-1}$ are both affine. Thus, we
arrange that $f_1$ is simply the restriction of $T_1$ to $R_1^{-1}$,
and the map $f_2$ can be expressed in formulas as
$f_2(x,y) = (-\alpha^{-1}x + \alpha^{-1}, -\alpha y + 1)$. This construction
implies that $f$ will have the origin $p = (0,0)$ as a
hyperbolic fixed point. We label the upper-left corner
$(0,1)$ of $Q$ with the letter $q$. It follows that the bottom
and left edges of $Q$ will be in the unstable and stable
manifolds of $p$, respectively, and we have indicated
this in Figure 2 with small arrows.

Figure 2 The horseshoe map, showing $p$, $R_1$, $R_2$, and $f(Q)$.

The above construction gives us a diffeomorphism
$f$ of the plane $\mathbb{R}^2$ such that $Q_1^+ := f(Q) \cap Q = R_1 \cup R_2$
is the union of two full-width subrectangles
of $Q$. We wish to describe $I(f,Q)$. We begin with
the sets $Q^+ = \bigcap_{n \ge 0} f^n(Q)$ and $Q^- = \bigcap_{n \le 0} f^n(Q)$.
Thus, $Q^+$ is simply the set of points in $Q$ whose
backward orbits stay in $Q$, and $Q^-$ is the set of
points whose forward orbits stay in $Q$. For $i = 1,2$,
each rectangle $R_i$ is mapped to a thin horseshoe in
$f(Q)$ which meets $Q$ in two full-width subrectangles.
Combining these for $i = 1,2$ gives four full-width
rectangles as shaded in Figure 3. Thus,
$Q \cap f(Q) \cap f^2(Q)$ consists of these four subrectangles.
Figure 3 shows the sets $f(Q)$, $f^2(Q)$ as well as
the shaded rectangles we just mentioned.
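The doubling of strips at each step can be followed exactly by tracking only the $y$-extent of each full-width subrectangle. The sketch below is a minimal illustration, assuming the two affine branches act on the vertical coordinate as $y \mapsto \alpha y$ (for $f_1$) and $y \mapsto 1 - \alpha y$ (for $f_2$); the function names are hypothetical.

```python
from fractions import Fraction

def next_strips(strips, alpha):
    """Apply both affine branches to every horizontal strip [c, d] (y-extent only)."""
    out = []
    for c, d in strips:
        out.append((alpha * c, alpha * d))          # image under f_1: y -> alpha * y
        out.append((1 - alpha * d, 1 - alpha * c))  # image under f_2: y -> 1 - alpha * y
    return sorted(out)

def horizontal_strips(n, alpha):
    """y-extents of the full-width strips of Q ∩ f(Q) ∩ ... ∩ f^n(Q)."""
    strips = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        strips = next_strips(strips, alpha)
    return strips

if __name__ == "__main__":
    alpha = Fraction(1, 3)
    for n in range(1, 4):
        strips = horizontal_strips(n, alpha)
        print(n, len(strips), {d - c for c, d in strips})
```

With $\alpha = 1/3$ the strips reproduce the stages of the middle-thirds Cantor construction in the vertical direction, which is one way to picture the intersection of all forward images of $Q$ as an interval times a Cantor set.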

Continuing in this way, one sees that, for each
$n > 0$, the set $Q_n^+ = Q \cap f(Q) \cap \dots \cap f^n(Q)$ consists
of $2^n$ full-width subrectangles of $Q$, each with height
$\alpha^n$. It follows that $Q^+ = \bigcap_{n \ge 0} f^n(Q)$ is an interval
times a Cantor set. Analogously, $Q^-$ is a Cantor set
times an interval, and the set $I(f,Q)$ is a Cantor set
in the plane. Let us recall the definition of a Cantor
set $C$ in a metric space $X$. We first define a Cantor
space $C$ to be a compact, perfect, totally disconnected
metric space. That is, $C$ is a compact metric
space whose connected components are points, such
that every point $x$ in $C$ is a limit point of $C \setminus \{x\}$. A
Cantor set $C$ in a metric space $X$ is a subset which is
a Cantor space in the induced subspace (relative)
topology.

Figure 3 The sets $f(Q)$ and $f^2(Q)$ for the horseshoe map $f$.

The dynamics of $f$ on the invariant set $I(f,Q)$ can
be conveniently described as follows.

Let $\Sigma_2 = \{1,2\}^{\mathbb{Z}}$ be the set of doubly infinite
sequences of 1's and 2's. Writing elements $a \in \Sigma_2$
as $a = (a_i) = (a_i)_{i \in \mathbb{Z}}$, we define a metric $\rho$ on $\Sigma_2$ by

$$\rho(a,b) = \sum_{n \in \mathbb{Z}} 2^{-|n|}\,|a_n - b_n|.$$

The pair $(\Sigma_2, \rho)$, then, is a Cantor space.

The “left-shift automorphism” on $\Sigma_2$ is the map
$\sigma: \Sigma_2 \to \Sigma_2$ defined by $\sigma(a)_i = a_{i+1}$ for each $i \in \mathbb{Z}$.
This is a homeomorphism from $\Sigma_2$ to itself. It has a
dense orbit and a dense set of periodic points.

For a point $x \in I(f,Q)$, define an element
$\phi(x) = a = (a_j) \in \Sigma_2$ by $a_j = i$ if and only if $f^j(x) \in R_i$.
It turns out that the map $\phi: I(f,Q) \to \Sigma_2$ is a
homeomorphism such that $\sigma\phi = \phi f$.

In general, given two discrete dynamical systems
$f: X \to X$ and $g: Y \to Y$, a homeomorphism
$h: X \to Y$ such that $gh = hf$ is called a topological
conjugacy from the pair $(f,X)$ to the pair $(g,Y)$.
When such a conjugacy exists, the two systems have
virtually the same dynamical properties.

In the present case, one sees that the dynamics of $f$
on $I(f,Q)$ is completely described by that of $\sigma$
on $\Sigma_2$.
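The conjugacy can be made concrete on periodic points, where everything is computable exactly. The following is a minimal sketch, assuming the affine branches $f_1(x,y) = (\alpha^{-1}x, \alpha y)$ and $f_2(x,y) = (\alpha^{-1}(1-x), 1 - \alpha y)$ with $\alpha = 1/3$; a finite word of symbols lists the branches applied along one period, and all function names are hypothetical.

```python
from fractions import Fraction

ALPHA = Fraction(1, 3)  # illustrative choice; any 0 < alpha < 1/2 should work

def branch(symbol, alpha=ALPHA):
    """Return ((mx, cx), (my, cy)) with f_symbol(x, y) = (mx*x + cx, my*y + cy)."""
    if symbol == 1:
        return (1 / alpha, Fraction(0)), (alpha, Fraction(0))
    if symbol == 2:
        return (-1 / alpha, 1 / alpha), (-alpha, Fraction(1))
    raise ValueError("symbols must be 1 or 2")

def apply_branch(symbol, point, alpha=ALPHA):
    """Apply the affine branch f_symbol to a point (x, y)."""
    (mx, cx), (my, cy) = branch(symbol, alpha)
    x, y = point
    return (mx * x + cx, my * y + cy)

def periodic_point(word, alpha=ALPHA):
    """Fixed point of the composition that applies the branches word[0], word[1], ..."""
    mx, cx, my, cy = Fraction(1), Fraction(0), Fraction(1), Fraction(0)
    for s in word:
        (bx, dx), (by, dy) = branch(s, alpha)
        mx, cx = bx * mx, bx * cx + dx   # compose x-coordinate maps
        my, cy = by * my, by * cy + dy   # compose y-coordinate maps
    return (cx / (1 - mx), cy / (1 - my))

def shift(word):
    """Left shift of a periodic word."""
    return tuple(word[1:]) + tuple(word[:1])

if __name__ == "__main__":
    w = (1, 2)
    p = periodic_point(w)
    print(p, apply_branch(w[0], p), periodic_point(shift(w)))
```

Applying $f$ to the periodic point coded by a word should give the periodic point coded by the shifted word, which is the relation $\sigma\phi = \phi f$ restricted to periodic orbits.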

It turns out that the Smale horseshoe map contains
essentially all of the geometry necessary to describe
the orbit structures near homoclinic orbits. To begin 
to see this, recall that the left and bottom boundaries 
of O were in the stable and unstable manifolds of p. 
Extending these curves as in Figure 4, one sees that 
the three corners of O different from p are, in fact, 
all transverse homoclinic points of p. 

It was a great discovery of Smale that, in the case
of a general transverse homoclinic point, one sees
the above geometric structure after taking some
power $f^N$ of the diffeomorphism $f$. Thus, we have


Theorem 1 (Smale). Let $f: M \to M$ be a $C^1$
diffeomorphism of a manifold $M$ with a hyperbolic
periodic point $p$ and a transverse homoclinic point
$q$ of the pair $(f,p)$. Then, one can find a positive
integer $N$ and a compact neighborhood $U$ of the
points $p$ and $q$ such that the pair $(f^N, I(f^N,U))$ is
topologically conjugate to the full 2-shift $(\sigma, \Sigma_2)$.


Figure 4 Stable and unstable manifolds in the horseshoe map.


In modern language, we can say that more
is true. Let $\Lambda(f) = \bigcup_{0 \le j < N} f^j(I(f^N,U))$ be the $f$-orbit
of the set $I(f^N,U)$. Then, $\Lambda(f)$ is a compact
zero-dimensional hyperbolic basic set for $f$ with
$V := \bigcup_{0 \le j < N} f^j(U)$ as an “adapted” or “isolating”
neighborhood. This means that $\Lambda(f) = \bigcap_{n \in \mathbb{Z}} f^n(V)$
is a compact, zero-dimensional hyperbolic set (see
Robinson (1999) for definitions and related references)
contained in the interior of $V$ and $f \mid \Lambda(f)$ has
a dense orbit. If $g$ is $C^1$ near $f$, then
$\Lambda(g) := \bigcap_{n \in \mathbb{Z}} g^n(V)$ is a hyperbolic basic set for $g$
and the pairs $(f, \Lambda(f))$ and $(g, \Lambda(g))$ are topologically
conjugate.

To get some appreciation for the magnitude of the
contribution here, one might note the complicated
arguments employed by Poincaré at the end of 
Poincaré (1987) to show that so-called heteroclinic 
points (intersections between stable and unstable 
manifolds of saddles with different orbits) existed. 
Birkhoff found a symbolic description (using infinitely 
many symbols) of the orbits near a transverse 
homoclinic orbit from which the existence of both 
infinitely many periodic and heteroclinic points is 
obvious. Smale extended the treatment of transverse 
homoclinic points to all dimensions, and found the 
symbolic description (using two symbols for some 
iterate of the map) given above. Moreover, Smale 
proved the “robustness” of these structures: they persist 
under small $C^1$ perturbations. Note that Poincaré’s
discovery of homoclinic points was in 1899, Birkhoff's 
results came in 1935, and Smale's results came in 
1965. Thus, the above advances took over 65 years! 

One can understand the geometry of Smale's
construction fairly easily in the two-dimensional
case. Let $q$ be the transverse homoclinic point of the
saddle fixed point $p$ of the $C^r$ diffeomorphism $f$ on
the plane $\mathbb{R}^2$. Given a small neighborhood $U$ of $p$, let
$W^s(p,U)$ denote the connected component of $W^s(p) \cap U$
containing $p$, and define $W^u(p,U)$ similarly. We may
choose $C^r$ coordinates $(x,y)$ so that in some small
neighborhood $U$ of $p$, the point $p$ corresponds to
$(0,0)$, the set $W^u(p,U)$ corresponds to $\{y = 0\}$, and
the set $W^s(p,U)$ corresponds to $\{x = 0\}$. We assume
that $U$ is small enough that $f$ in $U$ is closely
approximated by its derivative $Df(0,0)$. Hence, $f$
nearly contracts vertical directions and expands
horizontal directions in $U$.

Figure 5 The curves $I_s \subset W^s(p)$ and $I_u \subset W^u(p)$.

Take compact arcs $I_s \subset W^s(p)$ and $I_u \subset W^u(p)$,
both containing the points $p$ and $q$ as in Figure 5.

Let $D$ be a curvilinear rectangle which is a slight
thickening of $I_s$. The forward iterates $f^n(D)$ will stay
near $I_s$ for a while and then start to approach $I_u$.
If we choose $D$ appropriately, we can arrange for
some high iterate $f^N(D)$ to be a slight thickening
of $I_u$ as in Figure 6. This looks geometrically like the
horseshoe map. Let $A_1$ be the connected component
of the intersection $D \cap f^N(D)$ containing $p$, and let
$A_2$ be the connected component of the intersection
$D \cap f^N(D)$ containing $q$. These sets (which are
shaded in Figure 6) play the role of the rectangles
$R_1$ and $R_2$, respectively, in the horseshoe construction.
We use the set $A_1 \cup A_2$ for $U$ in Theorem 1.


Figure 6 The curvilinear rectangle $D$ and its $N$th iterate $f^N(D)$
are geometrically like the horseshoe map.


The Hénon Family 


To give explicit formulas for the horseshoe map 
above is somewhat tedious, and it is of interest to 
note that similar properties occur in maps with 
simple formulas. Indeed, such properties occur quite 
often in a well-known family of maps known as the 
“Hénon family.” As we have mentioned, the map in
Figure 1 provides an example. 

One may simply define a Hénon map as a
diffeomorphism $H = (H_1(x,y), H_2(x,y))$ with inverse
$G(x,y) = (G_1(x,y), G_2(x,y))$ such that all the maps
$H_i(x,y)$, $G_i(x,y)$ are polynomials of degree at most
two. It is known (see, e.g., Friedland and Milnor
(1989)) that such maps $H$ have constant Jacobian
determinant, and, up to affine conjugacy, may be
represented in the form $H = H_{a,b}(x,y) = (a - x^2 - by, x)$
with $a$, $b$ constants and $b \ne 0$. This makes
sense when all the terms are real or complex. In the
real case, we speak of the real Hénon family and,
in the complex case, we speak of the complex
Hénon family.
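For the normal form it is easy to see directly why the inverse is again a polynomial map of degree two. The following is a minimal sketch, assuming $H_{a,b}(x,y) = (a - x^2 - by, x)$ with $b \ne 0$; the function names and the sample parameter values are illustrative only.

```python
from fractions import Fraction

def henon(a, b, point):
    """H_{a,b}(x, y) = (a - x^2 - b*y, x)."""
    x, y = point
    return (a - x * x - b * y, x)

def henon_inverse(a, b, point):
    """Solve (u, v) = (a - x^2 - b*y, x) for (x, y); requires b != 0."""
    u, v = point
    return (v, (a - v * v - u) / b)

def jacobian_det(b, point):
    """Determinant of DH = [[-2x, -b], [1, 0]]; it equals b at every point."""
    x, _ = point
    return (-2 * x) * 0 - (-b) * 1

if __name__ == "__main__":
    a, b = Fraction(7), Fraction(1, 3)
    p = (Fraction(2, 5), Fraction(-1, 7))
    print(henon_inverse(a, b, henon(a, b, p)) == p, jacobian_det(b, p))
```

The inverse is obtained by reading off $x$ from the second coordinate and then solving the first coordinate for $y$, which is where the condition $b \ne 0$ enters.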

The real Hénon family was first presented by the
physicist M Hénon in 1976 as perhaps the simplest
nonlinear diffeomorphism of the plane exhibiting a
so-called “strange attractor.” These mappings in the
real and complex cases have been the focus of much
attention. Our interest here is that, at least for
certain parameters $a$, $b$, they provide concrete
globally defined maps whose dynamics are analogous
to that of the horseshoe diffeomorphism. In
fact, Devaney and Nitecki (1979) proved (in the real
case) that for fixed $b \ne 0$, there is a constant $a_0 > 0$
such that if $a > a_0$, then the set $B_{a,b}$ of bounded
orbits of $H_{a,b}$ is a compact zero-dimensional set and
the pair $(H_{a,b}, B_{a,b})$ is topologically conjugate to
$(\sigma, \Sigma_2)$. In addition, it can be shown that the
invariant set $B_{a,b}$ is a single hyperbolic h-closure.
Analogous results are true for the complex Hénon
family and proofs were originally given in the thesis
of Ralph Oberste-Vorth (unpublished) under the
supervision of John Hubbard at Cornell University.
More recent proofs are in Newhouse (2004) and
Hruska (2004). Many interesting results have been
obtained for the complex Hénon map by Bedford
and Smillie, and by Sibony and Fornaess (see the
references in Hruska (2004)).


Homoclinic Points in Systems with 
Positive Topological Entropy 


There is an invariant of topological conjugacy which is 
known as the topological entropy. In a certain sense, 
this gives a quantitative measurement of the amount of 
complicated or chaotic motion in the system. 


Let $f: X \to X$ be a continuous self-map of the
compact metric space $(X,d)$. For a positive integer
$n > 0$, we define an $n$-orbit to be a finite sequence
$O(x,n) = (x, f(x), \dots, f^{n-1}(x))$. Given a positive real
number $\epsilon > 0$, we say that two $n$-orbits $O(x,n)$ and
$O(y,n)$ are “$\epsilon$-distinguishable” if there is a $0 \le j < n$
such that $d(f^j x, f^j y) > \epsilon$. Another way to look at this
is the following. Define the so-called $d_n$-metric on $X$
by setting $d_n(x,y) = \max_{0 \le j < n} d(f^j x, f^j y)$. Then, the
two $n$-orbits $O(x,n)$, $O(y,n)$ are $\epsilon$-distinguishable if
and only if $d_n(x,y) > \epsilon$. It follows from compactness
of $X$ and the uniform continuity of each of the
maps $f^j$, $0 \le j < n$, that the maximal number $r(n,\epsilon,f)$ of
pairwise $\epsilon$-distinguishable $n$-orbits is finite for each
given $\epsilon > 0$ and each positive integer $n$. We define the
number

$$h(f) = \lim_{\epsilon \to 0} \limsup_{n \to \infty} \frac{1}{n} \log r(n,\epsilon,f).$$

This means that, for some sequence of integers
$n_1 < n_2 < \dots$, the map $f$ has roughly $e^{n_i h(f)}$
$\epsilon$-distinguishable $n_i$-orbits for $i$ large and $\epsilon$ small.

The number $h(f)$ is called the topological entropy
of the map $f$. It may be infinite for homeomorphisms,
but it is always finite for smooth maps on
finite-dimensional manifolds. The number $h(f)$ has
many nice properties. For instance, $h(f^N) = N h(f)$
for every positive integer $N$, and, if $f$ is a
homeomorphism, then $h(f^{-1}) = h(f)$. Further, if $f$ and $g$ are
topologically conjugate, then $h(f) = h(g)$. The so-called
“variational principle for topological
entropy” asserts that $h(f)$ is the supremum of the
measure-theoretic entropies of the invariant probability
measures for $f$. Our interest in this invariant
here is the following theorem of Katok.
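To see the definition at work, one can count distinguishable orbits for a simple map by brute force. The sketch below is only an illustration under stated assumptions: it uses the doubling map $x \mapsto 2x \bmod 1$ on the circle (a continuous self-map that is not discussed in the text), a finite dyadic grid of sample points, and a greedy selection, so the count it returns is a lower bound for $r(n,\epsilon,f)$ rather than its exact value. All names are hypothetical.

```python
import math

def doubling(x):
    """The doubling map x -> 2x mod 1 on the circle R/Z."""
    return (2.0 * x) % 1.0

def circle_dist(x, y):
    """Distance on the circle R/Z."""
    d = abs(x - y) % 1.0
    return min(d, 1.0 - d)

def orbit(f, x, n):
    """The n-orbit (x, f(x), ..., f^{n-1}(x))."""
    out = []
    for _ in range(n):
        out.append(x)
        x = f(x)
    return out

def d_n(orb_x, orb_y, dist):
    """The d_n-metric, evaluated on two precomputed n-orbits."""
    return max(dist(a, b) for a, b in zip(orb_x, orb_y))

def count_distinguishable(f, dist, sample, n, eps):
    """Greedily pick sample points whose n-orbits are pairwise eps-distinguishable."""
    chosen = []
    for x in sample:
        ox = orbit(f, x, n)
        if all(d_n(ox, oy, dist) > eps for oy in chosen):
            chosen.append(ox)
    return len(chosen)

if __name__ == "__main__":
    sample = [k / 512 for k in range(512)]
    counts = [count_distinguishable(doubling, circle_dist, sample, n, 0.2)
              for n in range(1, 6)]
    for n in range(1, len(counts)):
        print(n, counts[n - 1], counts[n], math.log(counts[n] / counts[n - 1]))
```

For this map the counts roughly double with each increase of $n$, so the successive log-ratios stay near $\log 2$, the known topological entropy of the doubling map.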


Theorem 2 (Katok). Let $f$ be a $C^2$ diffeomorphism
of a compact two-dimensional manifold $M$ to itself
with positive topological entropy. Then, $f$ has
transverse homoclinic points.


In fact, Katok extended this theorem (see the
supplement in Hasselblatt and Katok (1995)) to
show that, if $h(f) > 0$ and $\epsilon > 0$, then there is a
compact zero-dimensional hyperbolic basic set $\Lambda$ for
$f$ such that $h(f \mid \Lambda) > h(f) - \epsilon$. Thus, one can find
nice invariant topologically transitive sets for $f$ (i.e.,
sets with dense orbits) on which the topological
entropies of the restrictions of $f$ are arbitrarily close to
that of $f$.

This theorem has the interesting consequence that
the map $f \mapsto h(f)$ is lower-semicontinuous on the
space of $C^2$ diffeomorphisms of a surface. It was
proved in Newhouse (1989) (and, independently, by
Yomdin (1987)) that the map $f \mapsto h(f)$ is
upper-semicontinuous on the space of $C^\infty$ diffeomorphisms
of any compact manifold. Combining these results
gives the theorem that the map $f \mapsto h(f)$ is continuous
on the space of $C^\infty$ diffeomorphisms on a
compact surface, and that positivity of $h(f)$ implies
the existence of transverse homoclinic points.

It is also worth noting that, for any continuous
self-map $f: M \to M$ on a compact manifold $M$, one
has the inequality $h(f) \ge \log|\mu|$ where $\mu$ is the
eigenvalue of largest norm of the induced map $f_*$
on the first real homology group (Manning 1975).
Putting this together with Theorem 2 gives the fact
that there are whole homotopy classes of
diffeomorphisms on surfaces all of whose elements have
transverse homoclinic points. For instance, consider
a $2 \times 2$ matrix

$$L = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

with integer entries, determinant 1, and eigenvalues
$\lambda_1$, $\lambda_2$ with $0 < |\lambda_1| < 1 < |\lambda_2|$. Let $L: T^2 \to T^2$ be
the induced diffeomorphism on the two-dimensional
torus $T^2$. This is an example of what is called an
“Anosov” diffeomorphism. In this case the number
$\mu$ above is simply $\lambda_2$, and this holds for any
diffeomorphism $f$ of $T^2$ which can be continuously
deformed into $L$. Hence, any such $f$ must have
transverse homoclinic points.
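As a worked instance of the homological bound, the sketch below computes $\log|\lambda_2|$ for one integer matrix of determinant 1. The matrix with rows (2, 1) and (1, 1) is an illustrative choice and does not come from the text; the function names are hypothetical.

```python
import math

def eigenvalues_2x2(a, b, c, d):
    """Real eigenvalues of [[a, b], [c, d]], computed from trace and determinant."""
    tr, det = a + d, a * d - b * c
    disc = tr * tr - 4 * det
    if disc < 0:
        raise ValueError("complex eigenvalues are not handled in this sketch")
    root = math.sqrt(disc)
    return (tr - root) / 2, (tr + root) / 2

def entropy_lower_bound(a, b, c, d):
    """log of the largest eigenvalue modulus: the bound h(f) >= log|mu| on the torus."""
    return math.log(max(abs(v) for v in eigenvalues_2x2(a, b, c, d)))

if __name__ == "__main__":
    lam1, lam2 = eigenvalues_2x2(2, 1, 1, 1)
    print(lam1, lam2, entropy_lower_bound(2, 1, 1, 1))
```

For this matrix the bound is $\log((3 + \sqrt{5})/2)$, which is positive, so by the argument above every diffeomorphism of the torus in its homotopy class has positive topological entropy.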


Homoclinic Tangencies 


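Let $\{f_\lambda, \lambda \in [0,1]\}$ be a parametrized family of $C^r$
diffeomorphisms of the plane with $\lambda$ an external
parameter. It frequently occurs that there is a
hyperbolic saddle fixed point $p_\lambda$ for each parameter
$\lambda$ moving continuously with $\lambda$ such that, at some
value $\lambda_0$, a homoclinic tangency is created at a point
$q_0$. This means that there are an $\epsilon > 0$, a small
neighborhood $U$ of $q_0$, and curves $\gamma_\lambda^u \subset W^u(p_\lambda)$,
$\gamma_\lambda^s \subset W^s(p_\lambda)$ such that $\gamma_\lambda^u \cap \gamma_\lambda^s = \emptyset$ for
$\lambda_0 - \epsilon < \lambda < \lambda_0$, $\gamma_{\lambda_0}^u \cap \gamma_{\lambda_0}^s = \{q_0\}$, and
$\gamma_\lambda^u \cap \gamma_\lambda^s$ consists of two distinct points for
$\lambda_0 < \lambda < \lambda_0 + \epsilon$. In most cases,
the tangency of $\gamma_{\lambda_0}^s$ and $\gamma_{\lambda_0}^u$ at $q_0$ will be of the
second order, and we will assume that occurs here.
The geometry is as in Figure 7.


Figure 7 Creation of a homoclinic tangency (panels: $\lambda < \lambda_0$, $\lambda = \lambda_0$, $\lambda > \lambda_0$).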
The creation of homoclinic tangencies is part of
the general subject of “homoclinic bifurcations.” A
recent survey of this subject is in the book by
Bonatti et al. (2005). Typical results are the
following. If $p = p_{\lambda_0}$ is a saddle fixed point whose
derivative is area-decreasing (i.e., $|\det(Df(p))| < 1$),
then there are infinitely many parameters $\lambda$ near $\lambda_0$
for which each transverse homoclinic point of $p_\lambda$ is a
limit of periodic sinks (asymptotically stable periodic
orbits) (Newhouse 1979, Robinson 1983). In
addition, so-called strange attractors and SRB
measures appear (Mora and Viana 1993).

Finally, we mention that recently it has been
shown that, generically in the $C^r$ topology for $r \ge 2$,
homoclinic closures associated to a homoclinic
tangency (in dimension 2) have maximal Hausdorff
dimension (Theorem 1.6 in Downarowicz and
Newhouse (2005)).


See also: Chaos and Attractors; Fractal Dimensions in 
Dynamics; Generic Properties of Dynamical Systems; 
Hyperbolic Dynamical Systems; Lyapunov Exponents 
and Strange Attractors; Saddle Point Problems; 
Singularity and Bifurcation Theory; Solitons and Other 
Extended Field Configurations. 
Overview


Renormalization theory is a venerable subject put to
daily use in many branches of physics. Here, we
focus on its applications in quantum field theory,
where a standard perturbative approach is provided
through an expansion in Feynman diagrams. Whilst
the combinatorics of the Bogoliubov recursion,
solved by suitable forest formulas, has been known
for a long time, the subject regained interest on the
conceptual side with the discovery of an underlying
Hopf algebra structure behind these recursions.
Perturbative expansions in quantum field theory
are organized in terms of one-particle irreducible
(1PI) Feynman graphs. The goal is to calculate the
corresponding 1PI Green functions order by order in
the coupling constants of the theory, by applying
Feynman rules to these 1PI graphs of a
renormalizable theory under consideration. This
allows one to disentangle the problem into an
algebraic part and an analytic part.

For the algebraic part, one studies Feynman graphs 
as combinatorial objects which lead to the Lie and 
Hopf algebras discussed below. Feynman rules then 
assign analytic expressions to these graphs, with the 
analytic structure of finite renormalized quantum field 
theory largely dictated by the underlying algebra. 

The objects of interest in quantum field theory are 
the 1PI Green functions. They are parametrized by 
the quantum numbers — masses, momenta, spin, and 
such — of the particles participating in the scattering 
process under consideration. We call a set of such 
quantum numbers an external leg structure r. For 
example, the three terms in the Lagrangian of 
massless quantum electrodynamics correspond to 


[graphs of the three external leg structures: fermion propagator, vertex, photon propagator] [1]


Note that the Lagrangian L of massless quantum 
electrodynamics is obtained accordingly as 


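$$L = [\text{images under } \phi \text{ of the three leg structures in [1]}] = \bar\psi\,\partial\!\!\!/\,\psi + \bar\psi A\!\!\!/\,\psi + \tfrac{1}{4} F^2 \qquad [2]$$

where $\phi$ are coordinate space Feynman rules.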
The renormalized 1PI Green function in momentum
space, $G_R^r(\{g\}; \{p\}, \{m\}; \mu)$, is obtained as the
image under renormalized Feynman rules $\phi_R$ applied
to a series of graphs:

$$\Gamma^r = 1 + \sum_{k=1}^{\infty} g^k c_k^r = 1 + \sum_{\mathrm{res}(\Gamma) = r} g^{|\Gamma|}\,\frac{\Gamma}{\mathrm{Sym}(\Gamma)} \qquad [3]$$

Here $r$ is a given such external leg structure, while $c_k^r$
is the finite sum of 1PI graphs having $k$ loops,

$$c_k^r = \sum_{\mathrm{res}(\Gamma) = r,\ |\Gamma| = k} \frac{\Gamma}{\mathrm{Sym}(\Gamma)} \qquad [4]$$

and $0 < g < 1$ is a coupling constant. The generalization
to the case of several couplings $\{g\}$ and
masses $\{m\}$ is straightforward. In the above, the sum
is over all 1PI graphs with the same given external
leg structure. We have denoted the map which
assigns $r$ to a given graph a residue, for example,

[a sample graph and its external leg structure, related by res; diagram not recoverable] [5]


The unrenormalized but regularized Feynman rules
$\phi$ assign to a graph a function

$$\phi(\Gamma)(\{g\}; \{p\}, \{m\}; \mu, z) = \int \prod_{v \in \Gamma^{[0]}} \delta^{(4)}\Big(\sum_{f \text{ incident } v} k_f\Big) \prod_{e \in \Gamma^{[1]}_{\mathrm{int}}} \mathrm{Prop}(k_e)\, d^4 k_e \qquad [6]$$

and formally the unrenormalized Green function

$$G^r(\{g\}; \{p\}, \{m\}; \mu, z) = \phi(\Gamma^r)(\{g\}; \{p\}, \{m\}; \mu, z) \qquad [7]$$

which is a function of a suitably chosen regulator $z$.
Note that in [6] the four-dimensional Dirac-$\delta$
distribution guarantees momentum conservation at
each vertex and restricts the number of four- 
dimensional integrations to the number of indepen- 
dent cycles in the graph. It is assumed that the 
reader is familiar with the readily established fact 
that these integrals suffer from UV singularities, 
which render the integration over the momenta in 
internal cycles ill-defined. We also remind the reader 
that the problem persists in coordinate space, where 
one confronts the continuation of products of 
distributions to regions of coinciding support. We 
restrict ourselves here to a discussion of the situation 
in momentum space and refer the reader to the 
literature for the situation in coordinate space. 
Ignoring problems of convergence in the sum over
all graphs, the problem of renormalization is to
make sense of these functions term by term: We
have to determine invertible series $Z^r(\{g\}, z)$ in the
couplings $g$ such that the modified Lagrangian

$$\tilde{L} = \sum_r Z^r(\{g\}, z)\,\phi(r)^{-1} \qquad [8]$$

produces a perturbation series in graphs that allows
for the removal of the regulator $z$.

This amounts to a transition from unrenormalized
to renormalized Feynman rules $\phi \to \phi_R$. Let us
first describe how this transition is achieved using 
the Lie and Hopf algebra structure of the perturba- 
tive expansion, which is described in detail below: 


* Decide on the free fields and local interactions of
the theory, appropriately specifying quantum
numbers (spin, mass, flavor, color, and such) of
fields, restricting interactions so as to obtain a
renormalizable theory.

* Consider the set of all 1PI graphs with edges
corresponding to free-field propagators. Define
vertices for local interactions. This allows one to
construct a pre-Lie algebra of graph insertions.
Antisymmetrize this pre-Lie product to get a Lie
algebra $\mathcal{L}$ of graph insertions and define the Hopf
algebra $H$ which is dual to the enveloping algebra
$U(\mathcal{L})$ of this Lie algebra.

* Realize that the coproduct and antipode of this
Hopf algebra give rise to the forest formula,
which generates local counter-terms upon introducing
a Rota-Baxter map, a renormalization
scheme in physicists’ parlance.

* Use the Hochschild cohomology of this Hopf
algebra to show that one can absorb singularities
in local counter-terms.

* Determine the corepresentations of this Hopf
algebra to identify the sub-Hopf algebras corresponding
to time-ordered products in physical
fields. This is most easily achieved by rewriting
the Dyson-Schwinger equations using Hochschild
1-cocycles.


The last point exhibits close connections, in parti- 
cular, between the structure of gauge theories and 
the corepresentation theory of their perturbative 
Hopf algebras which we discuss below in brief. 

This program can be carried out in coordinate 
space as well as momentum space renormalization. 
It has given a firm mathematical background to the 
process of renormalization, justifying the practice of 
quantum field theory. The notion of locality has 
achieved a precise formulation in terms of the 
Hochschild cohomology of the perturbation expan- 
sion. In momentum space, this approach emphasizes 
the connections to number theory, which emerge 
when one investigates the role of the Hopf algebra 
primitives, which in turn furnish the Hochschild 
1-cocycles underlying locality. 

The next sections describe the above setup in 
some detail. 


Lie and Hopf Algebras of Graphs 


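All algebras are supposed to be over some field $K$ of 
characteristic zero, associative and unital, and 
similarly for coalgebras. The unit (and, by abuse of 
notation, also the unit map) will be denoted by $\mathbb{I}$, 
the counit map by $\bar e$. All algebra homomorphisms 
are supposed to be unital. A bialgebra 
$(A = \bigoplus_{i=0}^{\infty} A_i, m, \mathbb{I}, \Delta, \bar e)$ is called graded connected 
if $A_i A_j \subset A_{i+j}$ and $\Delta(A_i) \subset \bigoplus_{j+k=i} A_j \otimes A_k$, and if 
$\Delta(\mathbb{I}) = \mathbb{I} \otimes \mathbb{I}$ and $A_0 \simeq K$, $\bar e(\mathbb{I}) = 1 \in K$ and $\bar e = 0$ on 
$\bigoplus_{i=1}^{\infty} A_i$. We call $\ker \bar e$ the augmentation ideal of $A$ 
and denote by $P$ the projection $A \to \ker \bar e$ onto the 
augmentation ideal, $P = \mathrm{id} - \mathbb{I}\bar e$. Furthermore, we 
use Sweedler's notation, $\Delta(h) = \sum h' \otimes h''$, for the 
coproduct. We define 


$\mathrm{Aug}^{(k)} = (\underbrace{P \otimes \cdots \otimes P}_{k\ \text{times}})\,\Delta^{k-1}, \quad A \to \{\ker \bar e\}^{\otimes k}$   [9] 


as a map into the k-fold tensor product of the 
augmentation ideal. We let $A^{(k)} = \ker \mathrm{Aug}^{(k+1)} / \ker \mathrm{Aug}^{(k)}$, 
$\forall k \geq 1$. All bialgebras considered here 
are bigraded in the sense that 


$A = \bigoplus_{i=0}^{\infty} A_i = \bigoplus_{k=0}^{\infty} A^{(k)}$   [10] 


where $A_k \subset \bigoplus_{j=1}^{k} A^{(j)}$ for all $k \geq 1$, and $A_0 \simeq A^{(0)} \simeq K$. 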
The first construction we have to study is the pre- 
Lie algebra structure of 1PI graphs. 


The Pre-Lie Structure 


For each Feynman graph we have vertices as well as 
internal and external edges. External edges are edges 
that have an open end not connected to a vertex. 
They indicate the particles participating in the 
scattering amplitude under consideration and each 
such edge carries the quantum numbers of the 
corresponding free field. The internal edges and 
vertices form a graph in their own right. For an 
internal edge, both ends of the edge are connected to 
a vertex. 

We consider 1PI Feynman graphs. A graph $\Gamma$ is 
1PI if and only if all graphs obtained by removal of 
any one of its internal edges are still connected. 
Such 1PI graphs are naturally graded by their 
number of independent loops, the rank of their 
first homology group $H_{[1]}(\Gamma,\mathbb{Z})$. We write $|\Gamma|$ for 
this degree of a graph $\Gamma$. Note that $|\mathrm{res}(\Gamma)| = 0$, 
where we let $\mathrm{res}(\Gamma)$ be the graph obtained when all 
edges in $\Gamma^{[1]}_{\rm int}$ shrink to a point, as before. Note that 
the graph obtained in this manner consists of a 
single vertex, to which the edges $\Gamma^{[1]}_{\rm ext}$ are attached. 

For a 1PI graph $\Gamma$, $\Gamma^{[0]}$ denotes its set of 
vertices and $\Gamma^{[1]} = \Gamma^{[1]}_{\rm int} \cup \Gamma^{[1]}_{\rm ext}$ its set of internal 
and external edges. In addition, let $\omega_r$ be the 
number of spacetime derivatives appearing in the 
corresponding monomial in the Lagrangian. 

Having specified free quantum fields and local 
interaction terms between them, one immediately 
obtains the set of 1PI graphs. One can then consider 
for a given external leg structure r the set of graphs 
with that external leg structure. For a renormaliz- 
able theory, we can define a superficial degree of 
divergence, 


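$\omega = \sum_{r \in \Gamma^{[1]}_{\rm int} \cup \Gamma^{[0]}} \omega_r - 4\,|H_{[1]}(\Gamma,\mathbb{Z})|$   [11] 


for each such external leg structure: $\omega(\Gamma) = \omega(\Gamma')$ if 
$\mathrm{res}(\Gamma) = \mathrm{res}(\Gamma')$; all graphs with the same external leg 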
structure have the same superficial degree of 
divergence, and only for a finite number of distinct 
external leg structures r will this degree indeed 
signify a divergence. 

This leaves a finite number of external leg structures 
to be considered to which we restrict ourselves from 
now. Our first observation is that there is a natural 
pre-Lie algebra structure on 1PI graphs. 

To this end, we define a bilinear operation 


$\Gamma_1 * \Gamma_2 = \sum_{\Gamma} n(\Gamma_1,\Gamma_2;\Gamma)\,\Gamma$   [12] 


where the sum is over all 1PI graphs $\Gamma$. Here, 
$n(\Gamma_1,\Gamma_2;\Gamma)$ is a section coefficient which counts 
the number of ways in which a subgraph $\Gamma_2$ can be 
reduced to a point in $\Gamma$ such that $\Gamma_1$ is obtained. The 
above sum is evidently finite as long as $\Gamma_1$ and $\Gamma_2$ 
are finite graphs, and the graphs which contribute 
necessarily fulfill $|\Gamma| = |\Gamma_1| + |\Gamma_2|$ and $\mathrm{res}(\Gamma) = \mathrm{res}(\Gamma_1)$. 
One then has the following theorem. 


Theorem 1 The operation * is pre-Lie: 


$[\Gamma_1 * \Gamma_2] * \Gamma_3 - \Gamma_1 * [\Gamma_2 * \Gamma_3] = [\Gamma_1 * \Gamma_3] * \Gamma_2 - \Gamma_1 * [\Gamma_3 * \Gamma_2]$   [13] 


which is evident when one rewrites the *-product in 
suitable gluing operations. 

To understand this theorem, note that the 
equation claims that the lack of associativity in the 
bilinear operation * is invariant under permutation 
of the elements indexed 2,3. This suffices to show 
that the antisymmetrization of this map fulfills a 
Jacobi identity. Hence, we get a Lie algebra $\mathcal{L}$ by 
antisymmetrizing this operation: 


$[\Gamma_1, \Gamma_2] = \Gamma_1 * \Gamma_2 - \Gamma_2 * \Gamma_1$   [14] 
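
The step from the pre-Lie identity [13] to the Jacobi identity can be checked mechanically on any pre-Lie algebra. The sketch below does not implement graph insertion. It uses a much simpler toy product on polynomials, a * b := a' b (an assumption chosen purely for illustration), whose associator a'' b c is symmetric in the last two arguments, exactly as [13] requires. All helper names are hypothetical.

```python
# Polynomials are lists of integer coefficients, index = power of x.

def trim(p):
    """Drop trailing zero coefficients so equal polynomials compare equal."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q, sign=1):
    """Return p + sign*q."""
    out = [0] * max(len(p), len(q))
    for i, c in enumerate(p):
        out[i] += c
    for i, c in enumerate(q):
        out[i] += sign * c
    return trim(out)

def mul(p, q):
    """Ordinary polynomial product."""
    if not p or not q:
        return []
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def deriv(p):
    """Derivative with respect to x."""
    return trim([i * c for i, c in enumerate(p)][1:])

def star(a, b):
    """Toy pre-Lie product a * b := a' b (illustrative assumption)."""
    return mul(deriv(a), b)

def associator(a, b, c):
    """(a * b) * c - a * (b * c), the failure of associativity."""
    return add(star(star(a, b), c), star(a, star(b, c)), sign=-1)

def bracket(a, b):
    """Antisymmetrized product, as in [14]."""
    return add(star(a, b), star(b, a), sign=-1)

def jacobi(a, b, c):
    """Cyclic sum [[a,b],c] + [[b,c],a] + [[c,a],b]."""
    total = []
    for x, y, z in ((a, b, c), (b, c, a), (c, a, b)):
        total = add(total, bracket(bracket(x, y), z))
    return total

a = [1, 2, 0, 3]       # 1 + 2x + 3x^3
b = [0, 1, 4]          # x + 4x^2
c = [5, 0, 1, 0, 2]    # 5 + x^2 + 2x^4

not_associative = associator(a, b, c) != []
pre_lie_holds = associator(a, b, c) == associator(a, c, b)
jacobi_holds = jacobi(a, b, c) == []
print(not_associative, pre_lie_holds, jacobi_holds)
```

The product is not associative, yet its associator is unchanged when the second and third arguments are swapped, and that alone forces the bracket to satisfy Jacobi.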


This Lie algebra is graded and of finite dimension in 
each degree. Let us look at a couple of examples for 
pre-Lie products: 


[Eqns [15]-[20]: diagrammatic examples of pre-Lie products of Feynman graphs; the diagrams are not reproduced here.] 


Together with $\mathcal{L}$ one is led to consider the dual of its 
universal enveloping algebra $U(\mathcal{L})$ using the theorem 
of Milnor and Moore. For this we use the above 
grading by the loop number. 

This universal enveloping algebra $U(\mathcal{L})$ is built 
from the tensor algebra 


$T = \bigoplus_k T^k, \quad T^k = \underbrace{\mathcal{L} \otimes \cdots \otimes \mathcal{L}}_{k\ \text{times}}$   [21] 


by dividing out the ideal generated by the relations 

$a \otimes b - b \otimes a = [a,b] \in \mathcal{L}$   [22] 


Note that in $U(\mathcal{L})$ we have a natural concatenation 
product $m_*$. Furthermore, $U(\mathcal{L})$ carries a natural 
Hopf algebra structure with this product. For that, 
the Lie algebra $\mathcal{L}$ furnishes the primitive elements: 


$\Delta_*(a) = a \otimes 1 + 1 \otimes a, \quad \forall a \in \mathcal{L}$   [23] 


It is, by construction, a connected finitely graded 
Hopf algebra which is cocommutative but not 
commutative. We can then consider its graded 
dual, which will be a Hopf algebra $H(m, \mathbb{I}, \Delta, \bar e)$ 
that is commutative but not cocommutative. One 
finds it upon using a Kronecker pairing 


$\langle Z_\Gamma, \delta_{\Gamma'} \rangle = \begin{cases} 1, & \Gamma = \Gamma' \\ 0, & \text{else} \end{cases}$   [24] 


The space of primitives of $U(\mathcal{L})$ is in one-to-one 
correspondence with the set $\mathrm{Indec}(H)$ of indecomposables 
of $H$, which is the linear span of its 
generators. One finds the following theorem. 


Theorem 2 


«Zr, 0 Zr, -Zr 9 Zr, ôr > = «Zim, > [25] 


CEN FECI DU MNA 
= (Zr OZ i, Z Qj 32 
i») 
arde" ar p e] 


=2 (26) 


H is a graded commutative Hopf algebra which 
suffices to describe renormalization theory, as we 
see in the next section. We have formulated it for 
the superficially divergent 1PI graphs of the theory 
with the understanding that the residues of these 
graphs are in one-to-one correspondence with the 
terms in the Lagrangian of a given theory. Often, 
several terms in a Lagrangian correspond to graphs 
with the same number and type of external legs, but 
correspond to different form-factor projections of 
the graph. In such cases, the above approach can be 
easily adapted by considering suitably colored or 
labeled graphs. A similar remark applies if one 
desires to incorporate renormalization of super- 
ficially convergent Green functions, which requires 
nothing more than the consideration of an easily 
obtained semidirect product of the Lie algebra of 
superficially divergent graphs with the abelian Lie 
algebra of superficially convergent graphs. 


The Principle of Multiplicative Subtraction 


The above algebra structures are available once one 
has decided on the set of 1PI graphs of interest. We 
now use them toward the renormalization of any 
such chosen local quantum field theory. 

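From the above, 1PI graphs $\Gamma$ provide the linear 
generators $\delta_\Gamma$ of the Hopf algebra $H = \bigoplus_{i=0}^{\infty} H_i$, 
where $H_{\rm lin} = \mathrm{span}(\delta_\Gamma)$ and their disjoint union 
provides the commutative product. 

Now let $\Gamma$ be a 1PI graph. We find the Hopf 
algebra $H$ as described above to have a coproduct 
explicitly given as $\Delta: H \to H \otimes H$: 


$\Delta(\Gamma) = \Gamma \otimes 1 + 1 \otimes \Gamma + \sum_{\gamma \subset \Gamma} \gamma \otimes \Gamma/\gamma$   [27] 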


where the sum is over all unions of 1PI superficially 
divergent proper subgraphs, and we extend this 
definition to products of graphs so that we get a 
bialgebra. 

While the Lie bracket inserted graphs into each 
other, the coproduct disentangles them. It is this 
latter operation which is needed in renormalization 
theory: we have to render each subgraph finite before 
we can construct a local counter-term. That is precisely 
what the Hopf algebra structure maps do. 

Having a coproduct, two further structure maps 
of H are immediate: the counit and the antipode. 
The counit $\bar e$ vanishes on any nontrivial Hopf 
algebra element, $\bar e(1) = 1$, $\bar e(X) = 0$. The antipode is 


$S(\Gamma) = -\Gamma - \sum_{\gamma \subset \Gamma} S(\gamma)\,\Gamma/\gamma$   [28] 


We can work out a few coproducts and antipodes as 
follows: 


[Eqns [29]-[34]: diagrammatic examples of $\mathrm{Aug}^{(2)}$ evaluated on Feynman graphs; the diagrams are not reproduced here.] 


We give just one example for an antipode: 


[Eqn [35]: diagrammatic example of an antipode; the diagram is not reproduced here.] 


Note that for each term in the sum $\Delta(\Gamma) = \sum_i \Gamma'_{(i)} \otimes 
\Gamma''_{(i)}$, we have unique gluing data $G_i$ such that 


$\Gamma = \Gamma''_{(i)} \star_{G_i} \Gamma'_{(i)}, \quad \forall i$   [36] 


These gluing data describe the necessary bijections 
to glue the components $\Gamma'_{(i)}$ back into $\Gamma''_{(i)}$ so as to 
obtain $\Gamma$: using them, we can reassemble the whole 
from its parts. Each possible gluing can be interpreted 
as a composition in the insertion operad of 
Feynman graphs. 

We have by now obtained a Hopf algebra 
generated by combinatorial elements, 1PI Feynman 
graphs. Its existence is automatic from the above 
choices of interactions and free fields. What remains 
to be done is a structural analysis of these algebras 
for the renormalizable theories we are confronted 
with in four spacetime dimensions. 

The assertion underlying perturbation theory is 
the fact that meaningful approximations to physical 
observable quantities can be found by evaluating 
these graphs using Feynman rules. 

First, as disjoint scattering processes give rise to 
independent amplitudes, one is led to the study of 
characters of the Hopf algebra, maps $\phi: H \to V$ such 
that $\phi \circ m = m_V(\phi \otimes \phi)$. 

Such maps assign to any element in the Hopf 
algebra an element in a suitable target space V. 
The study of tree-level amplitudes in lowest-order 
perturbation theory justifies assigning to each edge 
a propagator and to each elementary scattering 
process a vertex, which define the Feynman rules 
$\phi(\mathrm{res}(\Gamma))$ and the underlying Lagrangian, on the 
level of residues of these very graphs. Graphs are 
constructed from edges and vertices which are 
provided precisely by the residues of those diver- 
gent graphs, hence one is led to assign to each 
Feynman graph an evaluation in terms of an 
integral over the continuous quantum numbers 
assigned to edges or vertices, which leads to the 
familiar integrals over momenta in closed loops 
mentioned before. 

Then, with the Feynman rules providing a 
canonical character $\phi$, we will have to make one 
further choice: a renormalization scheme. The need 
for such a choice is no surprise: after all we are 
eliminating short-distance singularities in the graphs, 
which renders their remaining finite part ambiguous, 
albeit in a most interesting manner. 

Hence, we choose a map $R: V \to V$, from which 
we obviously demand that it does not modify the 
UV-singular structure, and furthermore that it obeys 


$R(xy) + R(x)R(y) = R(R(x)y) + R(xR(y))$   [37] 


which guarantees the multiplicativity of renormali- 
zation and is at the heart of the Birkhoff decom- 
position, which emerges below: it tells us that 
elements in V split into two parallel subalgebras 
given by the image and kernel of R. Algebras for 
which such a map exists are known as Rota-Baxter 
algebras. The role Rota-Baxter algebras play for 
associative algebras is similar to the role Yang- 
Baxter algebras play for Lie algebras. The structure 
of these algebras allows one to connect renormaliza- 
tion theory to integrable systems. In addition, most 
of the results obtained initially for a specific 
renormalization scheme, such as minimal subtrac- 
tion, can also be obtained, in general, upon a 
structural analysis of the corresponding Rota—Baxter 
algebras. 
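
For minimal subtraction, R is the projection of a Laurent series in the regulator onto its pole part. The following sketch checks relation [37] on a sample pair and confirms that the image and kernel of R are each closed under multiplication. It assumes truncated Laurent polynomials stored as dictionaries from exponents to rational coefficients, a representation chosen here for illustration and not taken from the text; the function names are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

def add(p, q):
    """Sum of two Laurent polynomials {exponent: coefficient}."""
    out = defaultdict(Fraction)
    for poly in (p, q):
        for k, v in poly.items():
            out[k] += v
    return {k: v for k, v in out.items() if v != 0}

def mul(p, q):
    """Product of two Laurent polynomials."""
    out = defaultdict(Fraction)
    for i, a in p.items():
        for j, b in q.items():
            out[i + j] += a * b
    return {k: v for k, v in out.items() if v != 0}

def R(p):
    """Minimal subtraction: keep only the pole part (negative powers)."""
    return {k: v for k, v in p.items() if k < 0}

def finite_part(p):
    """The complementary projection id - R (non-negative powers)."""
    return {k: v for k, v in p.items() if k >= 0}

x = {-2: Fraction(1), -1: Fraction(3), 0: Fraction(-1, 2), 1: Fraction(2)}
y = {-1: Fraction(5), 0: Fraction(1), 2: Fraction(-7, 3)}

# Relation [37]: R(xy) + R(x)R(y) = R(R(x)y) + R(xR(y))
lhs = add(R(mul(x, y)), mul(R(x), R(y)))
rhs = add(R(mul(R(x), y)), R(mul(x, R(y))))
rota_baxter_holds = lhs == rhs

# Image and kernel of R are both closed under multiplication.
image_closed = all(k < 0 for k in mul(R(x), R(y)))
kernel_closed = all(k >= 0 for k in mul(finite_part(x), finite_part(y)))
print(rota_baxter_holds, image_closed, kernel_closed)
```

The two closure checks are the concrete content of the statement that V splits into two subalgebras given by the image and kernel of R.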

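To see how all the above comes together in 
renormalization theory, we define a further character 
$S_R^\phi$ that deforms $\phi \circ S$ slightly and delivers the 
counter-term for $\Gamma$ in the renormalization scheme $R$: 


$S_R^\phi(\Gamma) = -R\, m_V(S_R^\phi \otimes \phi \circ P)\Delta(\Gamma) = -R[\phi(\Gamma)] - R\Big[\sum_{\gamma \subset \Gamma} S_R^\phi(\gamma)\,\phi(\Gamma/\gamma)\Big]$   [38] 


which should be compared with the undeformed 


$\phi \circ S(\Gamma) = -m_V(\phi \circ S \otimes \phi \circ P)\Delta(\Gamma) = -\phi(\Gamma) - \sum_{\gamma \subset \Gamma} \phi \circ S(\gamma)\,\phi(\Gamma/\gamma)$   [39] 


The fact that $R$ is a Rota-Baxter map ensures that 
$S_R^\phi$ is an element of the character group $G$ of the 
Hopf algebra, $S_R^\phi \in \mathrm{Spec}(G)$. Note that we have now 
determined the modified Lagrangian: 


$Z^r = S_R^\phi(\Gamma^r)$   [40] 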


The classical results of renormalization theory 
follow immediately using this group structure: we 
obtain the renormalization of T by the application 
of a renormalized character 


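$S_R^\phi \star \phi(\Gamma) = m_V(S_R^\phi \otimes \phi)\Delta(\Gamma)$   [41] 


and Bogoliubov’s $\bar R$ operation as 


$\bar R(\Gamma) = m_V(S_R^\phi \otimes \phi)(\mathrm{id} \otimes P)\Delta(\Gamma) = \phi(\Gamma) + \sum_{\gamma \subset \Gamma} S_R^\phi(\gamma)\,\phi(\Gamma/\gamma)$   [42] 


so that 


$S_R^\phi \star \phi(\Gamma) = \bar R(\Gamma) + S_R^\phi(\Gamma)$   [43] 

Here, $S_R^\phi \star \phi$ is an element in the group of characters 
of the Hopf algebra, with the group law given by the 
convolution 

$\phi_1 \star \phi_2 = m_V \circ (\phi_1 \otimes \phi_2) \circ \Delta$   [44] 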

so that the coproduct, counit, and coinverse (the 
antipode) give the product, unit, and inverse of this 
group, as befits a Hopf algebra. This Lie group has 
the previous Lie algebra $\mathcal{L}$ of graph insertions as its 
Lie algebra: $\mathcal{L}$ exponentiates to $G$. 
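
The recursion [38], together with [42] and [43], can be traced on a toy example. The sketch below assumes a family of purely nested ("ladder") divergences lambda_n with coproduct Delta(lambda_n) = sum over k of lambda_k (x) lambda_{n-k}, takes R to be minimal subtraction, and uses made-up Laurent polynomials `toy_phi(n)` in place of actual Feynman integrals. The ladder family, the sample values and all function names are illustrative assumptions, not part of the text.

```python
from collections import defaultdict
from fractions import Fraction

def add(p, q):
    out = defaultdict(Fraction)
    for poly in (p, q):
        for k, v in poly.items():
            out[k] += v
    return {k: v for k, v in out.items() if v != 0}

def mul(p, q):
    out = defaultdict(Fraction)
    for i, a in p.items():
        for j, b in q.items():
            out[i + j] += a * b
    return {k: v for k, v in out.items() if v != 0}

def neg(p):
    return {k: -v for k, v in p.items()}

def R(p):
    """Minimal subtraction: projection onto the pole part."""
    return {k: v for k, v in p.items() if k < 0}

ONE = {0: Fraction(1)}

def toy_phi(n):
    """Hypothetical regularized value of the n-fold ladder (poles up to order n)."""
    if n == 0:
        return dict(ONE)
    return {k: Fraction(n + k + 2, n) for k in range(-n, 2)}

def birkhoff(max_n):
    """Counter-terms S_R(lambda_n) and renormalized values, following [38], [42], [43]."""
    counter = {0: dict(ONE)}
    renorm = {0: dict(ONE)}
    for n in range(1, max_n + 1):
        # Bogoliubov R-bar: phi(lambda_n) + sum over proper subdivergences.
        bar = toy_phi(n)
        for k in range(1, n):
            bar = add(bar, mul(counter[k], toy_phi(n - k)))
        counter[n] = neg(R(bar))            # [38]: S_R = -R[R-bar]
        renorm[n] = add(bar, counter[n])    # [43]: S_R * phi = R-bar + S_R
    return counter, renorm

def convolution(counter, n):
    """(S_R * phi)(lambda_n) computed directly from the full coproduct, as in [41]."""
    total = {}
    for k in range(n + 1):
        total = add(total, mul(counter[k], toy_phi(n - k)))
    return total

counter, renorm = birkhoff(4)
for n in range(1, 5):
    print(n, sorted(counter[n]), sorted(renorm[n]))
```

In this toy setting the counter-terms contain only poles, the renormalized values contain none, and the recursive and convolution forms agree term by term.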

What we have achieved above is a local renormalization 
of quantum field theory. Let $M^r$ be a 
monomial in the Lagrangian $L$ of degree $\omega_r$: 


$M^r = D_r\{\phi\}$   [45] 


Then one can prove, using the Hochschild cohomol- 
ogy of H: 


Theorem 3 (Locality) 


$Z^r D_r\{\phi\} = D_r Z^r\{\phi\}$   [46] 


that is, renormalization commutes with infinitesimal 
spacetime variations of the fields. 


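We can now work out the renormalization of a 
Feynman graph $\Gamma$. In the example treated here, $\Gamma$ has a 
divergent subgraph $\gamma$ that occurs in two ways, with 
cograph $\Gamma/\gamma$ (the diagrams themselves are not reproduced): 


$\Delta(\Gamma) = \Gamma \otimes 1 + 1 \otimes \Gamma + 2\,\gamma \otimes \Gamma/\gamma$   [47] 

$\bar R(\Gamma) = \phi(\Gamma) + 2\,S_R^\phi(\gamma)\,\phi(\Gamma/\gamma)$   [48] 

$\phantom{\bar R(\Gamma)} = \phi(\Gamma) - 2\,R[\phi(\gamma)]\,\phi(\Gamma/\gamma)$   [49] 

$S_R^\phi(\Gamma) = -R[\bar R(\Gamma)]$   [50] 

$S_R^\phi \star \phi(\Gamma) = [\mathrm{id} - R]\,(\bar R(\Gamma))$   [51] 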


The formulas [47]-[51] are given in their recursive 
form. Zimmermann’s original forest formula solving 
this recursion is obtained when we trace our 
considerations back to the fact that the coproduct 
can be written in nonrecursive form as a sum over 
forests, and similarly for the antipode. 


Diffeomorphisms of Physical Parameters 


In the above, we have effectively obtained a Birkhoff 
decomposition of the Feynman rules $\phi \in \mathrm{Spec}(G)$ 
into two characters, $\phi_-^R = S_R^\phi \in \mathrm{Spec}(G)$ and 
$\phi_+^R = S_R^\phi \star \phi \in \mathrm{Spec}(G)$, for any Rota-Baxter map $R$. 
Thanks to Atkinson’s theorem, this is possible for 
any renormalization scheme $R$. For the minimal 
subtraction scheme, it amounts to the decomposition 
of the Laurent series $\phi(\Gamma)(\varepsilon)$, which has poles of 
finite order in the regulator $\varepsilon$, into a part holomorphic 
at the origin and a part holomorphic at 
complex infinity. This has a particularly nice 
geometric interpretation upon considering the 
Birkhoff decomposition of a loop around the origin, 
providing the clutching data for the two half-spheres 
defined by that very loop. 

Whilst in this manner a satisfying understanding 
of perturbative renormalization is obtained, the 
character group G remains rather poorly under- 
stood. On the other hand, renormalization can be 
captured by the study of diffeomorphisms of 
physical parameters as, by definition, the range of 
allowed modification in renormalization theory is 
determined by the variation of the coefficients of 
monomials $\hat\phi(r)$ of the underlying Lagrangian 


$L = \sum_r Z^r\,\hat\phi(r)$   [52] 


Thus, one desires to obtain the whole Birkhoff 
decomposition at the level of diffeomorphisms of the 
coupling constants. 

The crucial step toward that goal is to realize the 
role of a standard quantum field-theoretic formula 
of the form 


$g_{\rm new} = Z^g\, g_{\rm old}$   [53] 

where 

$Z^g = \dfrac{Z^v}{\prod_{e \in \mathrm{res}(v)^{[1]}_{\rm ext}} \sqrt{Z^e}}$   [54] 


for some vertex v, which obtains the new coupling 
in terms of a diffeomorphism of the old. This 
formula provides, indeed, a Hopf algebra homo- 
morphism from the Hopf algebra of diffeomorph- 
isms to the Hopf algebra of Feynman graphs, 
regarding $Z^g$ (a series over counter-terms for all 
1PI graphs with the external leg structure corresponding 
to the coupling $g$), in two different ways: it 
is, at the same time, a formal diffeomorphism in the 
coupling constant $g_{\rm old}$ and a formal series in Feynman 
graphs. As a consequence, there are two 
competing coproducts acting on $Z^g$. That both give 
the same result defines the required homomorphism, 


which transposes to a homomorphism from the 
largely unknown group of characters of H to the 
one-dimensional diffeomorphisms of this coupling. 

In summary, one finds that a couple of basic 
facts enable one to make a transition from the 
abstract group of characters of a Hopf algebra of 
Feynman graphs (which, incidentally, equals the Lie 
group assigned to the Lie algebra with universal 
enveloping algebra the dual of this Hopf algebra) to 
the rather concrete group of diffeomorphisms of 
physical observables. These steps are given as 
follows: 


• Recognize that Z factors are given as counter-terms 
over a formal series of graphs starting with 
1, graded by powers of the coupling, hence 
invertible. 

• Recognize the series $Z^g$ as a formal diffeomorphism, 
with Hopf algebra coefficients. 

• Establish that the two competing Hopf algebra 
structures of diffeomorphisms and graphs are 
consistent in the sense of a Hopf algebra 
homomorphism. 

• Show that this homomorphism transposes to a Lie 
algebra and hence Lie group homomorphism. 


The effective coupling $g_{\rm eff}(\varepsilon)$ now allows for a 
Birkhoff decomposition in the space of formal 
diffeomorphisms. 


Theorem 4 Let the unrenormalized effective coupling 
constant $g_{\rm eff}(\varepsilon)$, viewed as a formal power 
series in $g$, be considered as a loop of formal 
diffeomorphisms and let $g_{\rm eff}(\varepsilon) = (g_{{\rm eff}-})^{-1}(\varepsilon)\, g_{{\rm eff}+}(\varepsilon)$ 
be its Birkhoff decomposition in the group of formal 
diffeomorphisms. Then the loop $g_{{\rm eff}-}(\varepsilon)$ is the bare 
coupling constant and $g_{{\rm eff}+}(0)$ is the renormalized 
effective coupling. 


The above results hold as they stand for any 
massless theory which provides a single coupling 
constant. If there are multiple interaction terms 
in the Lagrangian, one finds similar results relat- 
ing the group of characters of the corresponding 
Hopf algebra to the group of formal diffeomorph- 
isms in the multidimensional space of coupling 
constants. 


The Role of Hochschild Cohomology 


The Hochschild cohomology of the combinatorial 
Hopf algebras which we discuss here plays three 
major roles in quantum field theory: 


1. it allows one to prove locality from the accompanying 
filtration by the augmentation degree 
coming from the kernels $\ker \mathrm{Aug}^{(k)}$; 

2. it allows one to write the quantum equations of 
motion in terms of the Hopf algebra primitives, 
elements in $H_{\rm lin} \cap \{\ker \mathrm{Aug}^{(2)} / \ker \mathrm{Aug}^{(1)}\}$; and 

3. it identifies the relevant sub-Hopf algebras 
formed by time-ordered products. 


Before we discuss these properties, let us first 
introduce the relevant Hochschild cohomology. 


Hochschild Cohomology of Bialgebras 


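Let $(A, m, \mathbb{I}, \Delta, \bar e)$ be a bialgebra, as before. We 
regard linear maps $L: A \to A^{\otimes n}$ as n-cochains and 
define a coboundary map $b$, $b^2 = 0$, by 


$bL := (\mathrm{id} \otimes L) \circ \Delta + \sum_{i=1}^{n} (-1)^i \Delta_i \circ L + (-1)^{n+1} L \otimes \mathbb{I}$   [55] 


where $\Delta_i$ denotes the coproduct applied to the $i$th 
factor in $A^{\otimes n}$, which defines the Hochschild cohomology 
of $A$. 

For the case $n = 1$, for $L: A \to A$, [55] reduces to 


$bL = (\mathrm{id} \otimes L) \circ \Delta - \Delta \circ L + L \otimes \mathbb{I}$   [56] 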
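
Written out, the specialization of [55] to n = 1 and the resulting closedness condition read as follows. This is only a restatement of the formulas above, with the sum over i collapsing to its single term i = 1; the last line is the form in which the 1-cocycle condition is used for grafting operators such as $B_+$.

```latex
\begin{aligned}
bL &= (\mathrm{id}\otimes L)\circ\Delta
      + (-1)^{1}\,\Delta_1\circ L
      + (-1)^{2}\,L\otimes\mathbb{I} \\
   &= (\mathrm{id}\otimes L)\circ\Delta - \Delta\circ L + L\otimes\mathbb{I}, \\[4pt]
bL = 0 \;&\Longleftrightarrow\;
   \Delta\circ L = L\otimes\mathbb{I} + (\mathrm{id}\otimes L)\circ\Delta .
\end{aligned}
```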


The category of objects $(A, C)$, which consists of 
a commutative bialgebra $A$ and a Hochschild 
1-cocycle $C$ on $A$, has an initial object $(H_{\rm rt}, B_+)$, 
where $H_{\rm rt}$ is the Hopf algebra of (nonplanar) rooted 
trees, and the closed but nonexact 1-cocycle $B_+$ 
grafts a product of rooted trees together at a new 
root as described below. 

The higher ($n > 1$) Hochschild cohomology of $H_{\rm rt}$ 
vanishes, but in what follows, the closedness of $B_+$ 
will turn out to be crucial. 


The Hopf Algebra of Rooted Trees 


A rooted tree is a simply connected contractible 
compact graph with a distinguished vertex, the root. 
A forest is a disjoint union of rooted trees. 
Isomorphisms of rooted trees or forests are iso- 
morphisms of graphs preserving the distinguished 
vertex/vertices. Let $t$ be a rooted tree with root $o$. 
The choice of $o$ determines an orientation of the 
edges of $t$, away from the root, say. Forests are 
graded by the number of vertices they contain. 

Let $H_{\rm rt}$ be the free commutative algebra generated 
by rooted trees. The commutative product in $H_{\rm rt}$ 
corresponds to the disjoint union of trees, such 
that monomials in $H_{\rm rt}$ are scalar multiples of forests. 


We demand that the linear operator $B_+$ on $H_{\rm rt}$, 
defined by 


$B_+(\mathbb{I}) = \bullet$   [57] 


$B_+(t_1 \cdots t_n) =$ the rooted tree obtained by connecting a new root to the roots of $t_1, \ldots, t_n$   [58] 


is a Hochschild 1-cocycle, which makes $H_{\rm rt}$ a Hopf 
algebra. The resulting coproduct can be described as 
follows: 


$\Delta(t) = \mathbb{I} \otimes t + t \otimes \mathbb{I} + \sum_{\text{adm. } c} P_c(t) \otimes R_c(t)$   [59] 


where the sum goes over all admissible cuts of the 
tree $t$. Such a cut of $t$ is a nonempty set of edges of $t$ 
that are to be removed. The forest which is 
disconnected from the root upon removal of those 
edges is denoted by $P_c(t)$ and the part which remains 
connected to the root is denoted by $R_c(t)$. A cut $c(t)$ 
is admissible if, for each vertex $l$ of $t$, it contains at 
most one edge on the path from $l$ to the root. 
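
The admissible-cut description of the coproduct [59] is easy to enumerate by machine. The sketch below computes the coproduct on small trees and checks the 1-cocycle property of $B_+$ on a few forests. It assumes one particular encoding, chosen here for illustration: a nonplanar rooted tree is the sorted tuple of its subtrees, and a forest is a sorted tuple of trees. All function names are hypothetical.

```python
from collections import Counter
from itertools import product

def forest(trees):
    """Canonical (sorted) form of a forest; () is the empty forest, i.e. the unit."""
    return tuple(sorted(trees))

def B_plus(f):
    """Graft the trees of forest f onto a new root.
    In this encoding the resulting tree is the sorted tuple of its subtrees."""
    return forest(f)

def cuts(t):
    """All admissible cuts of tree t, including the empty cut,
    as pairs (pruned forest P_c, trunk tree R_c)."""
    options = []
    for child in t:
        # Either cut the edge above this child (the whole child is pruned) ...
        opts = [((child,), None)]
        # ... or keep the edge and cut (possibly not at all) inside the child.
        opts += cuts(child)
        options.append(opts)
    result = []
    for choice in product(*options):
        pruned = forest(tr for p, _ in choice for tr in p)
        trunk = forest(r for _, r in choice if r is not None)
        result.append((pruned, trunk))
    return result

def coproduct(t):
    """Delta(t) as a Counter over (left forest, right forest), following [59]."""
    terms = Counter()
    terms[((t,), ())] += 1                  # t (x) I
    for pruned, trunk in cuts(t):           # the empty cut gives I (x) t
        terms[(pruned, (trunk,))] += 1
    return terms

def coproduct_forest(f):
    """Extend Delta multiplicatively to forests."""
    total = Counter({((), ()): 1})
    for t in f:
        new = Counter()
        for (l1, r1), c1 in total.items():
            for (l2, r2), c2 in coproduct(t).items():
                new[(forest(l1 + l2), forest(r1 + r2))] += c1 * c2
        total = new
    return total

def cocycle_rhs(f):
    """B_+(f) (x) I + (id (x) B_+) Delta(f)."""
    rhs = Counter({((B_plus(f),), ()): 1})
    for (left, right), c in coproduct_forest(f).items():
        rhs[(left, (B_plus(right),))] += c
    return rhs

dot = ()                 # single vertex
ladder2 = (dot,)         # two vertices, one edge
cherry = (dot, dot)      # root with two leaves

for (left, right), c in sorted(coproduct(cherry).items()):
    print(c, left, right)

test_forests = [(), (dot,), (dot, dot), forest([dot, ladder2]), (cherry, cherry)]
cocycle_ok = all(coproduct(B_plus(f)) == cocycle_rhs(f) for f in test_forests)
print(cocycle_ok)
```

For the tree with two leaves this yields the two trivial terms, the term (single vertex) (x) (two-vertex ladder) with multiplicity 2, and the term (two single vertices) (x) (single vertex), one for each admissible cut.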

This Hopf algebra of nonplanar rooted trees is the 
universal object after which all such commutative 
Hopf algebras $H$ providing pairs $(H, B)$, for $B$ a 
Hochschild 1-cocycle, are formed. 


Theorem 5 The pair $(H_{\rm rt}, B_+)$, unique up to 
isomorphism, is universal among all such pairs. In 
other words, for any pair $(H,B)$ where $H$ is a 
commutative Hopf algebra and $B$ a closed nonexact 
1-cocycle, there exists a unique Hopf algebra 
morphism $\rho: H_{\rm rt} \to H$ such that $B \circ \rho = \rho \circ B_+$. 


This theorem suggests that we investigate the 
Hochschild cohomology of the Hopf algebras of 1PI 
Feynman graphs. It clarifies the structure of 1PI 
Green functions. 


The Roles of Hochschild Cohomology 


The Hochschild cohomology of the Hopf algebras of 
1PI graphs sheds light on the structure of 1PI Green 
functions in at least four different ways: 


• it gives a coherent proof of locality of counter-terms: 
the very fact that 


$[Z^r, D_r] = 0$   [60] 


means that the coefficients in the Lagrangian 
remain independent of momenta, and hence the 
Lagrangian remains a polynomial expression in 
fields and their derivatives; 

• the quantum equations of motion take a very 
succinct form, identifying the Dyson kernels with 
the primitives of the Hopf algebra; 

• sub-Hopf algebras emerge from the study of the 
Hochschild cohomology, which connects the representation 
theory of these Hopf algebras to the 
structure of theories with internal symmetries; and 

• these Hopf algebras are intimately connected to 
the structure of transcendental functions, such as 
the generalized polylogarithms, which play a 
prominent role these days ranging from applied 
particle physics to recent developments in 
mathematics. 


To determine the Hochschild 1-cocycles of some 
Feynman graph Hopf algebra $H$, one determines 
first the primitive graphs $\gamma$ of the Hopf algebra, 
which, by definition, fulfill the condition 


$\Delta(\gamma) = \gamma \otimes \mathbb{I} + \mathbb{I} \otimes \gamma$   [61] 


Using the pre-Lie product above, one then determines 
the maps 


$B_+^\gamma : H \to H_{\rm lin}$   [62] 

such that 

$\Delta B_+^\gamma(h) = B_+^\gamma(h) \otimes \mathbb{I} + (\mathrm{id} \otimes B_+^\gamma)\Delta(h)$   [63] 


where $B_+^\gamma(h) = \sum_\Gamma n(\gamma, h, \Gamma)\,\Gamma$. The coefficients 
$n(\gamma, h, \Gamma)$ are closely related to the section coefficients 
noted earlier. 

Using the definition of the Bogoliubov map $\bar R$, this 
immediately shows that 


S$ (Bi(b)) = f D, — c, br(b) [64] 


which proves locality of counter-terms upon recognizing 
that $B_+^\gamma$ increases the augmentation degree. 
Here, the insertion of the functions for the subgraph 
is achieved using the relevant gluing data of [36]. 

To recover the quantum equations of motion from 
the Hochschild cohomology, one proves that 


F t 
[! 214 2. Syma) BY (X.) [65] 


where 


X, = lI I [66] 


Le 
€^] lint vey 


has the required solution. Upon application of the 
Feynman rules, the maps $B_+^\gamma$ turn into the integral 
kernels of the usual Dyson-Schwinger equations. 
This allows for new nonperturbative approaches 
which are a current theme of investigation. 

Finally, we note that the 1-cocycles introduced 
above allow one to determine sub-Hopf algebras of 
the form 


A(z) =>) Pch @g [67] 


where the G are defined in eqn [3]. These algebras 
do not necessitate the considerations of single 


Feynman graphs any longer, but allow one to 
establish renormalization directly for the sum of all 
graphs at a given loop order. Hence, they establish a 
Hopf algebra structure on time-ordered products in 
momentum space. For theories with internal sym- 
metries, one expects and indeed finds that the 
existence of these subalgebras establishes relations 
between graphs that are the same as the Slavnov-Taylor 
identities between the couplings in the Lagrangian. 


Outlook 


Thanks to the Hopf and Lie algebra structures 
described above, quantum field theory has started to 
reveal its internal mathematical structure in recent 
years, which connects it to a motivic theory and 
arithmetic geometry. Conceptually, quantum field 
theory has been the most sophisticated means by 
which a physicist can describe the character of the 
physical law. We have slowly begun to understand 
that, in its short-distance singularities, it 
encapsulates concepts of matching beauty. We can 
indeed expect local point-particle quantum field 
theory to remain a major topic of mathematical 
physics investigations in the foreseeable future. 


Introduction 


Quantum groups are a remarkable generalization of 
conventional groups using an algebraic language by 
now quite well known to mathematical physicists. 
This language is first and foremost the concept of a 
“Hopf algebra.” In fact, the axioms of a Hopf 
algebra are so attractive from a mathematical point 
of view that they were proposed in the 1940s long 
before the advent of truly representative examples, 
which did not come until the 1980s (from mathe- 
matical physics). Until then, they were used mainly 
by mathematicians as a way for redoing group 
theory and Lie algebra theory in a more uniform 
way. 

It is remarkable that at least three points of view 
lead to the same axioms of a Hopf algebra: 


1. Generalized symmetry A generalization of a 
usual group algebra or enveloping algebra of a 
Lie algebra that can nevertheless act on other 
algebraic objects. The structure that controls this 
is the “coproduct” $\Delta: H \to H \otimes H$, while the 
group or Lie structure is encoded in the algebra 
$H$, which is typically not changed up to isomorphism. 
$\Delta$ allows $H$ to act on tensor products 
and this is needed to define what it means, for 
example, for a product $A \otimes A \to A$ of an algebra 
to be an intertwiner. The usual flip map between 
two representations $V \otimes W \to W \otimes V$ is not 
typically an intertwiner any more; instead, that 
is provided by an R-matrix solving the Yang-Baxter 
equations (YBE). 

2. Noncommutative geometry A generalization of 
the coordinate algebra of functions on a conventional 
group to allow noncommutative or “quantum” 
coordinate algebras. Here the group 
structure is encoded in a coproduct $\Delta: H \to H \otimes H$ 
in a way which would, in the case of functions 
on a group, be defined by the group product. It is 
typically not changed, the change being in the 
algebra. 

3. Duality An object that admits observer-observed 
duality or Fourier transform. Such a 
duality is known for abelian groups, lost for 
nonabelian groups but re-emerges for Hopf 
algebras. If there is to be an algebra with product 
$H \otimes H \to H$, then there should also be a 
“coproduct” $\Delta: H \to H \otimes H$ to maintain the 
duality symmetry. Then a suitable dual space 
$H^*$ is also a Hopf algebra, with the roles of 
product and coproduct interchanged. 


In line with these main ideas are three known classes 
of true quantum groups, and these remain the main 
types of example at the time of writing: the q-deformed 
enveloping algebras $U_q(\mathfrak{g})$ of Drinfeld and Jimbo, their 
duals as quantizations of the Drinfeld-Sklyanin 
Poisson bracket on a simple Lie group (both of these 
arising from quantum inverse scattering but also in 
the case of $C_q[SU_2]$ from C*-algebras) and the 
bicrossproduct quantum groups based on Lie group 
factorizations (arising from ideas for Planck-scale 
physics and quantum gravity). The latter are self-dual 
and hence are both generalized symmetries and 
noncommutative or quantum geometries at the same 
time. The impact of such quantum groups has been 
very far reaching from a mathematician's point 
of view, spanning revolutions in the theory of knot 
and 3-manifold invariants, Poisson geometry, new 
directions in noncommutative geometry, to name 
some. In physics they are, at the time of writing, 
beginning seriously to be applied in a variety of 
contexts beyond the original ones, such as in book- 
keeping overlapping divergences in general quantum 
field theories, quantum computing, and construction 
of anyons. This article will mention some of these, but 
just as groups have many different roles in physics, 
one can expect that quantum groups and variants of 
them can and will have diverse roles as well. What 
follows is a short overview. 


Hopf Algebras and First Examples 


The general theory works over any field k but (to be 
concrete) we write our examples over C; one can 
also have examples over, say, the field $\mathbb{Z}_2$ of two 
elements. A Hopf algebra then is: 


1. An algebra $H$ with unit which is also a 
“coalgebra” with counit, that is, there are maps 
$\Delta: H \to H \otimes H$, $\epsilon: H \to k$ obeying: 


$(\Delta \otimes \mathrm{id})\Delta = (\mathrm{id} \otimes \Delta)\Delta$ 
$(\epsilon \otimes \mathrm{id})\Delta = (\mathrm{id} \otimes \epsilon)\Delta = \mathrm{id}$ 


2. $\Delta$, $\epsilon$ should be algebra homomorphisms. 

3. There should be a map $S: H \to H$ called the 
antipode or “linearized inverse” obeying 


$\cdot(\mathrm{id} \otimes S)\Delta = \cdot(S \otimes \mathrm{id})\Delta = 1\epsilon$ 


If the third axiom is not obeyed one has a 
“quantum semigroup” or “bialgebra.” Note also 
that $S$ looks nothing like a usual inverse and it is 
not, yet it plays the same role. For example, we can 
define conjugation or the “adjoint action” of any 
Hopf algebra on itself by 


$\mathrm{Ad}_a(b) = \sum a_{(1)}\, b\, S a_{(2)}, \quad \Delta a = \sum a_{(1)} \otimes a_{(2)}$ 


where we use here the “Sweedler notation” for $\Delta a$, a 
sum of unspecified pieces in $H \otimes H$. Moreover, if it 
exists, then $S$ is unique and (it can be shown) 
$S(ab) = S(b)S(a)$ for all $a, b \in H$, just like an inverse. 

The self-duality of these axioms is evident from 
the first one: a coalgebra is just an algebra with its 
product map H & H — H, unit element (viewed as a 
map k — H sending 1 to 1) and the associativity and 
unity axioms all written backwards. Meanwhile, 
the middle axiom means in explicit terms 
A(ab) = (A^a)(Ab),c(ab) —-«e(a)e(b) for all a,be H 
and A(1)—14&1,«(1) — 1. This may not look self- 
dual but it is equivalent to saying that the product 
and unit are coalgebra homomorphisms. Indeed, if 
one takes the trouble to write out all the axioms as 
commutative diagrams, the set of axioms is invar- 
iant under arrow reversal. Such arrow reversal can 
also be concretely implemented, for example, by 
taking adjoints. Thus, the coproduct dualizes to a
map $(H \otimes H)^* \to H^*$ and since $H^* \otimes H^* \subseteq (H \otimes H)^*$
we have a product on the dual $H^*$. If the dual space
is defined correctly, one also has a coproduct by 
dualizing the product, etc. One says that two Hopf 
algebras H, H' are “in duality” if their maps are 
adjoint to each other in such a way. 

The role of quantum groups as generalized 
symmetries is typified by the following examples. 
Thus, let G be a group; then its group algebra CG 
defined as a vector space (written here over C) with 
basis identified with G and product given by the 
group product extended linearly, is a Hopf algebra 
with 


$$\Delta g = g \otimes g, \quad \epsilon g = 1, \quad S g = g^{-1}, \quad \forall g \in G$$
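
As a small sanity check of the group algebra example, the following sketch (added here, not part of the original article) stores elements of $\mathbb{C}G$ as dictionaries and uses $\Delta g = g \otimes g$, $Sg = g^{-1}$ to evaluate the adjoint action and the antipode. The choice $G = S_3$, the integer coefficients and all function names are illustrative assumptions.

```python
from itertools import permutations

GROUP = list(permutations(range(3)))   # the six elements of S_3
IDENTITY = (0, 1, 2)

def compose(p, q):
    """Group product in S_3: (p*q)(j) = p[q[j]]."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def multiply(u, v):
    """Product in the group algebra; elements are dicts {group element: coefficient}."""
    out = {}
    for g, cg in u.items():
        for h, ch in v.items():
            k = compose(g, h)
            out[k] = out.get(k, 0) + cg * ch
    return {k: c for k, c in out.items() if c != 0}

def antipode(u):
    """S g = g^{-1}, extended linearly."""
    out = {}
    for g, c in u.items():
        k = inverse(g)
        out[k] = out.get(k, 0) + c
    return {k: c for k, c in out.items() if c != 0}

def adjoint(u, b):
    """Ad_u(b) = sum u_(1) b S(u_(2)), with Delta g = g (x) g on basis elements."""
    out = {}
    for g, c in u.items():
        term = multiply(multiply({g: 1}, b), {inverse(g): 1})
        for k, ck in term.items():
            out[k] = out.get(k, 0) + c * ck
    return {k: c for k, c in out.items() if c != 0}

# Sample elements (arbitrary choices for illustration).
g, h = GROUP[1], GROUP[3]
u = {GROUP[1]: 2, GROUP[4]: -1}
v = {GROUP[2]: 3, GROUP[5]: 1}
```

On a single group element the adjoint action reduces to ordinary conjugation, and on general elements one can compare `antipode(multiply(u, v))` with `multiply(antipode(v), antipode(u))`.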


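Likewise, if $\mathfrak{g}$ is a Lie algebra, then its universal
enveloping algebra $U(\mathfrak{g})$ generated by $\mathfrak{g}$ is a Hopf
algebra with

$$\Delta \xi = \xi \otimes 1 + 1 \otimes \xi, \quad \epsilon \xi = 0, \quad S\xi = -\xi, \quad \forall \xi \in \mathfrak{g}$$

The two examples are related: if one informally
allows exponentials, then $g = e^{\xi}$ has coproduct

$$\Delta e^{\xi} = e^{\Delta \xi} = e^{\xi} \otimes e^{\xi}$$

using axiom 2 and that $\xi \otimes 1$, $1 \otimes \xi$ commute in the
tensor product algebra.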
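
The relation between the additive and the group-like coproduct can be checked in a matrix representation. The sketch below (an added illustration, not from the original text) takes a nilpotent 3x3 matrix as a stand-in for $\xi$, so that the exponential series terminates and exact rational arithmetic can be used; the matrix entries and helper names are assumptions made for the example.

```python
from fractions import Fraction

def kron(a, b):
    """Kronecker product of square matrices given as lists of rows."""
    m = len(b)
    size = len(a) * m
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(size)]
            for i in range(size)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(a, b)]

def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def exp_nilpotent(x):
    """Exact exponential of a nilpotent matrix: the Taylor series terminates."""
    n = len(x)
    result, term = identity(n), identity(n)
    for k in range(1, 2 * n):
        term = [[entry / k for entry in row] for row in matmul(term, x)]
        result = add(result, term)
    return result

# A strictly upper triangular (hence nilpotent) matrix standing in for xi.
xi = [[Fraction(0), Fraction(1), Fraction(2)],
      [Fraction(0), Fraction(0), Fraction(3)],
      [Fraction(0), Fraction(0), Fraction(0)]]
one = identity(3)

# Additive coproduct: Delta(xi) = xi (x) 1 + 1 (x) xi.
delta_xi = add(kron(xi, one), kron(one, xi))

group_like = kron(exp_nilpotent(xi), exp_nilpotent(xi))
exp_of_delta = exp_nilpotent(delta_xi)
```

Here `exp_of_delta` and `group_like` agree because the two summands of `delta_xi` commute, which is exactly the step used in the text.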


The coproduct structures are therefore implicit
already in Lie theory and group theory. As for any
Hopf algebra, $\Delta$ specifies how the algebra $H$ acts in
a tensor product of two representations. For groups
the tensor product is diagonal ($g$ acts on each copy),
for Lie algebras it is additive (e.g., the addition of
angular momenta). In general, the action of $a \in H$ is
defined as the action of $\Delta a$ on the tensor product.
This has far-reaching consequences. For example,
for the product $A \otimes A \to A$ of an algebra to be
covariant means that $H$ acting before and after the
product map gives the same answer, similarly for the
unit map where $k$ has the trivial representation
afforded by $\epsilon$, that is,

$$h \triangleright (ab) = \sum (h_{(1)} \triangleright a)(h_{(2)} \triangleright b), \quad h \triangleright 1 = \epsilon(h) 1$$

for all $a, b \in A$ and $h \in H$. What that means in the
case of a group is therefore $g \triangleright (ab) = (g \triangleright a)(g \triangleright b)$ or
$G$ acts by automorphisms. What it means for a Lie
algebra is $\xi \triangleright (ab) = (\xi \triangleright a)b + a(\xi \triangleright b)$, that is, $\mathfrak{g}$ acts
by derivations. This is how Hopf algebra theory
unifies group theory and Lie algebra theory and
potentially takes us beyond.

In another, dual, point of view, if $G$ is a group
defined by polynomial equations in $\mathbb{C}^n$, then
Hilbert's "nullstellensatz" in algebraic geometry says
that it corresponds algebraically to a commutative
nilpotent-free algebra with $n$ generators, called its
"coordinate algebra" $H = \mathbb{C}[G]$. The group product
then corresponds to $\Delta$ making $\mathbb{C}[G]$ into a Hopf
algebra. If one replaces $\mathbb{C}$ by any field, one has an
algebraic group over the field. For example, the
group $SL_2(\mathbb{C}) \subset \mathbb{C}^4$ has coordinate algebra generated
by four functions $a, b, c, d$ where $a$ at a matrix
$g \in SL_2(\mathbb{C})$ has value $g^1{}_1$, the 1,1 entry of the matrix,
similarly $b(g) = g^1{}_2$, etc. Then $\mathbb{C}[SL_2]$ is the commutative
algebra generated by $a, b, c, d$ with the relation
$ad - bc = 1$. A little thought about matrix multiplication
should convince the reader that

$$\Delta \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \otimes \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

where we have written the operation on each
generator as an array and where matrix multiplication
is understood (so $\Delta a = a \otimes a + b \otimes c$, etc.).
The counit and antipode are

$$\epsilon \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad S \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$$

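
The "little thought about matrix multiplication" can be made concrete by evaluating the coordinate functions on actual matrices. The following sketch (added for illustration; the sample matrices and function names are assumptions) checks that the matrix form of $\Delta$ reproduces the entries of a product $gh$ and that the antipode formula yields the inverse of a determinant-one matrix.

```python
def generators(g):
    """Evaluate the coordinate functions a, b, c, d on a 2x2 matrix g."""
    return {"a": g[0][0], "b": g[0][1], "c": g[1][0], "d": g[1][1]}

def matmul2(g, h):
    return [[sum(g[i][k] * h[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def coproduct_value(name, g, h):
    """(Delta t)(g, h) for a generator t, read off from the matrix form of Delta."""
    x, y = generators(g), generators(h)
    formulas = {
        "a": x["a"] * y["a"] + x["b"] * y["c"],
        "b": x["a"] * y["b"] + x["b"] * y["d"],
        "c": x["c"] * y["a"] + x["d"] * y["c"],
        "d": x["c"] * y["b"] + x["d"] * y["d"],
    }
    return formulas[name]

def antipode_matrix(g):
    """Matrix with entries (Sa)(g), (Sb)(g), (Sc)(g), (Sd)(g)."""
    t = generators(g)
    return [[t["d"], -t["b"]], [-t["c"], t["a"]]]

# Two integer matrices of determinant one, chosen arbitrarily.
g = [[2, 3], [1, 2]]
h = [[1, 1], [2, 3]]
```

For instance, `coproduct_value("a", g, h)` equals the 1,1 entry of `matmul2(g, h)`, and `matmul2(g, antipode_matrix(g))` is the identity matrix.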
One could also let $G$ be a finite group, in which case
the algebra $\mathbb{C}(G)$ of (say complex-valued) functions
on it is more obviously a Hopf algebra with

$$(\Delta a)(g, h) = a(gh), \quad \epsilon(a) = a(1), \quad (Sa)(g) = a(g^{-1})$$

for any function $a \in \mathbb{C}(G)$. Here we identify $\mathbb{C}(G) \otimes
\mathbb{C}(G) = \mathbb{C}(G \times G)$ or functions in two variables on
the group. These examples are dually paired with
$U(\mathfrak{g})$ in the Lie case and $\mathbb{C}G$ in the finite case,
respectively.

In such a coordinate algebra point of view, usual
constructions in group theory appear expressed
backwards with arrows reversed. So an action of
the group appears for such a Hopf algebra $H$ as a
"coaction" $\Delta_R: V \to V \otimes H$ (here a right coaction,
one can similarly have $\Delta_L$ a left coaction). It obeys

$$(\Delta_R \otimes \mathrm{id})\Delta_R = (\mathrm{id} \otimes \Delta)\Delta_R, \quad (\mathrm{id} \otimes \epsilon)\Delta_R = \mathrm{id}$$

which are the axioms of an algebra acting written
backwards for the coalgebra of $H$ "coacting." An
example is the right action of a group on itself which
in the coordinate ring point of view is $\Delta_R = \Delta$, that
is, the coproduct viewed as a right coaction. It is the
algebra of $H$ that determines the tensor product of
two coactions, so, for example, $A$ is a coaction
algebra in this sense if $\Delta_R: A \to A \otimes H$ is a coaction
and an algebra homomorphism. Similarly, in this
coordinate point of view, an integral on the group
means a map $\int: H \to k$ and right invariance translates
into invariance under the right coaction, or

$$\Big(\int \otimes \mathrm{id}\Big)\Delta = 1 \int$$


There is a theorem that such an integration, if it 
exists, is unique up to scale. In the finite-dimensional 
case it always exists, for any field $k$. At least in this
case, let $\exp = \sum_i e_i \otimes f^i$ for a basis $\{e_i\}$ of $H$ and $\{f^i\}$
a dual basis. Then an application of the integral is
Fourier transform $H \to H^*$ defined by

$$\mathcal{F}(a) = \sum_i \Big(\int e_i a\Big) f^i$$


with properties that one would expect of Fourier 
transform. The inverse is given similarly the other 
way up to a normalization factor and using the 
antipode of H. This is one among the many results 
from the abstract theory of Hopf algebras, see 
Sweedler (1969) and Larson and Radford (1988) 
among others. 

A given Hopf algebra H does not know which 
point of view one is taking on it; the axioms of a 
Hopf algebra include and unify both enveloping and 
coordinate algebras. So an immediate consequence is 
that constructions which are usual in one point of 
view give new constructions when the wrong point 
of view is taken (put another way, the self-duality of 
the axioms means that any general theorem has a 
second theorem for free, given, if we keep the 
interpretation of H fixed, by reversing all arrows in 


the original theorem and its proof). Even the
elementary examples above are quite interesting for
physics if taken "upside down" in this way. For
example, if $G$ is nonabelian, then $\mathbb{C}G$ is noncommutative,
so it cannot be functions on any actual
group. But it is a Hopf algebra, so one could think
of it as being like $\mathbb{C}(\hat{G})$, where $\hat{G}$ is not a group but
a quantum group defined as $\mathbb{C}(\hat{G}) = \mathbb{C}G$. The latter
is a well-defined Hopf algebra viewed the wrong
way. So this is an application of noncommutative
geometry to allow nonabelian Fourier transform
$\mathcal{F}: \mathbb{C}(G) \to \mathbb{C}G$. Similarly, $U(\mathfrak{g})$ is noncommutative
but one could view it upside down as a quantization
of $\mathbb{C}[\mathfrak{g}^*] = S(\mathfrak{g})$ (the symmetric algebra on $\mathfrak{g}$). To do
this let us scale the generators of $\mathfrak{g}$ so that the
relations on $U(\mathfrak{g})$ have the form $\xi\eta - \eta\xi = \lambda[\xi, \eta]$
where $\lambda$ is a deformation parameter. Then the
Poisson bracket that this algebra quantizes
(deforms) is the Kirillov-Kostant one on $\mathfrak{g}^*$ where
$\{\xi, \eta\} = [\xi, \eta]$. Here $\xi, \eta$ on the left-hand side are
regarded as functions on $\mathfrak{g}^*$, while on the right-hand
side we take their Lie bracket and then regard
the result as a function on $\mathfrak{g}^*$. Examples which
have been used successfully in physics include: 


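$$[t, x_i] = \imath\lambda x_i \quad (\text{bicrossproduct model } \mathbb{R}^{1,3}_\lambda)$$


$$[x_i, x_j] = 2\imath\lambda\, \epsilon_{ijk} x_k \quad (\text{spin space model } \mathbb{R}^3_\lambda)$$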


(summation understood over k). In both cases, we 
may develop geometry on these algebras using 
quantum group methods as if they were coordinates 
on a usual space (see Bicrossproduct Hopf Algebras 
and Noncommutative Spacetime). They are versions 
of $\mathbb{R}^n$ because the coproduct which expresses the
addition law on the noncommutative space is the
additive one according to the above. In the second
case, setting the Casimir to the value for a spin $j$ is the
quadratic relation of a "fuzzy sphere." As algebras,
the latter are just the algebras of $(2j + 1) \times (2j + 1)$
matrices. 

Going the other way, we can take a classical 
coordinate ring C[G] and regard it upside down as 
some kind of group or enveloping algebra but with 
a nonsymmetric $\Delta$. In the finite group case, an
action of $\mathbb{C}(G)$ just means a $G$-grading. Here if an
element $v$ of a vector space has $G$-valued degree $|v|$
then $a \triangleright v = a(|v|)v$ is the action of $a \in \mathbb{C}(G)$.
Alternatively, this is the same thing as a right
coaction of $\mathbb{C}G$, $\Delta_R v = v \otimes |v|$. Thus, the notion of
group representation and group grading are also 
unified. This is familiar in physics for abelian 
groups (a U(1) action is the same thing as a 
Z-grading) but works fine using Hopf algebra 
methods for nonabelian groups and beyond. 
Returning to axioms, if one wants to speak of real
forms and unitary representations, this corresponds, 
for Hopf algebras, to H a *-algebra over C with 


A'( )— TUE aA, *o$—$ lox 


where $\tau$ (throughout this article) denotes transposition
of tensor factors. This requires in particular that
$S$ is invertible (which is not assumed for a general
Hopf algebra though it does hold in the finite-dimensional
case and in all examples of interest).
Thus, $\mathbb{C}[SU_2]$ denotes the above with a certain
*-structure whereby the matrix of generators is unitary.


q-Deformation Enveloping Algebras 


For a genuinely representative example of a Hopf
algebra, consider $U_q(sl_2)$ defined with noncommutative
generators and relations, coproduct etc.,

$$q^{h/2} x_\pm q^{-h/2} = q^{\pm 1} x_\pm, \quad [x_+, x_-] = \frac{q^{h} - q^{-h}}{q - q^{-1}}$$
$$\Delta x_\pm = x_\pm \otimes q^{h/2} + q^{-h/2} \otimes x_\pm, \quad \Delta q^{h/2} = q^{h/2} \otimes q^{h/2}$$
$$\epsilon x_\pm = 0, \quad \epsilon q^{h/2} = 1$$
$$S x_\pm = -q^{\pm 1} x_\pm, \quad S q^{h/2} = q^{-h/2}$$

The actual generators here are $x_\pm, q^{\pm h/2}$ but the
notation is intended to be suggestive: if $h$ existed and
we took the limit $q \to 1$, we would have the usual
enveloping algebra of the Lie algebra $sl_2$. The
quantum group $U_q(su_2)$ is the same with the
*-structure $h^* = h$, $x_\pm^* = x_\mp$ when $q$ is real (there are
other possibilities).
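
One way to see that such a coproduct is compatible with the relations is to test it in a representation. The sketch below is an added illustration, not part of the original text: it assumes the relations in the standard form $q^{h/2} x_\pm q^{-h/2} = q^{\pm 1} x_\pm$, $[x_+, x_-] = (q^h - q^{-h})/(q - q^{-1})$, $\Delta x_\pm = x_\pm \otimes q^{h/2} + q^{-h/2} \otimes x_\pm$, uses the two-dimensional representation with $h = \mathrm{diag}(1, -1)$, and picks an arbitrary rational value for $q^{1/2}$ so that the arithmetic is exact. All names are hypothetical.

```python
from fractions import Fraction

def kron(a, b):
    m = len(b)
    size = len(a) * m
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(size)]
            for i in range(size)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(a, b, alpha=1, beta=1):
    """Return alpha*a + beta*b entrywise."""
    return [[alpha * x + beta * y for x, y in zip(r, s)] for r, s in zip(a, b)]

def commutator(a, b):
    return combine(matmul(a, b), matmul(b, a), 1, -1)

s = Fraction(3, 2)      # stands for q^{1/2}; an arbitrary sample value
q = s * s

K = [[s, 0], [0, 1 / s]]        # q^{h/2} with h = diag(1, -1)
K_INV = [[1 / s, 0], [0, s]]    # q^{-h/2}
X_PLUS = [[0, 1], [0, 0]]
X_MINUS = [[0, 0], [1, 0]]

def coproduct(x):
    """Delta x = x (x) q^{h/2} + q^{-h/2} (x) x acting on the tensor product."""
    return combine(kron(x, K), kron(K_INV, x))

DELTA_K = kron(K, K)
DELTA_K_INV = kron(K_INV, K_INV)

# [Delta x_+, Delta x_-] versus (q^h - q^{-h})/(q - q^{-1}) on the tensor product.
lhs = commutator(coproduct(X_PLUS), coproduct(X_MINUS))
q_h = matmul(DELTA_K, DELTA_K)
q_minus_h = matmul(DELTA_K_INV, DELTA_K_INV)
rhs = combine(q_h, q_minus_h, 1 / (q - 1 / q), -1 / (q - 1 / q))
```

Under these assumptions `lhs` and `rhs` coincide, which is the statement that $\Delta$ respects the commutation relation on this tensor product representation.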

Two words of warning here. Although some 
authors write $q = e^{\hbar/2}$, the parameter $q$ here has little
to do with quantization. In fact, the cases of direct
relevance to physics are $q = e^{2\pi\imath/(2+k)}$, where $k$ is the level
of the Wess-Zumino-Witten (WZW) model in which
this quantum group appears as a generalized symmetry.
This quantum group also (first) appeared in the
theory of exactly solvable lattice models, namely the
Ising model with an applied external magnetic field:
$q \neq 1$ is a measure of the resulting nonhomogeneity
of the model. Its origins go further back to the
algebraic Bethe ansatz and the emergence of the YBE
in such models (Baxter 1982). The general $U_q(\mathfrak{g})$
emerged from this context in Drinfeld (1987) and
Jimbo (1985) and the same remark applies (see Affine
Quantum Groups; Yang-Baxter Equations).

The second warning is that at least informally (if
one works with $h$ and allows formal power series
etc.), the algebra here is isomorphic to usual $U(sl_2)$,
that is, it looks deformed but the true deformation is
not here but in the coproduct, which enters into the
tensor product of representations. The latter are
labeled as usual because the algebra is not really
changed, for example, the unitary ones of $U_q(su_2)$
are labeled by spin. The spin-1/2 one even looks the
same with $x_\pm$ represented by the standard Pauli
matrices. Tensor products of representations start
to look different but their multiplicities are the same
as classically and if $V, W$ are representations then
$V \otimes W \cong W \otimes V$. Because the coproduct above is
not symmetric in its two factors, this isomorphism
$\Psi_{V,W} = \tau \circ R_{V,W}$ has $R_{V,W}$ nontrivial. From the
formulas given, the reader can compute that

$$R_{\frac12,\frac12} = q^{-1/2} \begin{pmatrix} q & 0 & 0 & 0 \\ 0 & 1 & q - q^{-1} & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & q \end{pmatrix}$$


in a tensor product basis. For this particular 
quantum group, and others like it, one finds that 
these “R-matrices” obey the braid relations as a 
version of the YBE. As a result, they can and do lead 
to knot invariants; the one above leads to the Jones 
knot invariant as a polynomial in q. Briefly, one 
represents the knot on a plane, assigns $R$ or $R^{-1}$ to
each braid crossing and takes a suitable trace (see 
The Jones Polynomial). 
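
The YBE for the spin-1/2 R-matrix can be verified directly. The sketch below (added for illustration, not from the original article) assumes the standard 4x4 form with diagonal entries $q, 1, 1, q$ and a single off-diagonal entry $q - q^{-1}$, drops the overall scalar $q^{-1/2}$ since it cancels from both sides, and checks $R_{12}R_{13}R_{23} = R_{23}R_{13}R_{12}$ on $(\mathbb{C}^2)^{\otimes 3}$ with exact rational $q$. Function names are hypothetical.

```python
from fractions import Fraction

def r_matrix(q):
    """Spin-1/2 R-matrix in the basis e1e1, e1e2, e2e1, e2e2, scalar factor dropped."""
    return [[q, 0, 0, 0],
            [0, 1, q - 1 / q, 0],
            [0, 0, 1, 0],
            [0, 0, 0, q]]

def bits(index):
    """Basis labels (i1, i2, i3) of a basis vector of the triple tensor product."""
    return [(index >> 2) & 1, (index >> 1) & 1, index & 1]

def embed(r, i, j):
    """Matrix of R_{ij}: r acts on tensor factors i and j (0-based), identity on the third."""
    k = 3 - i - j
    big = [[0] * 8 for _ in range(8)]
    for row in range(8):
        for col in range(8):
            rb, cb = bits(row), bits(col)
            if rb[k] == cb[k]:
                big[row][col] = r[2 * rb[i] + rb[j]][2 * cb[i] + cb[j]]
    return big

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def satisfies_ybe(q):
    """Check R12 R13 R23 == R23 R13 R12 for the matrix above."""
    r = r_matrix(q)
    r12, r13, r23 = embed(r, 0, 1), embed(r, 0, 2), embed(r, 1, 2)
    return matmul(matmul(r12, r13), r23) == matmul(matmul(r23, r13), r12)
```

A call such as `satisfies_ybe(Fraction(3))` tests one sample value of $q$; any nonzero rational can be substituted.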

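Since such features hold in any representation,
these matrices are in fact representations of an
invertible element $\mathcal{R} \in H \otimes H$ provided one allows $h$
as a generator and formal power series:

$$\mathcal{R} = q^{(h \otimes h)/2}\, e_{q^{-2}}^{(q - q^{-1})\, e \otimes f}, \quad e = x_+ q^{h/2}, \quad f = q^{-h/2} x_-$$

where

$$e_q^x = \sum_{m=0}^{\infty} \frac{x^m}{[m]_q!}, \quad [m]_q = \frac{1 - q^m}{1 - q}$$

are the q-exponential and q-integer, respectively.
Their proper explanation is in the section "Braided
groups and quantum planes." This $\mathcal{R}$ is called the
"universal R-matrix" or quasitriangular structure
and obeys

$$\tau \circ \Delta = \mathcal{R}(\Delta\ )\mathcal{R}^{-1}$$
$$(\Delta \otimes \mathrm{id})\mathcal{R} = \mathcal{R}_{13}\mathcal{R}_{23}, \quad (\mathrm{id} \otimes \Delta)\mathcal{R} = \mathcal{R}_{13}\mathcal{R}_{12}$$

and from the axioms of a Hopf algebra, one may
deduce that the YBE

$$\mathcal{R}_{12}\mathcal{R}_{13}\mathcal{R}_{23} = \mathcal{R}_{23}\mathcal{R}_{13}\mathcal{R}_{12}$$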


holds in the algebra. This induces the YBE for
matrices $R_{V,W}$ in the representation $V \otimes W$. Such a
Hopf algebra is called "quasitriangular" and its
representations form a braided category (see Braided
and Modular Tensor Categories). Even if $\mathcal{R}$ for a
quasitriangular Hopf algebra is defined by a power
series, the $R_{V,W}$ in finite-dimensional representations
are typically actual matrices.

Of considerable interest is the special case when $q$
is a primitive $n$th root of unity. In this case the
quasitriangular Hopf algebra $u_q(sl_2)$ has the above
generators but the additional relations

$$e^n = f^n = 0, \quad g^n = 1$$

which render the algebra generated by $e, f, g$ as $n^3$-dimensional.
The algebra no longer has a matrix
block decomposition (is not semisimple) and not all
representations descend to it. For example, if $n$ is
odd, then only representations of dimension $\le n$
descend. Other than this, one has many of the
features of a classical enveloping algebra now for 
this finite-dimensional object. There is evidence that 
such objects over C are intimately related to 
classical Lie algebras but over a finite field. 

Finally, there is a similar theory of $U_q(\mathfrak{g})$ for all Lie
algebras determined by symmetrizable Cartan matrices
$\{a_{ij}\}$, including affine ones. Here $i, j \in I$, an indexing set,
and $a_{ij} = 2\, i \cdot j / i \cdot i \in \{0, -1, -2, \ldots\}$ for $i \neq j$, where $\cdot$
is a symmetric bilinear form on the root lattice $\mathbb{Z}[I]$
generated by $I$ with $i \cdot i$ a positive even integer. To be
precise, one should also fix a "root datum" in the form
of an inclusion $\mathbb{Z}[I] \subseteq X$ of the root lattice into a choice
of character lattice $X$ and an inclusion $\mathbb{Z}[I] \subseteq Y$ of the
coroot lattice (also labeled by $I$) into the cocharacter
lattice $Y$ (the dual of $X$). Here the evaluation pairing is
required to restrict to $\langle i^\vee, j \rangle = a_{ij}$ if $i, j \in I$ and $i^\vee$ is $i$
viewed in the cocharacter lattice $Y$. We let $q_i = q^{i \cdot i/2}$
and require $q_i^2 \neq 1$ for all $i$ (or one may consider $q$ as an
indeterminate). We have generators $e_i, f^i$ for $i \in I$ and
invertible $g_a$ for $a$ each generator of $Y$, and the relations

$$g_a e_i = q^{\langle a, i \rangle} e_i g_a, \quad f^i g_a = q^{\langle a, i \rangle} g_a f^i$$
$$[e_i, f^j] = \delta_i{}^j\, \frac{g_i^{\,i \cdot i/2} - g_i^{-i \cdot i/2}}{q_i - q_i^{-1}}$$
$$\sum_{r=0}^{1 - a_{ij}} (-1)^r \begin{bmatrix} 1 - a_{ij} \\ r \end{bmatrix}_{q_i} (e_i)^r e_j (e_i)^{1 - a_{ij} - r} = 0$$

for all $i \neq j$ and an identical set for the $\{f^i\}$. The
coalgebra and antipode are

$$\Delta e_i = e_i \otimes g_i^{\,i \cdot i/2} + 1 \otimes e_i$$
$$\Delta f^i = f^i \otimes 1 + g_i^{-i \cdot i/2} \otimes f^i$$
$$\Delta g_a = g_a \otimes g_a, \quad \epsilon(g_a) = 1, \quad \epsilon(e_i) = \epsilon(f^i) = 0$$
$$S g_a = g_a^{-1}, \quad S e_i = -e_i\, g_i^{-i \cdot i/2}, \quad S f^i = -g_i^{\,i \cdot i/2} f^i$$

The q-Serre relations are those above involving the
q-binomial coefficients, defined now using the
symmetric q-integers $[m]_q = (q^m - q^{-m})/(q - q^{-1})$.
They have their true explanation as

$$\mathrm{Ad}_{(e_i)^{1 - a_{ij}}}(e_j) = 0$$


where Ad is a braided group adjoint action in the
sense of the section "Braided groups and quantum
planes." Notice that while the root generators are
modeled on the Lie algebra, the Cartan generators
are modeled on the torus of an algebraic group,
which contains global information. Thus, the more
precise form of $U_q(sl_2)$ is the $e, f, g$ form with the
generator $g = q^h$ as above, with $\mathbb{Z}[I] \subset X$ and
$\mathbb{Z}[I] = Y$. Meanwhile $U_q(psl_2)$ has the square root
of this as generator (what we called $q^{h/2}$ before)
with $\mathbb{Z}[I] = X$ and $\mathbb{Z}[I] \subset Y$ where the strict inclusion
has $i \mapsto 2$ in the lattice $\mathbb{Z}$. Note that, in the
complex case, $SL_2$ has compact real form $SU_2$ while
its quotient, $PSL_2$, has compact real form $SO_3$, so
these are distinguished at the Hopf algebra level. In
general, the root datum has an associated reductive
algebraic group which is simply connected when
$Y = \mathbb{Z}[I]$ and generated by its adjoint representation
when $X = \mathbb{Z}[I]$. The complexified character lattice is
a sublattice of the more familiar Lie algebra weight
lattice and labels representations that extend to the
(algebraic) group. Langlands duality interchanges
the roles of $X, Y$. These subtleties are lost when we
work over formal power series with $q = e^{\hbar/2}$ and
Lie-algebra-like Cartan generators.

These objects are mathematically so interesting 
that some authors define *quantum groups" as 
nothing more than this particular extension of the 
theory of Lie algebras, Cartan matrices and root 
systems. Among the deepest theorems is the exis- 
tence of the Lusztig-Kashiwara canonical basis 
which is obtained from $q = 0$ but valid also at
$q = 1$ (i.e., for classical enveloping algebras) and
which has the remarkable property of inducing bases 
coherently across highest-weight representations. 
From a physicist's point of view, however, there 
are many other Hopf algebras rather more closely 
connected with actual quantization. Most often, the 
terms quantum group and Hopf algebra are used 
interchangeably. 

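There is similarly a reduced version $u_q(\mathfrak{g})$. The
simplest of all possible cases, even simpler than
$u_q(sl_2)$, is for what one could call $u_q(1)$ with a single
generator $g$ and

$$g^n = 1, \quad \Delta g = g \otimes g, \quad \epsilon g = 1, \quad S g = g^{-1}$$
$$\mathcal{R}_{\mathbb{Z}_n} = \frac{1}{n} \sum_{a,b=0}^{n-1} q^{-ab}\, g^a \otimes g^b$$
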
where $q$ is a primitive $n$th root of unity. The Hopf
algebra is the same as the group algebra
$\mathbb{C}\mathbb{Z}_n \cong \mathbb{C}(\mathbb{Z}_n)$ but the $\mathcal{R}$ is nontrivial. A representation
means a $\mathbb{Z}_n$-graded space, that is, graded into
degrees $0, 1, \ldots, n - 1$. The braiding matrices have
the diagonal form $R_{V,W} = q^{ab}$ on components of
degree $a, b$, respectively. The braided category
generated in this case is the one where anyons live.
From this point of view, $u_q(\mathfrak{g})$ generates the category
where nonabelian anyons live. Here $\mathcal{R}_{\mathbb{Z}_n}$ (in place of
$q^{(h \otimes h)/2}$) along with an additional $e_{q^{-2}}$ factor as
above gives the quasitriangular structure of $u_q(sl_2)$.
The physical model here is the rational conformal 
field theory mentioned above with these anyons as 
particular bound states. There is a proposal to use 
them in the construction of quantum computers. 
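
The diagonal braiding can be recovered numerically from the sum defining the R-matrix of the cyclic group case. The sketch below is an added illustration under two assumptions that are not spelled out in the text: the R-matrix is taken as $\frac{1}{n}\sum_{a,b} q^{-ab} g^a \otimes g^b$ with $q = e^{2\pi\imath/n}$, and the generator $g$ is taken to act as multiplication by $q^d$ on a component of degree $d$. The function name is hypothetical.

```python
import cmath

def braiding_eigenvalue(n, deg_v, deg_w):
    """Value of (1/n) sum_{a,b} q^{-ab} g^a (x) g^b on components of degrees deg_v, deg_w.

    Assumes g acts as q^d on a component of degree d, with q = exp(2*pi*i/n).
    """
    q = cmath.exp(2j * cmath.pi / n)
    total = 0
    for a in range(n):
        for b in range(n):
            total += q ** (-a * b) * q ** (a * deg_v) * q ** (b * deg_w)
    return total / n
```

Under these assumptions `braiding_eigenvalue(n, a, b)` agrees with $q^{ab}$ up to rounding error, because the inner sum over one index is a discrete delta function.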


q-Deformation Coordinate Algebras 


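From the coordinate algebra point of view, the
corresponding deformation to the one in the last
section is the Hopf algebra $\mathbb{C}_q[SL_2]$ with noncommuting
generators and relations

$$ca = qac, \quad ba = qab, \quad db = qbd, \quad dc = qcd$$
$$bc = cb, \quad da - ad = (q - q^{-1})bc, \quad ad - q^{-1}bc = 1$$

The coalgebra has the same matrix form on the
generators as for $\mathbb{C}[SL_2]$ and the antipode and
*-structure (for $\mathbb{C}_q[SU_2]$) are

$$S \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} d & -qb \\ -q^{-1}c & a \end{pmatrix}, \quad \begin{pmatrix} a^* & b^* \\ c^* & d^* \end{pmatrix} = \begin{pmatrix} d & -q^{-1}c \\ -qb & a \end{pmatrix}$$

Its duality pairing with $U_q(sl_2)$ is afforded by the
2 x 2 Pauli-matrix representation of the latter. The
$\mathbb{C}_q[SU_2]$ Hopf *-algebra may be completed to a
C*-algebra.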

One similarly has $\mathbb{C}_q[G]$ for all semisimple Lie
groups $G$ and their various real forms. From an
axiomatic point of view, such quantum groups are
"coquasitriangular" in the sense that there is a map
$\mathcal{R}: H \otimes H \to k$ such that

$$\sum \mathcal{R}(a_{(1)} \otimes b_{(1)})\, a_{(2)} b_{(2)} = \sum b_{(1)} a_{(1)}\, \mathcal{R}(a_{(2)} \otimes b_{(2)})$$

for all $a, b \in H$ and

$$\mathcal{R}(ab \otimes c) = \sum \mathcal{R}(a \otimes c_{(1)})\, \mathcal{R}(b \otimes c_{(2)})$$
$$\mathcal{R}(a \otimes bc) = \sum \mathcal{R}(a_{(1)} \otimes c)\, \mathcal{R}(a_{(2)} \otimes b)$$

for all $a, b, c \in H$. We also require that $\mathcal{R}$ is
invertible in a certain sense. These are just the
arrow reversal of the axioms of a quasitriangular
structure. In general, for the deformation of a linear
algebraic group we will have some $n^2$ generators $t^i{}_j$,
now taken to be noncommutative, and with a
matrix form of coalgebra

$$\Delta t^i{}_j = t^i{}_k \otimes t^k{}_j, \quad \epsilon t^i{}_j = \delta^i{}_j$$

For the compact real form we will have $S t^i{}_j = t^j{}_i{}^*$.
Moreover, from the first of the above axioms we
will have among the relations

$$R^i{}_k{}^j{}_l\, t^k{}_m t^l{}_n = t^j{}_l t^i{}_k\, R^k{}_m{}^l{}_n$$

where $R^i{}_j{}^k{}_l = \mathcal{R}(t^i{}_j \otimes t^k{}_l)$ is a matrix $R \in M_n \otimes M_n$
obeying the YBE. If we take only these quadratic
relations, we have the "Faddeev-Reshetikhin-Takhtajan
(FRT) bialgebra" $A(R)$ and it can be shown (see
Majid 1995) that $R$ extends to a coquasitriangular
structure $\mathcal{R}$ on it. However, in our case we also have

$$(R^{-1})^i{}_j{}^k{}_l = \mathcal{R}(S t^i{}_j \otimes t^k{}_l)$$
$$\tilde{R}^i{}_j{}^k{}_l = \mathcal{R}(t^i{}_j \otimes S t^k{}_l)$$

where $\tilde{R} = ((R^{t_2})^{-1})^{t_2}$ ($t_2$ transposition in the second
factor of $M_n \otimes M_n$) is called the "second inverse" of $R$. With
these additional matrices, one may define a q-determinant
and antipode relations as well (Majid 1995). One
may also generate a rigid braided monoidal category
and reconstruct a Hopf algebra $A(R)$ from it. In this
way, the R-matrix plays a role similar to that of the
structure constants of a Lie algebra and can in
principle define the quantum group coordinate algebra.
Such R-matrices have been classified in low
dimension and include multiparameter and other
deformations of classical group coordinate algebra as
well as other nonstandard quantum groups.

In the $\mathbb{C}_q[G]$ examples it is not the coalgebra which
is essentially deformed but the algebra. We already see
this above on the generators but the coproduct of a
product of generators may look different. Nonetheless,
one can identify the vector space that the products
generate with that of $\mathbb{C}[G]$ and at least informally with
respect to a deformation parameter express the
product as a power series in the undeformed product
(a $\star$-product deformation). For generic values, one still
has a Peter-Weyl decomposition $\mathbb{C}_q[G] = \oplus_V (V \otimes V^*)$,
where the sum is over irreducible corepresentations,
which can be identified with the classical representations
of the algebraic group. One can make the same
decomposition for $\mathbb{C}[G]$ and identify the matrix blocks
$V \otimes V^*$ in order to find this $\star$-product. Also, since this
is a flat deformation, it follows that the commutator at
lowest order defines a Poisson bracket on $G$, given by

$$\{t^i{}_j, t^k{}_l\} = t^i{}_m t^k{}_n\, r^m{}_j{}^n{}_l - r^i{}_m{}^k{}_n\, t^m{}_j t^n{}_l$$
and this Poisson bracket is compatible with the group
product $G \times G \to G$ as a Poisson map (because the
Hopf algebra coproduct was an algebra map). Here $r$
is the first order part in the expansion of the
R-matrix. A Lie group equipped with a Poisson
bracket compatible in this way is called a "Poisson
Lie group." On general functions its Poisson bivector
is generated by the first order part $r \in \mathfrak{g} \otimes \mathfrak{g}$ in the
expansion of $\mathcal{R}$ in the q-deformed enveloping
algebra. In place of the YBE obeyed by $\mathcal{R}$, we have
the "classical Yang-Baxter equations (CYBE),"

$$[r_{12}, r_{23}] + [r_{12}, r_{13}] + [r_{13}, r_{23}] = 0$$

In this way, one may characterize an "infinitesimal
version" of $U_q(\mathfrak{g})$ as $(\mathfrak{g}, r, \delta)$ where $\delta: \mathfrak{g} \to \mathfrak{g} \otimes \mathfrak{g}$ is
the leading part of $\tau \circ \Delta - \Delta$ and makes the triple into
a quasitriangular "Lie bialgebra" (see Classical
r-Matrices, Lie Bialgebras, and Poisson Lie Groups).
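
The CYBE can be tested on a concrete matrix. The sketch below (an added illustration, not from the original text) uses the two-dimensional representation of $sl_2$ and the candidate $r = h \otimes h + 4\, e \otimes f$ with $h = \mathrm{diag}(1, -1)$, $e = E_{12}$, $f = E_{21}$, which is proportional to the standard $\frac14 h \otimes h + e \otimes f$; the overall normalization is an assumption and does not affect the quadratic equation. A second matrix with the Cartan part removed serves as a negative control.

```python
def bits(index):
    """Basis labels (i1, i2, i3) of a basis vector of the triple tensor product."""
    return [(index >> 2) & 1, (index >> 1) & 1, index & 1]

def embed(r, i, j):
    """Matrix of r_{ij}: r acts on tensor factors i and j (0-based), identity on the third."""
    k = 3 - i - j
    big = [[0] * 8 for _ in range(8)]
    for row in range(8):
        for col in range(8):
            rb, cb = bits(row), bits(col)
            if rb[k] == cb[k]:
                big[row][col] = r[2 * rb[i] + rb[j]][2 * cb[i] + cb[j]]
    return big

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r, s)] for r, s in zip(ab, ba)]

def cybe_lhs(r):
    """[r12, r13] + [r12, r23] + [r13, r23] as an 8x8 matrix."""
    r12, r13, r23 = embed(r, 0, 1), embed(r, 0, 2), embed(r, 1, 2)
    terms = [commutator(r12, r13), commutator(r12, r23), commutator(r13, r23)]
    return [[sum(t[i][j] for t in terms) for j in range(8)] for i in range(8)]

ZERO = [[0] * 8 for _ in range(8)]

# h (x) h + 4 e (x) f in the basis e1e1, e1e2, e2e1, e2e2.
R_CLASSICAL = [[1, 0, 0, 0],
               [0, -1, 4, 0],
               [0, 0, -1, 0],
               [0, 0, 0, 1]]

# Negative control: 4 e (x) f alone, without the Cartan part.
R_WITHOUT_CARTAN = [[0, 0, 0, 0],
                    [0, 0, 4, 0],
                    [0, 0, 0, 0],
                    [0, 0, 0, 0]]
```

With these inputs `cybe_lhs(R_CLASSICAL)` vanishes while `cybe_lhs(R_WITHOUT_CARTAN)` does not, illustrating that the Cartan term is needed.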

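Finally, returning to our example, when $q$ is an
$n$th root of unity, one has the q-Frobenius Hopf
algebra homomorphism

$$\mathbb{C}[SL_2] \hookrightarrow \mathbb{C}_q[SL_2], \quad \begin{pmatrix} a & b \\ c & d \end{pmatrix} \mapsto \begin{pmatrix} a^n & b^n \\ c^n & d^n \end{pmatrix}$$

that is, a classical copy sitting inside the quantum
group. Quotienting by this means adding the
relations

$$a^n = 1 = d^n, \quad b^n = 0 = c^n$$

which gives the finite-dimensional reduced quantum
group $\mathbb{C}_q^{\mathrm{red}}[SL_2]$. Similarly for other $\mathbb{C}_q^{\mathrm{red}}[G]$. These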
reduced quantum groups provide finite noncommu- 
tative geometries having the geometric flavor of the 
classical geometry but where geometry and physics 
(such as electromagnetic gauge theory modes) are 
fully computable. 


Self-Dual Quantum Groups 


The arrow-reversibility of the axioms of a quantum 
group make it possible to search for self-dual 
quantum groups or for quantum groups which, if 
not self-dual, have a self-dual form. This leads to the 
bicrossproduct quantum groups coming from mod- 
els of quantum gravity (Majid 1988) (see Bicross- 
product Hopf Algebras and Noncommutative 
Spacetime). 

Figure 1 Role of Hopf algebras along the self-dual axis.
(Diagram labels: quantum theory, group duals, monoidal
categories, Hopf algebras, abelian groups, Riemannian
geometry, nonabelian groups.)

The context here is that of Figure 1 which shows
how Hopf algebras relate to other objects and to
duality in a representation-theoretic sense. Along the
central axis, we have put self-dual categories or in
physical terms categories admitting Fourier transform.
This is clear for abelian groups where the
dual $\hat{G}$ of an abelian group $G$ is also an abelian
group. Below the axis, we have nonabelian groups 
which we view as toy models of geometries with 
curvature. Every compact Lie group, for example, 
has an associated Killing metric. Above the axis, a 
nonabelian group dual $\hat{G}$ means to construct unitary
representations etc., which we view as toy models of 
quantum theory. We have seen that Hopf algebras 
are another self-dual category and provide a frame- 
work in which both groups and group duals can be 
unified (see the section "Hopf algebras and first
examples"). Thus, $G$ can be viewed as a coordinate
Hopf algebra $\mathbb{C}(G)$ or $\mathbb{C}[G]$ in the finite or Lie cases,
and $\hat{G}$ as the dual Hopf algebra $\mathbb{C}G$ or $U(\mathfrak{g})$ as a
definition of the coordinate algebra "$\mathbb{C}(\hat{G})$." Note that
$\hat{G}$ is not merely the set of representations, as these
alone are not enough to reconstruct the group (e.g.,
both S! and SO; have the same set). We see that Hopf 
algebras are a microcosm for the unification of 
quantum theory and gravity. Hopf algebra duality 
interchanges the role of position and momentum on 
the one hand and of quantum and gravitational effects 
on the other. A self-dual Hopf algebra has both aspects 
unified and interchanged by the self-duality. 

One can also ask what the next most general self- 
dual category of objects is in which to look for more 
general unifications. One answer here is the category 
whose objects are themselves categories C equipped 
with a tensor product (a “monoidal category”) and a 
monoidal functor to a fixed monoidal category V. 
Motivated by the above, a theorem from the 1980s 
is that for any such C there is a dual C° of 
“representations in V” (Majid 1991a). The dotted 
arrows in Figure 1 indicate that this may be a setting 
for more ambitious models than those achieved by 
Hopf algebras alone. In fact, the C° construction was
one of the ingredients going into the invention of 
2-categories a few years later. See also several 
articles on TQFT (such as Topological Quantum 
Field Theory: Overview; Axiomatic Approach to 
Topological Quantum Field Theory; Duality in 
Topological Quantum Field Theory). 
The simplest self-dual quantum group is $\mathbb{C}[x]$ as
the Hopf algebra of polynomial functions on a line
with additive coproduct. This is dually paired with
itself in the form of the enveloping algebra
$U(gl_1) = \mathbb{C}[p]$ with pairing

$$\langle p^n, x^m \rangle = (-\imath)^n \delta_{n,m}\, n!$$

and similarly for higher-dimensional flat space. In
the case of $\mathbb{C}[x]$, a basis is $x^n$ and from the above the
dual basis is $(\imath p)^n / n!$. Hence the canonical element is
$\exp = e^{\imath x \otimes p}$ so that Hopf algebra Fourier transform
on a suitable completion of these algebras reduces to
usual Fourier transform.

A more nontrivial example (Majid 1988) is given
by the "Planck-scale Hopf algebra" $\mathbb{C}[x] {\blacktriangleright\!\!\triangleleft} \mathbb{C}[p]$
which has algebra and coalgebra

$$[p, x] = \imath\hbar(1 - e^{-x/G}), \quad \Delta x = x \otimes 1 + 1 \otimes x$$
$$\Delta p = p \otimes e^{-x/G} + 1 \otimes p, \quad \epsilon x = \epsilon p = 0$$
$$S x = -x, \quad S p = -p\, e^{x/G}$$

The actual generator here should be $e^{x/G}$ rather than
$x$ for an algebraic treatment (otherwise one should
allow power series or use C*-algebras). The dually
paired Hopf algebra has the same form $\mathbb{C}[p] {\blacktriangleright\!\!\triangleleft} \mathbb{C}[x]$,
with new parameters $\hbar' = 1/\hbar$ and $G' = G/\hbar$ and
quantum group Fourier transform connects the
two. More details and the general construction of
Hopf algebras $\mathbb{C}[M] {\blacktriangleright\!\!\triangleleft} U(\mathfrak{g})$ with dual $U(\mathfrak{m}) {\triangleright\!\!\blacktriangleleft} \mathbb{C}[G]$
are in the article on "bicrossproduct" Hopf algebras
(see Bicrossproduct Hopf Algebras and Noncommutative
Spacetime). These quantize particles in $M$
moving under momentum Lie group $G$ with Lie
algebra $\mathfrak{g}$ and vice versa. The states of one (in a
C*-algebra context) lie in the algebra of observables of
the other ("observable-state duality"). The data
required are a matched pair of actions of $(G, M)$
on each other. Such equations correspond locally to
a factorization of a larger group $G \bowtie M$ but
typically have singularities and other features in
keeping with a toy model of Einstein's equations.
There are, by the time of writing, many applications
of bicrossproducts beyond the original one,
including a Poincaré quantum group for the $\mathbb{R}^{1,3}_\lambda$
mentioned in the section "Hopf algebras and
first examples," with links to Planck-scale physics.
There is also a bicrossproduct quantum group
$\mathbb{C}[G^*] {\blacktriangleright\!\!\triangleleft} U(\mathfrak{g})$ canonically associated to any simple
Lie algebra $\mathfrak{g}$ and related to T-duality. The classical
data here are Lie bialgebras and solutions of the
CYBE as in the section "q-Deformation coordinate
algebras," however there is no known relation with
the q-deformation Hopf algebras themselves. Finite
group bicrossproducts are also interesting and
examples (but not with both actions nontrivial)
were already in the works of G I Kac in the 1960s.

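
As a short consistency check (added here, not part of the original text), one can verify the antipode axiom $\cdot(S \otimes \mathrm{id})\Delta = 1\epsilon$ on the generators of the Planck-scale Hopf algebra, assuming the structure maps $\Delta p = p \otimes e^{-x/G} + 1 \otimes p$, $\Delta x = x \otimes 1 + 1 \otimes x$, $Sp = -p\,e^{x/G}$, $Sx = -x$ and $\epsilon x = \epsilon p = 0$:

```latex
\begin{align*}
\cdot(S \otimes \mathrm{id})\Delta p
  &= (Sp)\, e^{-x/G} + (S1)\, p \\
  &= -p\, e^{x/G} e^{-x/G} + p \\
  &= 0 = \epsilon(p), \\[4pt]
\cdot(S \otimes \mathrm{id})\Delta x
  &= (Sx)\, 1 + (S1)\, x = -x + x = 0 = \epsilon(x).
\end{align*}
```

The first computation shows why the factor $e^{x/G}$ must appear in $Sp$: it cancels the $e^{-x/G}$ that the coproduct places in the second tensor factor.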


These constructions also work when the groups
above are themselves Hopf algebras. For example,
any finite-dimensional Hopf algebra $H$ has a
"quantum double" $D(H) = H \bowtie H^{*\mathrm{op}}$, where the
double cross product $\bowtie$ is by mutual coadjoint
actions. The cross-relations between the two
sub-Hopf algebras are
X anh 


S (bapa 
for $h \in H$ and $a \in H^*$. The construction is due to
Drinfeld (1987) while the $\bowtie$ form is due to the
author. Moreover, $D(H)$ is quasitriangular with
$\mathcal{R} = \exp$, the canonical element used in the Fourier
transform on $H$. Its representations consist of vector
spaces where $H$ acts and at the same time $H^{*\mathrm{op}}$ acts
or (which makes sense when $H$ is infinite dimensional)
where $H$ coacts, in a compatible way. Such
objects are called "crossed modules" because when
$H = \mathbb{C}G$, one has exactly a linearization of the
crossed G-sets of J H C Whitehead. They are a special
case of the C° construction mentioned above.
Finally, one can also view the q-deformed linear
spaces on which quantum groups such as $U_q(\mathfrak{g})$ act
as self-dual Hopf algebras under an additive
coproduct. However, this needs to be as braided
groups or Hopf algebras with braid statistics, see the
next section. The simplest example here is the
"braided line" $B = \mathbb{C}[x]$ developed not as above but
as a self-dual Hopf algebra with q-statistics. Its
"bosonization" gives a self-dual Hopf algebra
$U_q(b_+) \subset U_q(sl_2)$, and similarly for other $U_q(b_+) \subset
U_q(\mathfrak{g})$. Perhaps more surprisingly, the quantum
groups $U_q(\mathfrak{g})$ and $\mathbb{C}_q[G]$ also both have canonical
braided group versions (a process called "transmutation")
and as such they too are isomorphic. This
isomorphism extends the linear isomorphism $\mathfrak{g} \cong \mathfrak{g}^*$
afforded by the Killing form of any semisimple Lie
algebra. In physical terms, what this means is that
there is in q-deformed geometry just one self-dual
object $B_q(G)$ with two different scaling limits


$$U(\mathfrak{g}) \leftarrow B_q(G) \to \mathbb{C}[G]$$


1)) 42) a(2) = 2);4(2)) 


as $q \to 1$, and the structure of which underlies the
deeper structure of $U_q(\mathfrak{g})$ and $\mathbb{C}_q[G]$ as well.


Braided Groups and Quantum Planes 


A super quantum group or super-Hopf algebra is
not a quantum group or Hopf algebra since the key
homomorphism property of $\Delta: H \to H \otimes H$ is modified:
one must use in the target $H \otimes H$ the $\mathbb{Z}_2$-graded
or super tensor product of super algebras. Here,


$$(a \otimes b)(c \otimes d) = (-1)^{|b||c|}\, ac \otimes bd$$
for elements of degree $|b|$, $|c|$. Super quantum groups
$U_q(gl_{m|n})$ etc., have been constructed and have an
analogous theory to the bosonic versions above.
Super spaces in physics are associated to differential
forms and in the same way a bicovariant exterior
algebra on a quantum group $H$ is generally a super
quantum group. Here the exterior algebra is
generated by 1-forms and the coproduct on 1-forms
is


$$\Delta = \Delta_L + \Delta_R$$


Here $\Delta_{L,R}$ are the coactions of $H$ on 1-forms
induced by the left and right coaction of $H$ on itself.

For a true understanding of quantum groups one
must, however, go beyond such objects to “braided
groups” or Hopf algebras with braid statistics (see
Majid 1995). This theory was introduced by the
author in the early 1990s as a more systematic
method for q-deformation of structures in physics
based on q-group covariance. We have seen that a
quasitriangular quantum group, or any Hopf algebra
through its double, generates a braided category
with the flip map $\tau$ replaced by a braiding $\Psi_{V,W}$
between any two representations. Anything which is
covariant under the quantum group lives, by
definition, in the braided category.
Working with such “braided algebras” is similar to
working with superalgebras except that one should
use $\Psi$ in place of the graded transposition in any
algebraic construction. In particular, two braided
algebras have a natural “braided tensor product”
also in the category. In concrete terms,

$$(a \otimes b)(c \otimes d) = a\,\Psi(b \otimes c)\,d$$

Then a Hopf algebra in the braided category or
braided group is B, an algebra in the category along
with a coalgebra and antipode, where $\Delta: B \to B \otimes B$
is an algebra homomorphism into the braided tensor
product (see Braided and Modular Tensor Categories).

Next, we have mentioned in the section
“q-Deformation enveloping algebras” that q-algebras
generate topological invariants, but we now
turn this on its head and use braid diagrams to do
q-algebra. We write all operations as flowing down
the page. Any transposition in the algebraic
construction is expressed as a braid crossing $\Psi$, or
its inverse by the reversed braid crossing, and any
other operations as nodes. Thus, a product is
drawn as two wires merging into one and a coproduct
as one wire splitting into two. Algebraic information
“flows” along these “wires” much like the way
that information flows along the wiring in a
computer, except that under- and over-crossings
represent distinct nontrivial operators. (In fact, one
may formulate topological quantum computers
exactly in this way.) In this notation, tensor
products are denoted by juxtaposition and the trivial
object in the category is omitted. In particular, one
has the axioms and all general theorems of Hopf
algebras at this diagrammatic level. For example, the
adjoint action of any braided group B on itself is
given by such a diagram (see Majid 1995).

In any concrete example, such diagrams turn into
R-matrix formulas where $\Psi = \tau R$ as explained in the
section “q-Deformation enveloping algebras.”

A basic example of a braided group is the braided
q-plane $\mathbb{C}_q^2$ with generators x, y and relations
$yx = qxy$. Its coproduct is the additive one
$\Delta x = x \otimes 1 + 1 \otimes x$ (and similarly for y) reflecting
addition in the plane, but this is extended to products as a
braided group with braiding $q^{1/2}R_{1/2,1/2}$ in terms of
the R-matrix in the section “q-Deformation enveloping
algebras.” The extra factor here means that
$\mathbb{C}_q^2$ lives in the braided category of representations of
$U_q(\mathfrak{gl}_2)$, an extension of $U_q(\mathfrak{sl}_2)$ (i.e., with an
additional central $U_q(1)$ generator to provide the $q^{1/2}$).
More precisely, the category is that of corepresentations of
$\mathbb{C}_q[GL_2]$, of which $\mathbb{C}_q[SL_2]$ is a quotient. The coaction
in this case is

$$\Delta_R\,(x \;\; y) = (x \;\; y) \otimes \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

where the additional central generator is encoded in
the q-determinant (which is no longer set equal to
1). Notice that $q^{1/2}R_{1/2,1/2}$ has eigenvalues $q$, $-q^{-1}$
(one says that it is q-Hecke). Another braided group,
associated now to the second eigenvalue, is $\mathbb{C}_q^{0|2}$
with generators $\xi, \eta$ and relations $\eta\xi = -q^{-1}\xi\eta$,
$\xi^2 = \eta^2 = 0$. It is the quadratic algebra dual of $\mathbb{C}_q^2$
(Manin 1988).

One has natural braided linear spaces for the whole
family $\mathbb{C}_q[G]$, on which the latter coact after central
extension. The general construction is as follows. If V
is an object in a braided category (e.g., the fundamental
representation of a quantum group), let T(V)
be the tensor algebra generated by a basis $\{e_i\}$ of V
with no relations and the additive braided coproduct
as above. Assume that V has a dual $V^*$ in the
category, and similarly form $T(V^*)$ with dual basis
generators $\{f^i\}$. These two braided groups will be
dually paired by extending the evaluation map to
products, which takes the form of “braided integers”
(see Majid 1995)

$$\langle f^{i_m} \cdots f^{i_1},\, e_{j_1} \cdots e_{j_n} \rangle = \delta_{m,n}\, ([n,\Psi]!)^{i_1 \cdots i_n}_{j_1 \cdots j_n}$$

$$[n,\Psi] = \mathrm{id} + \Psi_{12} + \Psi_{12}\Psi_{23} + \cdots + \Psi_{12}\cdots\Psi_{n-1,n}$$
We now quotient by the kernels of this pairing to
obtain $B(V)$, $B(V^*)$ as two nondegenerately paired
braided groups. This quotient generates all the relations,
which are very often but not necessarily
quadratic (in practice, one typically imposes only the
quadratic relations to have braided groups with a
possibly degenerate pairing). The construction is due to
the author. Moreover, we can define partial derivatives
on these braided groups by
$\Delta a = 1 \otimes a + e_i \otimes \partial^i a + \cdots$
for any a in the algebra, that is, as an infinitesimal
generator of translations under the braided group law;
similarly exp, indefinite and Gaussian integration,
Fourier transform, etc. The simplest example here is
$B = \mathbb{C}[x]$ viewed not as a usual Hopf algebra but as a
braided group in the category of $\mathbb{Z}$-graded spaces with
$\Psi(x \otimes x) = q\, x \otimes x$. In this example the braided
addition law on $\mathbb{C}[x]$ is

$$\Delta x^n = \sum_{m=0}^{n} \begin{bmatrix} n \\ m \end{bmatrix}_q x^m \otimes x^{n-m}$$

with q-binomial coefficients defined by the q-integers
$[m]_q$, and the partial derivative defined
by it is the Jackson (1908) q-derivative

$$\partial_q f(x) = \frac{f(x) - f(qx)}{x(1 - q)}$$

while $\Delta e_q(x) = e_q(x) \otimes e_q(x)$ if we allow power
series. Such objects occur in the theory of q-special
functions (see q-Special Functions).
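
The braided line formulas above can be checked with exact rational arithmetic. The following is a minimal sketch, assuming the usual conventions $[m]_q = 1 + q + \cdots + q^{m-1}$ and $[n]_q! = [1]_q \cdots [n]_q$ (the function names are illustrative and not taken from the text); it evaluates q-binomial coefficients and applies the Jackson q-derivative to a polynomial given by its coefficient list.

```python
from fractions import Fraction


def q_int(m, q):
    """q-integer [m]_q = 1 + q + ... + q^(m-1)."""
    return sum(q ** k for k in range(m))


def q_factorial(m, q):
    """[m]_q! = [1]_q [2]_q ... [m]_q (empty product = 1)."""
    result = Fraction(1)
    for k in range(1, m + 1):
        result *= q_int(k, q)
    return result


def q_binomial(n, m, q):
    """q-binomial coefficient [n]_q! / ([m]_q! [n-m]_q!)."""
    return q_factorial(n, q) / (q_factorial(m, q) * q_factorial(n - m, q))


def jackson_derivative(coeffs, q):
    """Jackson q-derivative of the polynomial sum_n coeffs[n] x^n.

    Applies (f(x) - f(qx)) / (x (1 - q)) term by term, so x^n -> [n]_q x^(n-1).
    Returns the coefficient list of the result.
    """
    return [c * (1 - q ** n) / (1 - q) for n, c in enumerate(coeffs)][1:]


q = Fraction(2, 3)            # any q != 1 that is not a root of unity
print(q_binomial(4, 2, q))    # coefficient of x^2 (x) x^2 in Delta x^4
print(jackson_derivative([0, 0, 0, 1], q))  # q-derivative of x^3
```

Working with `Fraction` keeps the identities exact; at a root of unity some $[m]_q$ vanish and the quotient in `q_binomial` would need separate treatment.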

Among deeper theorems (see Majid 1995, 2002),
there is a triangular decomposition

$$U_q(\mathfrak{g}) = U_q(\mathfrak{n}_-) {>\!\!\!\triangleleft}\; T\; {\triangleright\!\!\!<}\, U_q(\mathfrak{n}_+)$$

where $U_q(\mathfrak{n}_+)$ is a braided group and $U_q(\mathfrak{n}_-)$ is dually
paired to its opposite. T denotes the torus generators
$\{g_a\}$ in the section “q-Deformation enveloping algebras.”
More generally, if $\mathfrak{g}_0 \subset \mathfrak{g}$ is a principal
embedding of Lie algebras (given by an inclusion of
Dynkin diagrams), then
$U_q(\mathfrak{g}) = B^{*} {>\!\!\!\triangleleft}\, U_q(\mathfrak{g}_0)\, {\triangleright\!\!\!<}\, B^{\mathrm{op}}$
for some additive braided group B of additional root
generators and its dual. The general construction
$B^{*} {>\!\!\!\triangleleft}\, H\, {\triangleright\!\!\!<}\, B^{\mathrm{op}}$ here is “double bosonization” which
associates to dual braided groups $B, B^*$ in the
category of representations of some quasitriangular
Hopf algebra H, a new quasitriangular Hopf algebra.
The simplest example $B = \mathbb{C}[x]$ lives in the category of
representations of $T = U_q(1)$ in an algebraic form.
The dual is another braided line $\mathbb{C}[p]$ and
$\mathbb{C}[p] {>\!\!\!\triangleleft}\, U_q(1)\, {\triangleright\!\!\!<}\, \mathbb{C}[x]$ is a version of $U_q(\mathfrak{sl}_2)$. In this
way, the braided line $\mathbb{C}[x]$ is at the root of all
q-deformation quantum groups.

An earlier theorem is that for any braided group B
covariant under a (co)quasitriangular H, we have its
“bosonization” $B {>\!\!\!\triangleleft\!\cdot}\, H$. There is a similar “biproduct”
if B lives in the category of crossed modules for any
Hopf algebra H. These have been extensively applied
in physics, notably in the construction of inhomogeneous
quantum groups. Similar to $\mathbb{C}_q^2$ (but as a
*-algebra), there is a natural self-dual q-Minkowski
space $\mathbb{R}_q^{1,3}$ which is covariant under $U_q(\mathfrak{so}_{1,3})$,
and its bosonization is the q-Poincaré plus dilations
group $\mathbb{R}_q^{1,3} {>\!\!\!\triangleleft\!\cdot}\, U_q(\mathfrak{so}_{1,3})$. It is not possible to avoid the
dilation here. The double-bosonization extends this
to the q-conformal group $U_q(\mathfrak{so}_{2,4})$. The braided
adjoint action becomes the action of conformal
translations on $\mathbb{R}_q^{1,3}$. The construction of q-propagators
and q-deformed physics on such q-Minkowski
space was achieved in the mid 1990s as one of the
main successes of the theory of braided groups.

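This $\mathbb{R}_q^{1,3}$ can be given also as a matrix of
generators, relations, *-structure and a second
braided coproduct:

$$\beta\alpha = q^2\alpha\beta,\qquad \gamma\alpha = q^{-2}\alpha\gamma,\qquad \delta\alpha = \alpha\delta$$

$$\beta\gamma = \gamma\beta + (1 - q^{-2})\alpha(\delta - \alpha)$$

$$\delta\beta = \beta\delta + (1 - q^{-2})\alpha\beta$$

$$\gamma\delta = \delta\gamma + (1 - q^{-2})\gamma\alpha$$

$$\begin{pmatrix}\alpha & \beta\\ \gamma & \delta\end{pmatrix}^{*} = \begin{pmatrix}\alpha & \gamma\\ \beta & \delta\end{pmatrix},\qquad \Delta\begin{pmatrix}\alpha & \beta\\ \gamma & \delta\end{pmatrix} = \begin{pmatrix}\alpha & \beta\\ \gamma & \delta\end{pmatrix} \otimes \begin{pmatrix}\alpha & \beta\\ \gamma & \delta\end{pmatrix}$$

This is in addition to the additive coproduct above.
It corresponds to the point of view of Minkowski
space as Hermitian 2 × 2 matrices. Note that $\Delta$ is
not a *-algebra map in the usual sense and indeed
Hermitian matrices are not a group under multiplication,
but this does form a natural braided
*-bialgebra. If we quotient by the braided determinant
relation $\alpha\delta - q^2\gamma\beta = 1$, we have the unit hyperboloid
in $\mathbb{R}_q^{1,3}$ which turns out to be the braided group
$B_q[SU_2]$ mentioned at the end of the previous
section (as obtained canonically from $\mathbb{C}_q[SU_2]$). We
now have a braided antipode

$$S\begin{pmatrix}\alpha & \beta\\ \gamma & \delta\end{pmatrix} = \begin{pmatrix} q^2\delta + (1 - q^2)\alpha & -q^2\beta\\ -q^2\gamma & \alpha\end{pmatrix}$$

This was the first nontrivial example of a braided
group (Majid 1991b) and we see that it has two
$q \to 1$ limits

$$U(\mathfrak{su}_2) \longleftarrow B_q[SU_2] \longrightarrow \mathbb{C}[\text{Hyperboloid} \subset \mathbb{R}^{1,3}]$$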


Because most constructions in physics can be
uniformly deformed by such methods (including
the totally q-antisymmetric tensor), one finds that q
provides a new regulator in which infinities in
quantum field theory can in principle be encoded
as poles at q = 1. That transmutation from the
quantum group to its braided version unifies unitary
nonabelian symmetries with pseudo-Riemannian
geometry is another deeper aspect of relevance to
physics. In addition, q-constructions have their
original role in quantum integrable systems, at q a
root of unity and for infinite-dimensional (affine)
Lie algebra deformations.


Quasi-Hopf Algebras 


Although the braided category of representations of
a quantum group has a trivial “associator”
$\Phi_{V,W,Z}: (V \otimes W) \otimes Z \to V \otimes (W \otimes Z)$ between any
three objects, a general braided category and the
diagrammatic methods of “braided algebra” in the
last section do not require this (one simply translates
diagrams into algebra by inserting $\Phi$ as needed). A
more general object that generates such categories as
its representations is a “quasi-Hopf algebra.” This is
a generalization of Hopf algebras in which the
coproduct $\Delta: H \to H \otimes H$ is not necessarily coassociative.
Instead,

$$(\mathrm{id} \otimes \Delta)\Delta = \phi\,((\Delta \otimes \mathrm{id})\Delta)\,\phi^{-1}$$

$$(\mathrm{id} \otimes \epsilon \otimes \mathrm{id})\phi = 1 \otimes 1$$

$$\phi_{234}\,(\mathrm{id} \otimes \Delta \otimes \mathrm{id})(\phi)\,\phi_{123} = (\mathrm{id}^2 \otimes \Delta)(\phi)\,(\Delta \otimes \mathrm{id}^2)(\phi)$$

for some invertible element $\phi \in H \otimes H \otimes H$. The
numbers denote the position in the tensor product
and one says that $\phi$ is a 3-cocycle. The axioms for
the antipode and quasitriangular structure R are
also modified. The tensor product of representations
is given as usual by $\Delta$, and the braiding and
associator by the actions of R and $\phi$.

This notion, due to Drinfeld (1990), arises when
one wishes to write down the quantum groups $U_q(\mathfrak{g})$
more explicitly as built on the algebras $U(\mathfrak{g})$ (recall
that they are isomorphic over formal power series).
Thus, for each semisimple $\mathfrak{g}$ there is a natural
(quasitriangular) quasi-Hopf algebra $(U(\mathfrak{g}), \phi, R)$
where $U(\mathfrak{g})$ has the usual Hopf algebra structure,
R is an exponential of the split Casimir (or inverse
Killing form) in $\mathfrak{g} \otimes \mathfrak{g}$ and $\phi$ is constructed as a
solution of the Knizhnik-Zamolodchikov equations
coming out of conformal field theory. This is not
$U_q(\mathfrak{g})$ but it has an equivalent braided category of
representations. Thus, there is an element $F \in U(\mathfrak{g})^{\otimes 2}$
(extended over formal power series) such that

$$\Delta_F = F(\Delta\ )F^{-1},\qquad R_F = F_{21}RF^{-1}$$

$$\phi_F = F_{12}(\Delta \otimes \mathrm{id})(F)\,\phi\,(\mathrm{id} \otimes \Delta)(F^{-1})F_{23}^{-1} = 1$$

recovers $U_q(\mathfrak{g})$ as a quasitriangular Hopf algebra
built directly on the algebra $U(\mathfrak{g})$. The conjugation
operations here (and a similar process regarding the
antipode) are a “Drinfeld twist” of a quasi-Hopf
algebra, and such twisting by any invertible F such
that

$$(\epsilon \otimes \mathrm{id})F = (\mathrm{id} \otimes \epsilon)F = 1$$

(a cochain) does not change the representation
category up to equivalence. In the present case, the
twist transforms $\phi$ into $\phi_F = 1$, that is, into an
ordinary Hopf algebra isomorphic over formal
power series to $U_q(\mathfrak{g})$. Note that in rational
conformal field theory the tensor product of
representations appears as a finite-dimensional
commutative associative algebra (the Verlinde algebra)
with integer structure constants $N_{ij}^k$ (this comes
from the operator-product expansion of primary
fields in the theory). This is because one has more
precisely a truncated representation category corresponding
to q a root of unity, and because we are
identifying equivalent representations (so $N_{ij}^k$ are
the multiplicity in the decomposition of a tensor
product of two representations). However, if one
wants to know the tensor product decomposition
more fully, not just its isomorphism class, this is
given in a choice of bases by recoupling matrices.
Computation in terms of these shows that the actual
tensor product is neither commutative nor associative,
but of the form above at least in the case of the
WZW model.

Hopf algebra theory typically extends to the
quasi-Hopf case. For example, given a quasi-Hopf
algebra H there is a quantum double D(H) at least
in the finite-dimensional case, due to the author. An
example is to take $H = \mathbb{C}(G)$ and $\phi$ a 3-cocycle on G
in the usual sense

$$\phi(y, z, w)\,\phi(x, yz, w)\,\phi(x, y, z) = \phi(x, y, zw)\,\phi(xy, z, w)$$

on elements of G and $\phi(x, 1, y) = 1$. Then $(\mathbb{C}(G), \phi)$
can be viewed as a quasi-Hopf algebra. Its double
$D^\phi(G)$ is generated by $\mathbb{C}(G)$ as a sub-quasi-Hopf
algebra and by elements of G with

ayy ye ) x Y, X x)(s), Ós X = XO esi 


P(x,x lax, x! bx)d(a, b, x) 
one e ó(a,x,x-!bx) 


ab=s 


x Xóa Q xdp 


in terms of a basis $\{\delta_s\}$ of $\mathbb{C}(G)$, the product of G on
the right, and

$$\chi(x, y)(s) = \frac{\phi(x, y, y^{-1}x^{-1}sxy)\,\phi(s, x, y)}{\phi(x, x^{-1}sx, y)}$$

a 2-cocycle on G with values in $\mathbb{C}(G)$ (the algebra is
a cocycle semidirect product). There is a quasitriangular
structure $R = \sum_x \delta_x \otimes x$. This quasi-Hopf
algebra first appeared in discrete topological quantum
field theory related to orbifolds in the work of
Dijkgraaf, Pasquier, and Roche.
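
As an illustration of the 3-cocycle condition in the usual sense, one can test a candidate $\phi$ on a small group by brute force. The sketch below assumes the cyclic group $\mathbb{Z}_n$ written additively and the cochain $\phi(x, y, z) = \exp(2\pi i\, x \lfloor (y+z)/n \rfloor / n)$; this particular $\phi$ is an example chosen here for illustration, not one taken from the text, and the function names are likewise illustrative.

```python
import cmath
from itertools import product


def make_phi(n):
    """Return phi(x, y, z) = exp(2*pi*i * x * floor((y+z)/n) / n) on Z_n = {0, ..., n-1}."""
    def phi(x, y, z):
        carry = (y + z) // n          # 1 if y + z wraps around modulo n, else 0
        return cmath.exp(2j * cmath.pi * x * carry / n)
    return phi


def is_3cocycle(phi, n, tol=1e-9):
    """Check the 3-cocycle identity and the normalisation phi(x, 0, y) = 1 on Z_n.

    The group is written additively, so a product such as yz becomes (y + z) % n
    and the identity element 1 becomes 0.
    """
    for x, y, z, w in product(range(n), repeat=4):
        lhs = phi(y, z, w) * phi(x, (y + z) % n, w) * phi(x, y, z)
        rhs = phi(x, y, (z + w) % n) * phi((x + y) % n, z, w)
        if abs(lhs - rhs) > tol:
            return False
    return all(abs(phi(x, 0, y) - 1) < tol for x, y in product(range(n), repeat=2))


print(is_3cocycle(make_phi(3), 3))
```

The check is exhaustive over all quadruples, which is only practical for small n, but it makes the role of each argument in the identity explicit.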

There are further generalizations in the same spirit
which are linked to conformal field theories of
more general type; for example, weak (quasi-) Hopf
algebras in which $\Delta 1 \ne 1 \otimes 1$ but is a projector.
These have been related to quantum groupoids.

Finally, we mention some applications of twisting
outside of the original context. First of all, we are not
limited to starting with $U(\mathfrak{g})$: starting with any Hopf
algebra or quasi-Hopf algebra H we can similarly twist
it to another one $H_F$ with the same algebra as H and
$\Delta_F, R_F, \phi_F$ given by conjugation as above. The
representation category remains unchanged up to
equivalence, so in some sense the twisted object is
equivalent. Moreover, if we start with a Hopf algebra
H and ask F to be a 2-cocycle in the sense

$$F_{12}(\Delta \otimes \mathrm{id})(F)\,(\mathrm{id} \otimes \Delta)(F^{-1})F_{23}^{-1} = 1$$

then $H_F$ will remain a Hopf algebra. It has
conjugated antipode (see Majid 1995)

$$S_F(a) = U(Sa)U^{-1},\qquad U = \cdot(\mathrm{id} \otimes S)(F)$$

Many Hopf algebras are twists of more standard
ones, for example, the multiparameter quantum
groups tend to be twists of the standard $U_q(\mathfrak{g})$.
Likewise, “triangular” Hopf algebras (where
$R_{21}R = 1$) tend to be twists of classical group or
enveloping algebras.

A second application of twists is an approach to
quantization. Although it can be applied to H itself, this
is more interesting if we think of H as a background
quantum group and ask to quantize objects covariant
under H. For the sake of discussion, we start with H
an ordinary Hopf algebra. We twist this to $H_F$ and
denote by $\mathcal{T}$ the equivalence functor from representations
of H to representations of $H_F$. This functor acts
as the identity on all objects and all morphisms, but
comes with nontrivial isomorphisms
$c_{V,W}: \mathcal{T}(V) \otimes \mathcal{T}(W) \to \mathcal{T}(V \otimes W)$ for any two objects,
compatible with bracketing (see Majid 1995). Given any algebraic
construction covariant under H, we simply apply the
functor $\mathcal{T}$ to all aspects of the construction and obtain
an equivalent $H_F$-covariant construction. As an example,
if A is an H-covariant algebra, then applying $\mathcal{T}$ to
its product we have $\mathcal{T}(\cdot): \mathcal{T}(A \otimes A) \to \mathcal{T}(A)$. Using
$c_{A,A}$ we obtain a map

$$\bullet: \mathcal{T}(A) \otimes \mathcal{T}(A) \to \mathcal{T}(A),\qquad a \bullet b = \cdot(F^{-1} \triangleright (a \otimes b))$$

in terms of the product in A. Thus, we have a new
algebra $A_F$ built on the same vector space as A but
with a modified $\bullet$ product. This is called a
“covariant twist” of an algebra and should not be
confused with the Drinfeld twist above. It is due to
the author in the early 1990s. If F is a 2-cocycle,
then $A_F$ remains associative. The transmutation
construction mentioned in the section “Self-dual
quantum groups” or the passage from $\mathbb{R}_q^4$ to $\mathbb{R}_q^{1,3}$ are
examples in quantum group theory. Other examples
include the standard Moyal product on $\mathbb{R}^n$, also
called noncommutative spacetime $[x_\mu, x_\nu] = i\theta_{\mu\nu}$ by
string theorists (see Bicrossproduct Hopf Algebras
and Noncommutative Spacetime).

If we do not demand that F is a cocycle, then the
algebra $A_F$ is still associative but in the target
category, which means

$$(a \bullet b) \bullet c = \bullet(\mathrm{id} \otimes \bullet)\,\Phi_{A,A,A}((a \otimes b) \otimes c)$$

Such objects are called “quasialgebras.” It may still
be that $\Phi_{A,A,A}$ happens to be trivial (or happens to
act trivially) so that $A_F$ remains associative. This
turns out frequently to be the case and many
quantizations in physics, including $\mathbb{C}_q[G]$ but not
limited to q-examples, can be obtained in this way.
It means that although they are associative there is a
hidden nonassociativity which can surface in other
constructions involving $\Phi$. The physical application
here is with $H = U(\mathfrak{g})$ a classical enveloping algebra,
A functions on a classical manifold on which $\mathfrak{g}$ acts,
and a cochain F. In general the resulting quasialgebra
will not be associative but rather a quantization
of a “quasi-Poisson manifold” obeying

$$\{a, \{b, c\}\} + \text{cyclic} = 2\tilde{n}(a \otimes b \otimes c)$$

Here $\tilde{n}$ is the trivector field for the action of the
lowest order part of $\phi_F$ and the (quasi)Poisson
bivector is the leading-order part of $F_{21}F^{-1}$. As
mentioned, there are many cases where $\tilde{n}$ (and the
action of the rest of $\phi_F$) happens to be trivial.

Finally, let us give a discrete example using such
quantum group methods. We consider $H = \mathbb{C}(G)$
and $F \in \mathbb{C}(G \times G)$ a cochain. Twisting by this gives
$H_F = (\mathbb{C}(G), \phi_F)$ a quasi-Hopf algebra where

$$\phi_F(x, y, z) = \frac{F(y, z)\,F(x, yz)}{F(xy, z)\,F(x, y)}$$

We take $A = \mathbb{C}G$ the group algebra. The action of
$\mathbb{C}(G)$ on it is the diagonal one. The modified algebra
$A_F$ therefore has product

$$x \bullet y = F^{-1}(x, y)\,xy$$

in terms of the product in G, and will be a
quasialgebra if F is not a cocycle. For example, let
$G = (\mathbb{Z}_2)^3$ which we write additively (so elements
are 3-vectors with values in $\mathbb{Z}_2$) and take

$$F(x, y) = (-1)^{\sum_{i \le j} x_i y_j + y_1 x_2 x_3 + x_1 y_2 x_3 + x_1 x_2 y_3}$$

$$\phi_F(x, y, z) = (-1)^{x \cdot (y \times z)}$$

Moreover, $A_F = \mathbb{O}$, the octonions (Albuquerque and
Majid 1999). So these are a nonassociative quantization
of the classical discrete space $(\mathbb{Z}_2)^3$. We see
that they are in fact associative up to sign, with
sign −1 exactly when the corresponding 3-vectors are
linearly independent.
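
The discrete example can be explored directly on the eight basis elements. The following is a minimal sketch, assuming the sum in the exponent of F runs over $i \le j$ and representing elements of the twisted group algebra as dictionaries from group elements to coefficients (this data layout and the function names are choices made here, not part of the text). It compares $\phi_F$, computed from F, with the sign $(-1)^{x \cdot (y \times z)}$.

```python
from itertools import product

G = list(product((0, 1), repeat=3))   # elements of (Z_2)^3 as 3-vectors


def add(x, y):
    """Group law of (Z_2)^3, written additively."""
    return tuple((a + b) % 2 for a, b in zip(x, y))


def F(x, y):
    """The cochain F(x, y) = (-1)^(sum_{i<=j} x_i y_j + cubic terms)."""
    e = sum(x[i] * y[j] for i in range(3) for j in range(i, 3))
    e += y[0] * x[1] * x[2] + x[0] * y[1] * x[2] + x[0] * x[1] * y[2]
    return -1 if e % 2 else 1


def phi_F(x, y, z):
    """phi_F = F(y,z) F(x,y+z) / (F(x+y,z) F(x,y)); F is +-1, so dividing is multiplying."""
    return F(y, z) * F(x, add(y, z)) * F(add(x, y), z) * F(x, y)


def triple_product_sign(x, y, z):
    """(-1)^(x . (y x z)): equals -1 exactly when x, y, z are independent over Z_2."""
    det = (x[0] * (y[1] * z[2] - y[2] * z[1])
           - x[1] * (y[0] * z[2] - y[2] * z[0])
           + x[2] * (y[0] * z[1] - y[1] * z[0]))
    return -1 if det % 2 else 1


def mul(a, b):
    """Twisted product e_x . e_y = F(x, y) e_{x+y}, extended bilinearly.

    Elements are dicts {group element: coefficient}; since F is +-1, F^{-1} = F.
    """
    out = {}
    for (x, cx), (y, cy) in product(a.items(), b.items()):
        s = add(x, y)
        out[s] = out.get(s, 0) + F(x, y) * cx * cy
    return {k: v for k, v in out.items() if v != 0}


nonassociative = sum(phi_F(x, y, z) == -1 for x, y, z in product(G, repeat=3))
print(nonassociative)   # number of basis triples that associate only up to a sign -1
```

Because all values of F are signs, the quotient defining $\phi_F$ reduces to a product, which keeps the check in integer arithmetic.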


Noncommutative Geometry 


In this article, we have frequently encountered the 
view of quantum groups and other noncommutative 
algebras as by definition the coordinate algebras on 
“noncommutative spaces." However, the “quantum 
groups approach" to such noncommutative geome- 
try that emerges has a somewhat different flavor 
from other approaches, as we discuss now. 

In fact, the problem of geometry at such a level 
was mentioned already by Dirac in the 1920s and 
led to theorems of Gelfand and Naimark in the 
1940s and 1950s whereby a noncommutative 
C*-algebra should be viewed as a noncommutative 
topological space, and of Serre and Swan in the 
1960s whereby a finitely generated projective 
module should be viewed as a vector bundle. 
Algebraic K-theory led to further refinement of this 
picture and particularly, in the 1980s, to A Connes’ 
formulation in terms of cyclic cohomology and 
“spectral triples” (see Noncommutative Geometry 
and the Standard Model; Noncommutative Tori, 
Yang-Mills and String Theory; Quantum Hall 
Effect; Hopf Algebra Structure of Renormalizable 
Quantum Field Theory; Path Integrals in Noncom- 
mutative Geometry). The quantum groups approach 
is less axiomatic, and consists of at least three 
disparate elements. 

The first layer of the quantum groups approach is
the theory of q-deformed groups and q-spaces on
which they act, using braided category methods
(such as braided linear spaces). The braided group
additive law leads to partial derivatives and these
define q-exterior algebras etc. This programme
covered during the 1990s most of what is needed
to q-deform physics in flat space at an algebraic
level. Formulas here tend to be complex but
controlled by R-matrices, and the correct R-matrix
formulas can be found systematically by working
with braided algebra as explained in the section
“Braided groups and quantum planes.” From a
slightly different side, q-representation theory and
the further theory of q-homogeneous spaces is
intimately tied to a theory of q-special functions
(such as the q-exponential function in the section
“q-Deformation enveloping algebras”) of interest in
their own right (see q-Special Functions). The use of
*-algebras in some cases completable to C*-algebras
is a point of contact with other approaches to
noncommutative geometry but problems emerge
when one considers the braiding. As a result, the
natural q-Poincaré (plus dilation) quantum group is
not even a Hopf *-algebra. Briefly, once one starts
to braid the constructions, one may need to
represent them with braided (not usual) Hilbert
spaces and q-analysis.

The second layer of the quantum groups approach
is based on “differential calculus” as a specification
of an exterior algebra of differential forms or
differential graded algebra (DGA). In general this is
a wild problem but, as in classical geometry, the
requirement of a quantum group covariance greatly
narrows the possible calculi, although no longer to
the point of uniqueness. The first examples of
covariant calculi on the quantum group $\mathbb{C}_q[SU_2]$
were found by Woronowicz (1989). The bicovariant
one of these was cast in R-matrix form by Jurco
while the first actual classification results on the
moduli of irreducible calculi were obtained by the
author (the bicovariant ones are essentially in
correspondence with irreducible representations V,
with left-invariant differentials forming a braided
group of the form $B(V \otimes V^*)$). Probably the most
interesting feature of this theory is that for all $\mathbb{C}_q[G]$
the bicovariant q-calculus cannot be of classical
dimensions. For example, for $\mathbb{C}_q[SU_2]$ the smallest
nontrivial calculus is four dimensional. The “extra
dimension” is a biinvariant 1-form $\theta$ which has the
property that $[\theta, a] = da$ for all $a \in \mathbb{C}_q[SU_2]$ and
which can be viewed as a spontaneously generated
time (see Bicrossproduct Hopf Algebras and Noncommutative
Spacetime). Quantum group methods
also provide DGAs on finite groups, this time
classified in the bicovariant case by nontrivial
conjugacy classes. These therefore provide Lie
structures on finite groups. One can go much further
and define quantum principal bundles (with quantum
groups as fiber) over general noncommutative
algebras (Brzezinski and Majid 1993), associated
bundles, frame bundles, and Riemannian geometry
of the algebra (see Quantum Group Differentials,
Bundles and Gauge Theory).

Again q-deformation provides key examples but
the theory may then be applied to other situations.
For example, the permutation group $S_3$ has a natural
connected calculus with dimensions 1 : 3 : 4 : 3 : 1
(in other words the space has six points but each
point has the local structure of a 4-manifold in some
sense). It turns out to have a unique Levi-Civita type
connection $\nabla$ for its invariant metric, with constant
curvature. The use of DGAs here is in common with
other approaches (e.g., Connes 1994) and indeed
bundles associated to quantum group principal
bundles and suitable connections can be shown to
be projective modules. The approaches diverge at
the level of spectral triples, however, and the
examples of “Dirac operators” that emerge from
quantum group methods do not usually obey the
required axioms.

A third established layer of the quantum groups
approach is to trade some of the noncommutativity
for nonassociativity, as in the dual version of
Drinfeld's construction, that is, $\mathbb{C}_q[G]$ in terms of
classical $\mathbb{C}[G]$ as a (co)quasi-Hopf algebra. The
general approach here is a quantization functor $\mathcal{T}$
which provides all constructions but which will
typically bring out the underlying nonassociative
geometry even when the noncommutative covariant
algebra of interest is associative. For example,
applying the functor to the classical exterior algebra
$\Omega(G)$ gives a bicovariant $\Omega(\mathbb{C}_q[G])$ of classical
dimensions but with nonassociative products (it is
a supercoquasi-Hopf algebra). As before, one may
then apply these quantum group methods to other
algebras not related to q-deformation.

Beyond these are many recent developments, some 
of which are covered in other articles. Probably one 
of the most interesting frontiers, at the time of 
writing, is the exploration of links of both quantum 
groups and noncommutative geometry to number 
theory. 


See also: Affine Quantum Groups; Axiomatic Approach 
to Topological Quantum Field Theory; Bicrossproduct 
Hopf Algebras and Noncommutative Spacetime; Braided 
and Modular Tensor Categories; Classical r-Matrices, Lie 
Bialgebras, and Poisson Lie Groups; Duality in 
Topological Quantum Field Theory; Eight Vertex and 
Hard Hexagon Models; Hopf Algebra Structure of 
Renormalizable Quantum Field Theory; The Jones 
Polynomial; Noncommutative Geometry and the 
Standard Model; Noncommutative Tori, Yang-Mills and 
String Theory; Path Integrals in Noncommutative
Geometry; q-Special Functions; Quantum Group
Differentials, Bundles and Gauge Theory; Quantum Hall
Effect; Symmetries in Quantum Field Theory of Lower
Spacetime Dimensions; Topological Quantum Field
Theory: Overview; von Neumann Algebras: Subfactor
Theory; Yang-Baxter Equations.
From Classical Mechanics
to Quantum Mechanics


The initial goal of semiclassical mechanics was to 
explore the correspondence principle, due to N Bohr 
in 1923, which states that one should recover the 
classical mechanics from the quantum mechanics as 
the Planck constant h tends to zero. So we start with 
a very brief presentation of these two theories. 


Classical Mechanics 


We start (with the Hamiltonian formalism) from a $C^\infty$
function p on $\mathbb{R}^{2n}$: $(x, \xi) \mapsto p(x, \xi)$, which describes the
motion of the system under consideration and is called
the Hamiltonian. The variable x corresponds, in the
simplest case, to the position and $\xi$ to the momentum of
one particle. The evolution is then described, starting
at time 0 from a given point $(y, \eta)$, by the so-called
Hamiltonian equations

$$\frac{dx_j}{dt} = \frac{\partial p}{\partial \xi_j}(x(t), \xi(t)),\quad \text{for } j = 1, \ldots, n$$

$$\frac{d\xi_j}{dt} = -\frac{\partial p}{\partial x_j}(x(t), \xi(t)),\quad \text{for } j = 1, \ldots, n \qquad [1]$$

The classical trajectories are then defined as the
integral curves of a vector field defined on $\mathbb{R}^{2n}$ called
the Hamiltonian vector field associated with p
and defined by $H_p = (\partial p/\partial \xi, -\partial p/\partial x)$. All these
definitions are more generally relevant in the
framework of symplectic geometry on a symplectic
manifold M (but we choose, for simplicity, to explain
the theory on $\mathbb{R}^{2n}$), which can be seen as the cotangent
vector bundle $T^*\mathbb{R}^n$, and is the “local” model of the
general situation. This space is equipped naturally
with a symplectic structure defined by giving at each
point a nondegenerate 2-form, which is here
$\sigma := \sum_j d\xi_j \wedge dx_j$. This 2-form permits us to associate
canonically to a 1-form on $T^*\mathbb{R}^n$ a vector field on
$T^*\mathbb{R}^n$. In this correspondence, if p is a function on
$T^*\mathbb{R}^n$, $H_p$ is associated with the differential dp.

In this article, we consider the example of the
Hamiltonian $p(x, \xi) = \xi^2 + V(x)$, also called the
Schrödinger Hamiltonian, as the guiding example.
More specifically, the case of the harmonic oscillator,
where V is given by $V(x) = \sum_{j=1}^n \mu_j x_j^2$ (with
$\mu_j > 0$), is the most significant, as it approximates
a potential near its minimum,
when nondegenerate.


In the framework of the classical mechanics, the 
main questions could be: 


Are the trajectories bounded? 

Are there periodic trajectories? 

Is one trajectory dense in its energy surface? 
Is the energy surface compact? 


The solution of these questions could be very difficult.
Let us just mention the trivial fact that, if $p^{-1}(\lambda)$ is
compact for some $\lambda$, then, by the conservation of
energy law

$$p(x(t), \xi(t)) = p(y, \eta) \qquad [2]$$

the whole trajectory starting from a point $(y, \eta)$ of $p^{-1}(\lambda)$
remains in the bounded set $p^{-1}(p(y, \eta))$ in $\mathbb{R}^{2n}$. This
is in particular the case for the harmonic oscillator.
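
The conservation law [2] can be observed numerically for the harmonic oscillator in one dimension. The following is a minimal sketch, assuming $p(x, \xi) = \xi^2 + \mu x^2$ and a leapfrog (Störmer-Verlet) discretization of the Hamiltonian equations; the integrator, step size and function names are choices made here for illustration and are not part of the text.

```python
import math


def hamilton_flow(y, eta, mu, dt, steps):
    """Integrate dx/dt = dp/dxi = 2*xi and dxi/dt = -dp/dx = -2*mu*x
    for p(x, xi) = xi**2 + mu*x**2, starting from (y, eta), by the leapfrog scheme."""
    x, xi = y, eta
    for _ in range(steps):
        xi -= 0.5 * dt * 2.0 * mu * x    # half step in the momentum
        x += dt * 2.0 * xi               # full step in the position
        xi -= 0.5 * dt * 2.0 * mu * x    # half step in the momentum
    return x, xi


def energy(x, xi, mu):
    """The Hamiltonian p(x, xi) = xi^2 + mu x^2."""
    return xi ** 2 + mu * x ** 2


mu, y, eta = 1.0, 1.0, 0.0
x, xi = hamilton_flow(y, eta, mu, dt=1e-3, steps=10_000)
print(energy(y, eta, mu), energy(x, xi, mu))   # initial and final energy
```

A symplectic scheme is used because it keeps the computed energy close to its initial value over long times, so the trajectory stays near the ellipse $p^{-1}(p(y, \eta))$.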


Quantum Mechanics 


The quantum theory was born dynamics-wise around 
1920. It is structurally related to the classical 
mechanics in a way that we shall describe very briefly. 
In quantum mechanics, our basic object will be a
(possibly unbounded) self-adjoint operator defined
on a dense subspace of a Hilbert space $\mathcal{H}$. In order to
simplify the presentation, we shall always take
$\mathcal{H} = L^2(\mathbb{R}^n)$.

This operator can be associated with p by using
the techniques of quantization. We choose here to
present a procedure, called the Weyl quantization
procedure (which was already known in 1928),
which under suitable assumptions on p and its
derivatives, will be defined for $u \in \mathcal{S}(\mathbb{R}^n)$ by

$$p^w(x, hD_x, h)u(x) = (2\pi h)^{-n} \iint \exp\Big(\frac{i}{h}(x - y)\cdot\xi\Big)\, p\Big(\frac{x + y}{2}, \xi, h\Big)\, u(y)\, dy\, d\xi \qquad [3]$$

The operator $p^w(x, hD_x, h)$ is called an h-pseudodifferential
operator of Weyl symbol p. One can also write
$\mathrm{Op}_h^w(p)$ in order to emphasize that it is the operator
associated to p by the Weyl quantization. Here h is a
parameter which plays the role of the Planck constant.

Of course, one has to give a sense to these integrals
and this is the object of the theory of the oscillatory
integrals. If p = 1, we observe that, by Plancherel's
formula,

$$u(x) = (2\pi h)^{-n} \iint \exp\Big(\frac{i}{h}(x - y)\cdot\xi\Big)\, u(y)\, dy\, d\xi \qquad [4]$$

the associated operator is nothing but the identity
operator. A way to rewrite any h-differential operator
$\sum_{|\alpha| \le m} a_\alpha(x)(hD_x)^\alpha$ as an h-pseudodifferential operator
is to apply it on both sides to [4]. In particular, we
observe that if the symbol is $p(x, \xi) = \xi^2 + V(x)$, then
the operator associated with p by the h-Weyl
quantization is the Schrödinger operator $-h^2\Delta + V$.
Other interesting examples appear naturally in solid
state physics. Let us, for example, mention the Harper
operator H (or almost-Mathieu; see Helffer and
Sjöstrand (1989) and references therein), whose
symbol is the map $(x, \xi) \mapsto \cos\xi + \cos x$, and which
can also be defined, for $u \in L^2(\mathbb{R})$, by

$$(Hu)(x) = \tfrac{1}{2}\big(u(x + h) + u(x - h)\big) + \cos x\; u(x)$$
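
The link between the symbol $\cos\xi + \cos x$ and the difference operator can be seen on a plane wave. The following is a minimal sketch, assuming the convention $D_x = -i\,d/dx$, so that $hD_x$ multiplies $e^{ikx}$ by $hk$; the function names are illustrative. It checks that the shift part of H acts on $e^{ikx}$ as multiplication by $\cos(hk)$.

```python
import cmath
import math


def harper_apply(u, x, h):
    """(Hu)(x) = (u(x+h) + u(x-h))/2 + cos(x) u(x), following the formula above."""
    return 0.5 * (u(x + h) + u(x - h)) + math.cos(x) * u(x)


def plane_wave(k):
    """u(x) = exp(i k x), on which h D_x = -i h d/dx acts as multiplication by h k."""
    return lambda x: cmath.exp(1j * k * x)


h, k, x = 0.1, 3.0, 0.7
u = plane_wave(k)
lhs = harper_apply(u, x, h)
rhs = (math.cos(h * k) + math.cos(x)) * u(x)   # symbol evaluated at xi = h k
print(abs(lhs - rhs))   # difference between the two sides at the chosen point
```

The plane wave is not in $L^2(\mathbb{R})$; it is used here only as a test function on which the two expressions can be compared pointwise.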


We shall later recall how to relate the properties
of p and those of the associated operator. More
precisely, we shall describe under which conditions
on p the operator $p^w(x, hD_x; h)$ is semibounded,
symmetric, essentially self-adjoint, compact, with
compact resolvent, trace class, Hilbert-Schmidt
(see Robert (1987) for an extensive presentation).
But before looking at a more general situation, let
us consider the case of the Schrödinger operator
$S_h = -h^2\Delta + V(x)$. If V is (say, continuous)
bounded from below, $S_h$, which is a priori defined
on $\mathcal{S}(\mathbb{R}^n)$ as a differential operator, admits a unique
self-adjoint extension on $L^2(\mathbb{R}^n)$. We are first
interested in the nature of the spectrum. If
$V(x) \to +\infty$ as $|x| \to \infty$, one can show that $S_h$,
more precisely its self-adjoint realization, has
compact resolvent and its spectrum consists of a
sequence of eigenvalues tending to $\infty$. We are next
interested in the asymptotic behavior of these
eigenvalues.

In the case of the harmonic oscillator, corresponding
to the potential

$$V(x) = \sum_{j=1}^n \mu_j x_j^2 \quad (\text{with } \mu_j > 0)$$

the criterion of compact resolvent is satisfied and the
spectrum is described as the set of

$$\lambda_\alpha(h) = \sum_{j=1}^n \sqrt{\mu_j}\,(2\alpha_j + 1)\,h$$

for $\alpha \in \mathbb{N}^n$.

In this case we also have a complete description
of the normalized associated eigenfunctions which
are constructed recursively starting from the first
eigenfunction corresponding to $\lambda_0(h) = \sum_j \sqrt{\mu_j}\,h$:

$$\phi_0(x; h) = \prod_{j=1}^n \Big(\frac{\sqrt{\mu_j}}{\pi h}\Big)^{1/4} \exp\Big(-\frac{1}{2h}\sum_{j=1}^n \sqrt{\mu_j}\,x_j^2\Big) \qquad [5]$$

The eigenfunction $\phi_0$ is strictly positive and decays
exponentially. Moreover (and here we enter in the
semiclassical world), the local decay in a fixed closed
set avoiding {0} (which is measured by its $L^2$-norm) is
exponentially small as $h \to 0$. In particular, this says
that the eigenfunction lives asymptotically in the set
$\{V(x) \le \lambda(h)\}$. This last set can also be understood as
the projection by the map $(x, \xi) \mapsto x$ of the energy
surface, which is classically attached to the eigenvalue
$\lambda(h)$, that is, $\{(x, \xi) \in \mathbb{R}^{2n} \mid p(x, \xi) = \lambda(h)\}$. This is a
typical semiclassical statement, which will be true in
full generality.


From Quantum Mechanics to Classical 
Mechanics: Semiclassical Mechanics 


Before describing the mathematical tools involved in 
the exploration of the correspondence principle, let 
us describe a few results which are typical in the 
semiclassical context. They concern Weyl's asymptotics
and the localization of the eigenfunctions.


Weyl's asymptotics We start with the case of the
Schrödinger operator $S_h$, but we emphasize that the
h-pseudodifferential techniques are not limited to
this situation.

We assume that $V$ is a $C^\infty$-function on $\mathbb{R}^n$ which
is semibounded and satisfies

\[ \inf V < \liminf_{|x| \to \infty} V(x) \]

The Weyl theorem (which is a basic theorem in
spectral theory) implies that the essential spectrum is
contained in

\[ \Big[ \liminf_{|x| \to \infty} V(x), +\infty \Big[ \]

It is also clear that the spectrum is contained in
$[\inf V, +\infty[$. In the interval

\[ I = \Big[ \inf V, \liminf_{|x| \to \infty} V(x) \Big[ \]

the spectrum is discrete, that is, it has only isolated
eigenvalues with finite multiplicity. For any $E$ in $I$, it is
consequently interesting to look at the counting function
$N_h(E)$ of the eigenvalues contained in $[\inf V, E]$,

\[ N_h(E) = \#\{\lambda_j(h);\ \lambda_j(h) \le E\} \qquad [6] \]
The main semiclassical result is then 


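Theorem 1 With the previous assumptions, we have:

\[ \lim_{h \to 0} h^n N_h(E) = (2\pi)^{-n}\, \omega_n \int_{V(x) \le E} (E - V(x))^{n/2}\, dx \]

where $\omega_n$ denotes the volume of the unit ball in $\mathbb{R}^n$.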
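The Weyl asymptotics can be tested on the one-dimensional harmonic oscillator $-h^2 d^2/dx^2 + x^2$, whose eigenvalues $(2k+1)h$ are explicit. The sketch below is an illustration only; the function names and the midpoint quadrature are assumptions made here. It compares $h N_h(E)$ with $h$ times the Weyl term, using $\omega_1 = 2$; both should be close to $E/2$.

```python
import math

def counting_function(E, h):
    """Number of eigenvalues (2k+1)h <= E of -h^2 d^2/dx^2 + x^2, k = 0, 1, ..."""
    if E < h:
        return 0
    return int(math.floor((E / h - 1.0) / 2.0)) + 1

def weyl_term(E, h, steps=20000):
    """(2 pi h)^{-1} * omega_1 * integral of sqrt(E - x^2) over {x^2 <= E},
    with omega_1 = 2, evaluated by the midpoint rule."""
    a = math.sqrt(E)
    dx = 2.0 * a / steps
    total = sum(math.sqrt(max(E - (-a + (i + 0.5) * dx) ** 2, 0.0))
                for i in range(steps))
    return 2.0 * total * dx / (2.0 * math.pi * h)

E = 1.0
for h in (0.1, 0.01, 0.001):
    print(h, h * counting_function(E, h), h * weyl_term(E, h))
```

For this explicit spectrum the two columns agree to the accuracy of the quadrature, which illustrates the statement of the theorem rather than proving anything about the remainder.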


The main term in the expansion of $N_h(E)$, which
will be denoted by

\[ W_h(E) := (2\pi h)^{-n}\, \omega_n \int_{V(x) \le E} (E - V(x))^{n/2}\, dx \]

is called the Weyl term. It has an analog for the
analysis of the counting function for Laplacians on
compact manifolds (see Quantum Ergodicity and
Mixing of Eigenfunctions and references therein), but
let us emphasize that here $E$ is fixed and that one
looks at the asymptotics as $h \to 0$. In the other case, $h$
is fixed and one looks at the asymptotics as $E \to +\infty$
(note that on a compact manifold and for the
Laplacian, the formula $N_h(E) = N_1(E/h^2)$ permits
switching between these cases).

Although this formula is rather old (first as a
folk theorem), many efforts have been made by
mathematicians to analyze the remainder
$N_h(E) - W_h(E)$ (see Robert (1987), Ivrii (1998) and
references therein), whose behavior is again related to
classical analysis. When $E$ is not a critical value of
$V$, $h^{n-1}(N_h(E) - W_h(E))$ can be shown to be
bounded, and it is even $o(1)$ if the measure
of the periodic points for the flow is 0 (see Ivrii
(1998)).

Beyond the analysis of the counting function,
one is also interested (e.g., in questions concerning
the ground-state energy of an atom with a large
number of particles, N, satisfying the Pauli exclusion
principle (see Stability of Matter)) in other
quantities like the Riesz means, which are defined,
for a given $s \ge 0$, by

\[ N_h^s(E) = \sum_j (E - \lambda_j(h))_+^s \]

The case $s = 0$ corresponds to the counting function.
It is then natural to ask for the asymptotic behavior
as $h \to 0$ of these functions.

We have, for example, the following result
(Helffer-Robert, Ivrii-Sigal, and Ivrii; see Robert
(1987) and Ivrii (1998)), which is written here in a
more Hamiltonian version, when $E$ is not a critical
value of $V$,

\[ N_h^s(E) = (2\pi h)^{-n} \Big( \iint_{p_E(x,\xi) \le 0} (-p_E(x,\xi))^s \, dx\, d\xi + O\big(h^{\inf(1+s,\,2)}\big) \Big) \]

with $p_E(x,\xi) = \xi^2 + V(x) - E$.
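As an illustration of the leading term, one can again take $n = 1$ and $V(x) = x^2$, with eigenvalues $(2k+1)h$. In polar coordinates the phase-space integral of $(-p_E)^s$ over $\{\xi^2 + x^2 \le E\}$ equals $\pi E^{s+1}/(s+1)$, so the leading term is $E^{s+1}/(2h(s+1))$. The sketch below (function names and parameter values are assumptions made for this example) compares it with the Riesz mean computed from the spectrum.

```python
def riesz_mean(E, h, s):
    """Sum of (E - lambda_k)^s over eigenvalues lambda_k = (2k+1)h <= E."""
    total, k = 0.0, 0
    while (2 * k + 1) * h <= E:
        total += (E - (2 * k + 1) * h) ** s
        k += 1
    return total

def phase_space_term(E, h, s):
    """(2 pi h)^{-1} times the integral of (E - xi^2 - x^2)^s over the disc
    {xi^2 + x^2 <= E}; in polar coordinates this is pi E^(s+1) / (s+1)."""
    return E ** (s + 1) / (2.0 * h * (s + 1))

E, h = 1.0, 0.01                      # illustrative values
for s in (0.5, 1.0, 2.0):
    print(s, riesz_mean(E, h, s), phase_space_term(E, h, s))
```

The agreement improves as $s$ grows, in line with the remark that smoother functions of the energy give better expansions.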


Uncertainty principle and Weyl term The Weyl
term can be heuristically understood in the following
way. According to the uncertainty principle, a
"quantum" particle should occupy at least a volume
of order $h^n$ in the phase space with the measure
$dx\,d\xi$ (proportional to $(\sum_j d\xi_j \wedge dx_j)^n$). This
guess is a consequence of the inequality

\[ \frac{h}{2}\,\|u\|^2 \le \Big( \int_{\mathbb{R}} \Big| \Big( \frac{h}{i}\frac{d}{dx} - \xi_0 \Big) u \Big|^2 dx \Big)^{1/2} \Big( \int_{\mathbb{R}} |(x - x_0)\,u|^2\, dx \Big)^{1/2}, \quad \forall u \in \mathcal{S}(\mathbb{R}) \]

expressing the noncommutation of the operators
$((h/i)\,d/dx - \xi_0)$ and (multiplication by) $(x - x_0)$.
When $\|u\| = 1$ and $x_0$ (mean position) and $\xi_0$ (mean
momentum) are defined by $x_0 := \int_{\mathbb{R}} x\,|u|^2\, dx$ and
$\xi_0 := (h/i) \int_{\mathbb{R}} u'(x)\,\bar{u}(x)\, dx$, this inequality expresses
the impossibility for a quantum particle to have a
simultaneous small localization in position and
momentum.

Consequently, the maximal number of "quantum"
particles which can live in the region $\{p_E(x,\xi) \le 0\}$ is
approximately (up to some universal multiplicative
constant) the volume of this region divided by $(2\pi h)^n$.

Lieb-Thirring inequalities and Scott's conjecture In
the case of regular potentials, we have seen that the
quantity $h^n N_h^s(E)$ was asymptotically equal as $h \to 0$
to $L^{\mathrm{cl}}_{s,n} \int (E - V(x))_+^{s+n/2}\, dx$. For other questions
occurring in atomic physics (see Stability of
Matter), one is more interested in the existence of
universal constants $M_{s,n}$ such that

\[ h^n N_h^s(E) \le M_{s,n} \int (E - V(x))_+^{s+n/2}\, dx \]

for any $V$ and any $h$.

The best $M_{s,n}$ (which exists if $s + n/2 > 1$) is
denoted by $L_{s,n}$ (for $s = 0$, this is called the
Cwikel-Lieb-Rozenblum inequality). The semiclassical
result gives the inequality $L_{s,n} \ge L^{\mathrm{cl}}_{s,n}$.

A still open question is the so-called Lieb-Thirring
conjecture: do we have $L_{1,3} = L^{\mathrm{cl}}_{1,3}$? This is related to
the question of the stability of matter (see Stability
of Matter). The last results in this direction have been
obtained quite recently by A Laptev and T Weidl,
who show, for example, the equality for $s \ge 3/2$.

The control, when $s = 1$, of a second term (for more
singular potentials) for $N_h^s(E)$ was the object of the
Scott conjecture, which was solved recently in many
important cases by Hughes, Siedentop-Weikard,
Ivrii-Sigal, and Fefferman-Seco (see Ivrii (1998),
Stability of Matter, and references therein).


Localization of the eigenfunctions The localization
property was already observed in the specific case of
the harmonic oscillator. But this was a consequence
of an explicit description of the eigenfunctions. It
is quite important to have a good description of the
decay of the eigenfunctions (as $h \to 0$) outside the
classically permitted region without having to know
an explicit formula. Various approaches can be used.

The first one fits very well the case of the
Schrödinger operator (more generally h-pseudodifferential
operators with symbols admitting holomorphic
extensions in the $\xi$ variable) and gives exponential
decay. This is based on the so-called Agmon estimates
(developed in the semiclassical context by Helffer-
Sjöstrand and Simon). We shall not say more about
this approach, which is the starting point of the analysis
of the tunneling (see Helffer (1988), Dimassi and
Sjöstrand (1999), and Martinez (2002)).

The second one is an elementary application
of the h-pseudodifferential formalism which will
be described later and leads, for example, to the
following statement. Let $E$ be in $I$ and let
$(\lambda(h_j), \varphi_{(h_j)}(x))$ be a sequence of spectral pairs in
$I \times L^2(\mathbb{R}^n)$, where $h_j \to 0$ as $j \to +\infty$, $\lambda(h_j) \to E$, and
$x \mapsto \varphi_{(h_j)}(x)$ is an $L^2$-normalized eigenfunction
associated with $\lambda(h_j)$. Let $\Omega$ be a relatively compact
set in $\mathbb{R}^n$ such that

\[ V^{-1}(]-\infty, E]) \cap \overline{\Omega} = \emptyset \]

Then, there exists, for every integer $N$, a constant $C_{N,\Omega}$
such that

\[ \|\varphi_{(h_j)}\|_{L^2(\Omega)} \le C_{N,\Omega}\, h_j^N \]

A third one uses the notion of frequency set and
will be discussed later (see also the book of Martinez
(2002) for what can be done with the Fourier-Bros-
Iagolnitzer transform as developed by J Sjöstrand).

Brief Introduction to the 
h-Pseudodifferential Calculus 


For fixed h, the pseudodifferential calculus has a long
history, starting in its modern form in the 1960s. A
fairly complete version of the calculus is presented in
Hörmander (1984). We will emphasize here the
semiclassical aspect of the calculus, that is, the
dependence of the calculus on the parameter $h > 0$.


h-Pseudodifferential Calculus 


Basic calculus: the class $S^0$ We shall mainly discuss
the simplest one, called the $S^0$ calculus. Let us
first say that the $S^0$ calculus is sufficient once we
have suitably (micro)-localized the problem (e.g., by
the functional calculus). Note that it is also
sufficient for the local analysis of many problems
occurring on compact manifolds.

This class of symbols $p$ is simply defined by the
conditions:

\[ |\partial_x^\alpha \partial_\xi^\beta p(x,\xi)| \le C_{\alpha,\beta} \qquad [7] \]

for all $(\alpha,\beta) \in \mathbb{N}^n \times \mathbb{N}^n$. The symbols can possibly
be h-dependent. With this symbol, one can associate
an h-pseudodifferential operator by [3]. This operator
is a continuous operator on $\mathcal{S}(\mathbb{R}^n)$ but can also
be defined by duality on $\mathcal{S}'(\mathbb{R}^n)$.

The first basic analytical result is the Calderón-
Vaillancourt theorem (see Hörmander (1984)) establishing
the $L^2$-continuity. We also mention that if $p$
is in $L^2(\mathbb{R}^{2n})$, the associated operator is Hilbert-
Schmidt. One can also give conditions on $p$ implying
the trace-class property (replace the uniform control
in [7] by a control in $L^1$).

The second important property is the existence of
a calculus. If $a$ is in $S^0$ and $b$ is in $S^0$, then the
composition $a^w(x,hD_x) \circ b^w(x,hD_x)$ of the two
operators is a pseudodifferential operator associated
with an h-dependent symbol $c$ in $S^0$:

\[ a^w(x,hD_x) \circ b^w(x,hD_x) = c^w(x,hD_x;h) \]


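We see here that we immediately meet symbols
admitting expansions in powers of $h$, which we shall
call regular symbols, in the sense that they admit
expansions of the type

\[ a(x,\xi;h) \sim \sum_{j \ge 0} a_j(x,\xi)\, h^j, \qquad b(x,\xi;h) \sim \sum_{j \ge 0} b_j(x,\xi)\, h^j \]

In this case the Weyl symbol $c$ of the composition
has a similar expansion:

\[ c(x,\xi;h) \sim \exp\Big( \frac{ih}{2} \big( D_y \cdot D_\xi - D_x \cdot D_\eta \big) \Big) \big( a(x,\xi;h)\, b(y,\eta;h) \big) \Big|_{x=y;\ \xi=\eta} \]

(with $D = -i\partial$; the sign is fixed so that the first-order term is $(h/2i)\{a,b\}$, in agreement with the commutator rule below).

The symbol $a_0$ is called the principal symbol. At the
level of principal symbols, the rule is simply that
the principal symbol of $a^w \circ b^w$ is the product of
the principal symbols of $a^w$ and $b^w$: $c_0 = a_0 \cdot b_0$.
Another important property is the following correspondence
between the commutator of two operators
and Poisson brackets. The principal symbol of the
commutator $(1/h)(a^w \circ b^w - b^w \circ a^w)$ is $(1/i)\{a_0, b_0\}$,
where $\{f, g\}$ is the Poisson bracket of $f$ and $g$:

\[ \{f, g\}(x,\xi) = H_f\, g = \sum_{j=1}^{n} \big( \partial_{\xi_j} f\, \partial_{x_j} g - \partial_{x_j} f\, \partial_{\xi_j} g \big) \]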

About global classes The class $S^0$ is far from being
sufficient for analyzing the global spectral problem
and we refer the reader to Hörmander (1984) or
Robert (1987) for an extensive presentation of the
theory and for the discussion of other quantizations.
Our initial operators (think of the harmonic oscillator)
do not belong to these classes of pseudodifferential
operators. We are consequently obliged to
construct more general classes including these
examples in order to realize this localization. Once
such a class is introduced, one of the main points to
consider is the existence of a quasi-inverse (or
parametrix) for a suitably defined elliptic operator
of positive order. Following Beals-Fefferman
(see also the most general Hörmander calculus
in Hörmander (1984) and references therein), we
introduce a scale function $(x,\xi) \mapsto m(x,\xi;h)$ (possibly
h-dependent; typically, $m(x,\xi;h) = h^{\ell}\, m_0(x,\xi)$ for some exponent $\ell$)
and $C^\infty$ strictly positive weight functions $\phi$ and
$\Phi$ such that $\phi \cdot \Phi \ge 1$. All these functions are
strictly positive and should satisfy additional
conditions on their variation and growth. The
class of symbols $S^{\mathrm{reg}}(m, \phi, \Phi)$ is defined by

\[ |D_x^\alpha D_\xi^\beta p(x,\xi;h)| \le C_{\alpha,\beta}\, m(x,\xi;h)\, \phi(x,\xi)^{-|\alpha|}\, \Phi(x,\xi)^{-|\beta|} \]

These apparently complicated estimates actually
permit the control of the variation of the symbol
in reference balls defined by

\[ \phi^{-2}(x_0,\xi_0)\,|x - x_0|^2 + \Phi^{-2}(x_0,\xi_0)\,|\xi - \xi_0|^2 \le c \]


Elliptic theory As noted above, the main point is to 
have a large class of invertible operators, such that 
the inverses are also in the class. This is what we call 
an elliptic theory and the typical statement is: 


Theorem 2 Let $P$ be an h-pseudodifferential operator
associated with a symbol $p$ in $S^{\mathrm{reg}}(m, \phi, \Phi)$. We assume
that it is elliptic in the sense that $1/p$ belongs to
$S^{\mathrm{reg}}(1/m, \phi, \Phi)$. Then there exists an h-pseudodifferential
operator $Q$ with symbol in $S^{\mathrm{reg}}(1/m, \phi, \Phi)$, such that

\[ QP = I + R; \qquad PQ = I + S \]

The remainders $R$ and $S$ are pseudodifferential
operators with symbols in

\[ \bigcap_{N} S^{\mathrm{reg}}\Big( \Big( \frac{h}{\phi\,\Phi} \Big)^N, \phi, \Phi \Big) \]

These remainders are called "regularizing." Note
that this notion depends strongly on the choice of the
class of pseudodifferential operators! When $\phi = \Phi = 1$,
we are just inverting modulo a remainder whose norm
in $\mathcal{L}(L^2)$ is $O(h^\infty)$ (or simply $O(h)$ at the first step).
With other weights like $\phi = \Phi = \sqrt{1 + |x|^2 + |\xi|^2}$, we
invert $P$ modulo a remainder, which has, in addition, a
distribution kernel in the Schwartz space $\mathcal{S}(\mathbb{R}^n \times \mathbb{R}^n)$.
The invertibility modulo a compact operator (which
implies the Fredholm property) is a consequence of the
assumption

\[ \lim_{|x| + |\xi| \to +\infty} \phi(x,\xi)\,\Phi(x,\xi) = +\infty \]

The proof is rather easy, once the formalism of
composition and the notion of principal symbol have
been understood. One can indeed start from the
operator $Q_0$ of symbol $1/p$ and observe that $Q_0 P = I + R_1$
holds, with $R_1$ in $\mathrm{Op}^w(S^{\mathrm{reg}}(h/(\Phi \cdot \phi), \phi, \Phi))$. The
operator $(I + R_1)^{-1} Q_0 \sim \big( \sum_{j \ge 0} (-1)^j R_1^j \big) Q_0$ gives
essentially the solution.
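The Neumann-series step of this proof has a simple finite-dimensional analog, sketched below with 2x2 matrices in place of operators. Everything in the sketch is an assumption made for illustration: the matrix `P`, the approximate inverse `Q0` (the inverse of the diagonal part, playing the role of the operator with symbol $1/p$), and the helper names. Truncating the series at order $N$ leaves a defect $QP - I = (-1)^N R_1^{N+1}$.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def neumann_parametrix(Q0, P, order):
    """Return (sum_{j=0}^{order} (-1)^j R1^j) Q0 with R1 = Q0 P - I."""
    n = len(P)
    eye = identity(n)
    Q0P = matmul(Q0, P)
    R1 = [[Q0P[i][j] - eye[i][j] for j in range(n)] for i in range(n)]
    series, power = identity(n), identity(n)
    for j in range(1, order + 1):
        power = matmul(power, R1)
        sign = -1.0 if j % 2 else 1.0
        series = [[series[a][b] + sign * power[a][b] for b in range(n)]
                  for a in range(n)]
    return matmul(series, Q0)

def defect(Q, P):
    """Largest entry of |Q P - I|."""
    n = len(P)
    QP, eye = matmul(Q, P), identity(n)
    return max(abs(QP[i][j] - eye[i][j]) for i in range(n) for j in range(n))

P = [[2.0, 0.1], [0.2, 4.0]]          # illustrative "elliptic" matrix
Q0 = [[0.5, 0.0], [0.0, 0.25]]        # inverse of the diagonal part only
print(defect(Q0, P), defect(neumann_parametrix(Q0, P, 5), P))
```

Here the small off-diagonal part plays the role of the gain $h/(\phi\Phi)$: each additional term of the series improves the remainder by one more power of it.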


Essential Self-Adjointness and Semiboundedness 


We now sketch two applications of this calculus in
spectral theory. We shall usually consider in our
applications an h-pseudodifferential operator $P$,
whose Weyl symbol $p$ is regular, that is, admitting
an asymptotic expansion:

\[ \text{(H0)} \qquad p(x,\xi;h) \sim \sum_{j \ge 0} h^j p_j(x,\xi) \]

(We refer to Robert (1987), Hörmander (1984), and
Dimassi and Sjöstrand (1999) for a more precise
formulation). Moreover, we assume that

\[ \text{(H1)} \qquad (x,\xi) \mapsto p(x,\xi;h) \in \mathbb{R} \]

This implies, as can be immediately seen from [3],
that $p^w$ is symmetric (= formally self-adjoint):

\[ \langle p^w u, v \rangle_{L^2} = \langle u, p^w v \rangle_{L^2}, \quad \forall u, v \in \mathcal{S}(\mathbb{R}^n) \]

The third assumption is that the principal symbol is
bounded from below (and there is no restriction to
assume that it is positive)

\[ \text{(H2)} \qquad p_0(x,\xi) \ge 0 \]


This assumption implies that the operator itself is
bounded from below. This result belongs to the
family of the so-called Gårding inequalities. More
precisely, the assumption (there are other quantizations,
e.g., the anti-Wick quantization, for which
this result becomes trivial, the difference between
the two quantizations being $O(h)$) will basically
give, if $m > 1$, the existence of a constant $C$ such
that, for any $u \in \mathcal{S}(\mathbb{R}^n)$,

\[ \langle Pu, u \rangle_{L^2} \ge -C\, h\, \|u\|^2 \]

Everything is proved if $(x,\xi) \mapsto (p_0 + 1)$ is a scale
function, if the $p_j$ and their derivatives are controlled by
$(p_0 + 1)$:

\[ \text{(H3)} \qquad |\partial_x^\alpha \partial_\xi^\beta p_j(x,\xi)| \le C_{\alpha,\beta,j}\, (p_0 + 1)\, \phi(x,\xi)^{-|\alpha|}\, \Phi(x,\xi)^{-|\beta|} \]

for all $(\alpha,\beta) \in \mathbb{N}^n \times \mathbb{N}^n$, and if there is a suitable
control of the family ($N \in \mathbb{N}$) of symbols

\[ h^{-(N+1)} \Big( p - \sum_{j \le N} h^j p_j \Big) \]

Under these assumptions, the main result is that P 
is, for h small enough, essentially self-adjoint. This 
means that the operator which was initially defined 
on S(R") by the pseudodifferential operator with 
symbol p admits a unique self-adjoint extension. 


The Functional Calculus 


It is well known by the spectral theorem for a self-
adjoint operator $P$ that a functional calculus exists
for Borel functions. What is important here is to find
a class of functions (actually essentially $C_0^\infty$) such
that $f(P)$ is a pseudodifferential operator in the same
class as $P$ with simple rules of computation for the
principal symbol.

We start from the general formula (see
Dimassi and Sjöstrand (1999))

\[ f(P) = -\frac{1}{\pi} \lim_{\epsilon \to 0^+} \iint_{|\mathrm{Im}\, z| \ge \epsilon} \frac{\partial \tilde{f}}{\partial \bar{z}}(x,y)\, (z - P)^{-1}\, dx\, dy \]

which is true for any self-adjoint operator and any $f$
in $C_0^\infty(\mathbb{R})$. Here the function $(x,y) \mapsto \tilde{f}(x,y)$ (note
that $z = x + iy$) is a compactly supported, almost
analytic extension of $f$ to $\mathbb{C}$. This means that $\tilde{f} = f$
on $\mathbb{R}$ and that for any $N \in \mathbb{N}$ there exists a constant
$C_N$ such that

\[ \Big| \frac{\partial \tilde{f}}{\partial \bar{z}}(z) \Big| \le C_N\, |\mathrm{Im}\, z|^N \]

The main result due to Helffer-Robert (see also
Dimassi and Sjöstrand (1999) and references
therein) is that, for $P$ an h-regular pseudodifferential
operator satisfying (H0)-(H3) and $f$ in $C_0^\infty(\mathbb{R})$, the
operator $f(P)$ is an h-pseudodifferential operator,
whose Weyl symbol $p_f(x,\xi;h)$ admits a formal
expansion in powers of $h$:

\[ p_f(x,\xi;h) \sim \sum_{j \ge 0} h^j p_{f,j}(x,\xi) \]

with

\[ p_{f,0} = f(p_0), \qquad p_{f,1} = p_1 \cdot f'(p_0), \qquad p_{f,j} = \sum_{k=1}^{2j-1} (-1)^k (k!)^{-1}\, d_{j,k}\, f^{(k)}(p_0), \quad \forall j \ge 2 \]

where the $d_{j,k}$ are universal polynomial functions of
the symbols $\partial_x^\alpha \partial_\xi^\beta p_\ell$, with $|\alpha| + |\beta| + \ell \le j$.

The main point in the proof is that we can construct,
for $\mathrm{Im}\, z \ne 0$, a parametrix (= approximate inverse) for
$(P - z)$ with a nice control as $\mathrm{Im}\, z \to 0$. The constants
controlling the estimates on the symbols explode
as $\mathrm{Im}\, z \to 0$, but the choice of the almost analytic
extension of $f$ absorbs any negative power of $|\mathrm{Im}\, z|$.

As a consequence, we get that if, for some interval
$I$ and some $\epsilon_0 > 0$,

\[ \text{(H4)} \qquad p_0^{-1}(I + [-\epsilon_0, \epsilon_0]) \text{ is compact} \]

then the spectrum is, for $h$ small enough, discrete in $I$.

In particular, we get that, if $p_0(x,\xi) \to +\infty$ as
$|x| + |\xi| \to +\infty$, then the spectrum of $P_h$ is discrete
($P_h$ has compact resolvent). Under the assumption
(H4), we get more precisely the following theorem.


Theorem 3 Let $P$ be an h-regular pseudodifferential
operator satisfying (H0)-(H4), with $I = [E_1, E_2]$,
then, for any $g$ in $C_0^\infty(]E_1, E_2[)$, we have the
following expansion in powers of $h$:

\[ \mathrm{tr}\,[g(P(h))] \sim h^{-n} \sum_{j \ge 0} h^j\, T_j(g) \quad \text{as } h \to 0 \]

where $g \mapsto T_j(g)$ are distributions in $\mathcal{D}'(]E_1, E_2[)$.
In particular, we have

\[ T_0(g) = (2\pi)^{-n} \iint g(p_0(x,\xi))\, dx\, d\xi \]
\[ T_1(g) = (2\pi)^{-n} \iint g'(p_0(x,\xi))\, p_1(x,\xi)\, dx\, d\xi \]
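The leading term $T_0$ can be illustrated with the one-dimensional harmonic oscillator $-h^2 d^2/dx^2 + x^2$, whose Weyl symbol is exactly $p_0 = \xi^2 + x^2$ and whose eigenvalues are $(2k+1)h$. In polar coordinates $(2\pi)^{-1}\iint g(\xi^2 + x^2)\,dx\,d\xi = \tfrac{1}{2}\int_0^\infty g(u)\,du$. The sketch below compares $\mathrm{tr}\,g(P) = \sum_k g((2k+1)h)$ with $h^{-1}T_0(g)$; the bump function, its support $]0.5, 1.5[$, and the quadrature are assumptions chosen for the example.

```python
import math

def bump(t, a=0.5, b=1.5):
    """A smooth function compactly supported in ]a, b[ (illustrative choice)."""
    if t <= a or t >= b:
        return 0.0
    s = (t - a) / (b - a)
    return math.exp(-1.0 / (s * (1.0 - s)))

def trace_of_g(h, b=1.5):
    """tr g(P) = sum of g over the eigenvalues (2k+1)h of -h^2 d^2/dx^2 + x^2."""
    total, k = 0.0, 0
    while (2 * k + 1) * h < b:
        total += bump((2 * k + 1) * h)
        k += 1
    return total

def leading_term(h, a=0.5, b=1.5, steps=100000):
    """h^{-1} T_0(g) = (1/(2h)) * integral of g, by the midpoint rule."""
    du = (b - a) / steps
    integral = sum(bump(a + (i + 0.5) * du) for i in range(steps)) * du
    return integral / (2.0 * h)

for h in (0.05, 0.01):
    print(h, trace_of_g(h), leading_term(h))
```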


This theorem is just obtained by integration of the
preceding one, because in these cases the trace of a
trace-class pseudodifferential operator $\mathrm{Op}^w(a)$ is
given by the integral of the symbol $a$ over
$\mathbb{R}^{2n} = \mathbb{R}^n_x \times \mathbb{R}^n_\xi$. According to [3], the distribution
kernel is given by the oscillatory integral:

\[ K(x,y;h) = (2\pi h)^{-n} \int \exp\Big( \frac{i}{h}(x - y) \cdot \xi \Big)\, a\Big( \frac{x+y}{2}, \xi; h \Big)\, d\xi \qquad [8] \]

and the trace of $\mathrm{Op}^w(a)$ is the integral over $\mathbb{R}^n$ of the
restriction to the diagonal of the distribution kernel:

\[ K(x,x) = (2\pi h)^{-n} \int_{\mathbb{R}^n} a(x,\xi;h)\, d\xi \]

Of course, one could think of using the theorem
with $g$ the characteristic function of an interval, in
order to get, for example, the behavior of the counting
function attached to this interval. This is not
directly possible; it can be obtained only through
Tauberian theorems (Hörmander (1968), (1984), Ivrii
(1998)) and at the price of additional errors.

Let us, however, remark that, if the function g is not 
regular, then the length of the expansion depends on 
the regularity of g. So it will not be surprising that, by 
looking at the Riesz means, we shall get a better 
expansion when s is large. 

Anyway, one basic interest of functional calculus is
to permit a localization in the energy of the operator.
For a general h-pseudodifferential operator, it could be
difficult to approximate an operator like $\exp(-itP/h)$
by suitable Fourier integral operators, but approximating
$\exp(-itP/h) f(P)$ for suitable compactly supported $f$
could be easier.

Another interest is that for suitable $f$ (possibly
h-dependent) the operator $f(P)$ could have better
properties than the initial operator. This idea will, for
example, be applied for the theorem concerning
clustering. It appears, in particular, very powerful in
dimension 1, where we can, in some interval of energy,
find a function $t \mapsto f(t;h)$ admitting an expansion in
powers of $h$ such that $f(P;h)$ has the spectrum of the
harmonic oscillator. This is a way to get the Bohr-
Sommerfeld conditions (see Helffer-Robert (1987),
together with Maslov (1972), Leray (1981), or the
thesis of A Voros in 1977), which read:

\[ f(\lambda_n(h); h) \sim (2n + 1)\,h \quad \text{modulo } O(h^\infty) \]


h-Fourier Integral Operators 
and Evolution Operators 


Classical Mechanics 


Let us come back to the Hamilton equations [1].
The local existence of solutions is well known. If, in
addition, we assume (H4), the energy conservation
law implies global existence for these solutions, if
the initial data $(y,\eta)$ belong to $p_0^{-1}(I)$.

We recall that $(y,\eta) \mapsto \phi^t(y,\eta) = (x(t,y,\eta), \xi(t,y,\eta))$
defines for any $t$ a canonical transformation, that is, a
diffeomorphism respecting the symplectic 2-form:

\[ \sigma = \sum_j d\xi_j \wedge dx_j \]

We shall denote by $\Lambda^t$ the graph of $\phi^t$, which is a
Lagrangian submanifold (which means that at each
point $m$ of the manifold the restriction of the
symplectic two-form to $T_m \Lambda^t$ is 0) for the 2-form
on $\mathbb{R}^{2n} \times \mathbb{R}^{2n}$: $\sum_j d\eta_j \wedge dy_j - \sum_j d\xi_j \wedge dx_j$.

When the projection $(y,\eta,x,\xi) \mapsto (\eta,x)$ gives a
local system of coordinates for $\Lambda^t$ (and this will
always be the case for $(\eta,x)$ in a compact set and $t$
small enough), one easily finds, using the Lagrangian
character of $\Lambda^t$, a function $(\eta,x) \mapsto S_t(x,\eta)$ such
that

\[ \Lambda^t = \{ (y,\eta,x,\xi) \mid y = \partial_\eta S_t,\ \xi = \partial_x S_t \} \]


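This function is only defined modulo an arbitrary
function of $t$. In order to get a more natural
choice, we consider the Lagrangian submanifold
in $\mathbb{R}^{2n} \times \mathbb{R}^{2n} \times \mathbb{R}^2$ defined as

\[ \Lambda = \{ (y,\eta,x,\xi,t,\tau) \mid (x,\xi) = \phi^t(y,\eta),\ \tau = -p_0(x,\xi) \} \qquad [9] \]

The parametrization of $\Lambda$ by its projection
$(y,\eta,x,\xi,t,\tau) \mapsto (\eta,x,t)$ will now give a natural
function $(\eta,x,t) \mapsto S(t,x,\eta) = S_t(x,\eta)$ describing $\Lambda$ by

\[ \Lambda = \{ (y,\eta,x,\xi,t,\tau) \mid \xi = \partial_x S,\ y = \partial_\eta S,\ \tau = \partial_t S \} \qquad [10] \]

We observe that we can choose

\[ S(0,x,\eta) = x \cdot \eta \qquad [11] \]

and that $S$ is automatically a solution of the Hamilton-
Jacobi equation

\[ (\partial_t S)(t,x,\eta) + p_0(x, \partial_x S(t,x,\eta)) = 0 \qquad [12] \]

also called the eiconal equation.
We also observe the following property (by
comparison of [9] and [10]):

\[ \phi^t\big( \partial_\eta S(t,x,\eta), \eta \big) = \big( x, \partial_x S(t,x,\eta) \big) \]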
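These relations can be checked numerically for the one-dimensional harmonic oscillator $p_0(x,\xi) = \tfrac{1}{2}(\xi^2 + x^2)$, whose flow is a rotation and whose generating function has the closed form $S(t,x,\eta) = -\tfrac{1}{2}(x^2 + \eta^2)\tan t + x\eta/\cos t$ for $|t| < \pi/2$. The sketch below is illustrative: the function names, the sample point, and the finite-difference step are assumptions. It evaluates the residual of the Hamilton-Jacobi equation [12] and the property $\phi^t(\partial_\eta S, \eta) = (x, \partial_x S)$.

```python
import math

def S(t, x, eta):
    """Generating function of the harmonic-oscillator flow, valid for |t| < pi/2."""
    return -0.5 * (x * x + eta * eta) * math.tan(t) + x * eta / math.cos(t)

def p0(x, xi):
    return 0.5 * (xi * xi + x * x)

def flow(t, y, eta):
    """Hamiltonian flow of p0: a rotation in phase space."""
    return (y * math.cos(t) + eta * math.sin(t),
            -y * math.sin(t) + eta * math.cos(t))

def partials(t, x, eta, d=1e-5):
    """Central-difference approximations of (d_t S, d_x S, d_eta S)."""
    dt = (S(t + d, x, eta) - S(t - d, x, eta)) / (2.0 * d)
    dx = (S(t, x + d, eta) - S(t, x - d, eta)) / (2.0 * d)
    de = (S(t, x, eta + d) - S(t, x, eta - d)) / (2.0 * d)
    return dt, dx, de

t, x, eta = 0.4, 0.3, -0.7            # illustrative sample point
dS_dt, dS_dx, dS_deta = partials(t, x, eta)
hj_residual = dS_dt + p0(x, dS_dx)    # equation [12]
image = flow(t, dS_deta, eta)         # should equal (x, d_x S)
print(hj_residual, image, (x, dS_dx))
```

At $t = 0$ the formula reduces to $S(0,x,\eta) = x\eta$, in agreement with [11].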


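We have actually an explicit expression of $S(t,x,\eta)$ in
terms of the inverse $y(t,x,\eta)$ of the map $y \mapsto x(t,y,\eta)$:

\[ S(t,x,\eta) = y(t,x,\eta) \cdot \eta + \int_0^t \Big[ \xi(s,y,\eta) \cdot (\partial_\xi p)\big( x(s,y,\eta), \xi(s,y,\eta) \big) - p(y,\eta) \Big]\, ds \,\Big|_{y = y(t,x,\eta)} \]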
For the harmonic oscillator, easy computations give

\[ p(x,\xi) = \tfrac{1}{2}(\xi^2 + x^2), \qquad H_p = (\xi, -x) \]
\[ \phi^t(y,\eta) = (y\cos t + \eta\sin t,\ -y\sin t + \eta\cos t) \]

and

\[ S(t,x,\eta) = -\tfrac{1}{2}(x^2 + \eta^2)\tan t + \frac{x \cdot \eta}{\cos t} \]


Fourier Integral Operators


We have already given in [8] the distribution kernel
of an h-pseudodifferential operator. It appears useful
to generalize this point of view by considering more
generally objects defined similarly as

\[ K(x,y;h) = (2\pi h)^{-(n+N)/2} \int_{\mathbb{R}^N} \exp\Big( \frac{i}{h}\,\phi(x,y,\theta) \Big)\, a(x,y,\theta;h)\, d\theta \]


There are a lot of examples entering in this framework.
The representation of the metaplectic group in
$L^2(\mathbb{R}^n)$ appears to be in this class, with the specificity
that the phase is quadratic (Guillemin and Sternberg
(1977)). A quite elementary case corresponds to the
case when $N = 0$ and $\phi(x,y) = x \cdot y$. No $\theta$ variable is
present and so no integration appears with respect to
$\theta$. When $a = 1$, this defines essentially the Fourier
transform. Under suitable conditions on $\phi$ and $a$, one
can show that the associated operators are continuous
on $\mathcal{S}(\mathbb{R}^n)$ (this is, of course, the case for the
Fourier transform). This was done by Asada and
Fujiwara, who transposed the theory developed by
Hörmander (1971) to this context, and we should
also mention the older (but more formal) work by
Maslov (1972) (see also Leray (1981)). We actually
do not need it in the semiclassical context because
the case when the amplitude has compact support
is sufficient.

The first basic step is to look, thinking of
the stationary-phase theorem, which gives the
main contribution as $h \to 0$ in this "formal integral"
(see Stationary Phase Approximation), at the critical
set $C_\phi$:

\[ C_\phi = \{ (x,y,\theta) \in \mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}^N \mid (\partial_\theta \phi)(x,y,\theta) = 0 \} \]

In the case of a pseudodifferential operator, we find
that it is included in $\{x = y\}$. Then we associate the
canonical object, which is a Lagrangian submanifold
called $\Lambda_\phi$ and defined as

\[ \Lambda_\phi = \{ (x,\xi,y,\eta) \mid \exists \theta \text{ s.t. } \xi = \nabla_x \phi(x,y,\theta),\ \eta = -\nabla_y \phi(x,y,\theta),\ \nabla_\theta \phi(x,y,\theta) = 0 \} \]

The assumptions on $\phi$ (which are omitted here) are
given in order to get that $\Lambda_\phi$ is a regular manifold at
least on the support of $a$. The associated operators
are called Fourier integral operators (FIOs).
L. Hörmander (1971, 1984) has developed a general
and more intrinsic machinery but with a homogeneity
condition on the phase which is irrelevant in
the semiclassical context. This theory permits also
the reduction to normal forms for Hamiltonians in
continuation of what can be done in classical
mechanics.


Quantum Evolution 


We just sketch how one approximates the operator
$\exp(-itP/h)$ by an FIO. The formal construction is
probably rather old (Maslov 1972, Fedoryuk and
Maslov 1981) but the rigorous approach with
estimates of the remainders was first considered by
J Chazarain with rather strong assumptions. It was
later realized that we need only a local
approximation of this operator and everything
becomes easier.

The first approach, followed by Helffer-Robert (see
Robert (1987)), is to localize in energy, within the
functional calculus associated with the operator $P$. If $I$ is
an interval and $\chi$ has compact support in $I$, it
appears to be easier to approximate $\exp(-itP/h)\chi(P)$
when $P$ satisfies (H4) in a neighborhood of $I$.

We no longer need assumptions at $\infty$, and
the composition by $\chi(P)$ localizes the construction.

Although this construction is simple because we
remain within a functional calculus which involves
only functions of $P$, it is not always sufficient to
localize in energy. We have then to localize through
more general h-pseudodifferential operators and
consider $\exp(-itP/h)\,a^w(x,hD_x)$, where $a$ is a symbol
with compact support. We shall quickly develop
the first approach. The result is that one can
approximate $U_\chi(t) := \chi(P)\exp(-itP/h)$ by a Fourier
integral operator of the form

\[ K_\chi(t,x,y;h) = (2\pi h)^{-n} \int \exp\Big( \frac{i}{h}\big( S(t,x,\eta) - y \cdot \eta \big) \Big)\, d_\chi(t,x,\eta;h)\, d\eta \]

with $d_\chi \sim \sum_j d_{\chi,j}\, h^j$, in order to have

\[ \Big\| \chi(P)\exp\Big( -\frac{itP}{h} \Big) - K_\chi \Big\|_{\mathcal{L}(L^2)} = O(h^\infty) \]

Writing that $U_\chi(t)$ is a solution of $(hD_t + P)U_\chi = 0$,
$U_\chi(0) = \chi(P)$, and expanding in powers of $h$, one
gets a sequence of equations permitting to determine
recursively the symbols. The first one was analyzed
in [12] and reads, in the case when $P = -h^2\Delta + V$:

\[ (\partial_t S)(t,x,\eta) + |\nabla_x S(t,x,\eta)|^2 + V(x) = 0 \]

with the initial condition $S(0,x,\eta) = x \cdot \eta$.

This has been solved for $t$ small enough. The other
equations are called transport equations. The first
one is, for $a(t,x,\eta) = d_{\chi,0}(t,x,\eta)$,

\[ \partial_t a + (\partial_\xi p_0)\big( x, \partial_x S(t,x,\eta) \big) \cdot \partial_x a + c\,a = f \]

with initial condition $a(0,x,\eta) = \chi(p_0(x,\eta))$.

This type of equation is easily solved by integration
along the integral curves of the vector field
$\partial_t + (\partial_\xi p_0)(x, \partial_x S(t,x,\eta)) \cdot \partial_x$.

Applications 
The Frequency Set 


We have already met the question of localization of
the eigenfunctions. It appears important to give this
localization, not only in position (in a domain of $\mathbb{R}^n$)
but directly in the phase space. This can be
described by the notion of frequency set attached
to a bounded family $u_h$ of functions in $L^2(\mathbb{R}^n)$ (or
more generally of distributions in $\mathcal{S}'(\mathbb{R}^n)$). Here $h$
belongs to an interval $]0, h_0]$ or more generally to a
subset of $\mathbb{R}^+$ having 0 as accumulation point.


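Definition 4 We shall say that $(x_0,\xi_0) \in \mathbb{R}^n \times \mathbb{R}^n$
does not belong to the frequency set of the family $u_h$,
and write $(x_0,\xi_0) \notin FS(u_h)$, if there exists a compactly
supported function $\phi$ equal to 1 in a neighborhood of
$x_0$ and a neighborhood $\mathcal{V}$ of $\xi_0$ in which the h-Fourier
transform of $\phi\,u_h$ satisfies, as $h \to 0$,

\[ \int \exp\Big( -i\,\frac{x \cdot \xi}{h} \Big)\, \phi(x)\, u_h(x)\, dx = O(h^\infty) \quad \text{in } \mathcal{V} \]

For example, the frequency set $FS(u_h)$ of
$u_h(x) = \chi(x)\exp(i\theta(x)/h)$ with compactly supported
$\chi$ is contained in $\{(x,\xi) \mid x \in \mathrm{supp}\,\chi,\ \xi = \nabla_x \theta(x)\}$, and
the frequency set of the coherent state,

\[ \psi_{y,\eta,h}(x) := (\pi h)^{-n/4} \exp\Big( i\,\frac{(x - y) \cdot \eta}{h} \Big) \exp\Big( -\frac{(x - y)^2}{2h} \Big) \]

is reduced to a point $(y,\eta)$.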

In this semiclassical context, this notion seems
to have been introduced by Guillemin and Sternberg
(1977) and is further discussed in the book of Robert
(1987) (see references therein). This is the semiclassical
analog of the well-known notion of wave front set
of a distribution introduced by Hörmander (1984)
in the $C^\infty$-category for describing the singularities
of a distribution, but note that a major difference is
that the frequency set is attached to a family. If $P$ is
an h-pseudodifferential operator with symbol in $S^0$, it
is possible, as a consequence of the elliptic theory,
to prove that $FS(P u_h) \subset FS(u_h)$. For an FIO $F$
attached to a canonical relation $\kappa$, we get similarly
$FS(F u_h) \subset \kappa(FS(u_h))$.

We also get a microlocal version of the localization 
result for the eigenfunctions mentioned in the first 
section (using again the parametrix construction). 


Theorem 5 Let $E$ be in $I$ and let $(\lambda(h_j), \varphi_{(h_j)}(x))$ be a
sequence in $I \times L^2(\mathbb{R}^n)$, where $\lambda(h_j) \to E$ and $h_j \to 0$ as
$j \to \infty$, and $x \mapsto \varphi_{(h_j)}(x)$ is an eigenfunction associated with
$\lambda(h_j)$, with norm 1. Then

\[ FS(\varphi_{(h_j)}) \subset p_0^{-1}(E) \]

Moreover, the frequency set of the family $\varphi_{(h_j)}$ is
invariant under the Hamiltonian flow $\phi^t$.


The last statement in the above theorem is the
analog of the theorem on the propagation of
singularities for the solution of a partial differential
equation (PDE) (see Hörmander (1984)) and is a
consequence of the Egorov theorem, which will be
presented in the next subsection.

Another remarkable property (see, e.g., the
report on the lecture of T Paul in Rauch and Simon
(1997)) is that, say, in the case of dimension 1, when $P$ is a
harmonic oscillator, $\exp(-i(t/h)P)\,\psi_{y,\eta,h}$ is a
coherent state attached to $\phi^t(y,\eta)$.


Egorov's Theorem 


Egorov's theorem plays a central role in the classical
theory of PDE by permitting to reduce the study of
general differential operators to the study of simpler
model operators, the simplest one being $\partial/\partial x_n$ (see
Hörmander (1984)). We use it here in a simple form,
given in the semiclassical context by Robert (1987),
and which will play an important role in the study
of ergodic situations (see Quantum Ergodicity and
Mixing of Eigenfunctions, and references therein).
The theorem is the following:


Theorem 6 Let $P$ satisfy assumptions (H0)-(H3).
For all $a$ in $S^0$ with compact support and all $t \in \mathbb{R}$,
we have

\[ \Big\| \exp\Big( -i\,\frac{t}{h}\,P \Big)\, a^w(x,hD_x)\, \exp\Big( i\,\frac{t}{h}\,P \Big) - a_t^w(x,hD_x) \Big\|_{\mathcal{L}(L^2)} = O(h) \]

where

\[ a_t(x,\xi) = a(\phi^t(x,\xi)) \]

and $\phi^t$ is the flow of $H_{p_0}$, where $p_0$ is the principal
symbol of $P$.


The proof is based on the study of the operator
$\exp(-i(t/h)P)\, a^w(x,hD_x)\, \exp(i(t/h)P)$, which appears
as the composition of three FIOs. But the Lagrangian
manifold associated with this composition is the graph
of the identity, and this is consequently a pseudodifferential
operator whose "principal" symbol can be
computed modulo $O(h)$ as $a(\phi^t(x,\xi))$. As an immediate
consequence, $FS(\exp(-itP/h)\,u_h) = \phi^t(FS(u_h))$.




The Poisson Relation 


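We start from the harmonic oscillator

\[ \tfrac{1}{2}\Big( -h^2 \frac{d^2}{dx^2} + x^2 \Big) \]

Its spectrum is given by $(n + 1/2)h$ ($n \ge 0$). Its
symbol is $p_0(x,\xi) = (1/2)(\xi^2 + x^2)$ and the corresponding
flow, for any strictly positive level $E$, is
periodic with primitive period $2\pi$. The quantity we
are interested in is

\[ S_h(t) := \sum_{j \in \mathbb{N}} \chi\big( (j + \tfrac{1}{2})h \big) \exp\big( it(j + \tfrac{1}{2}) \big) \]

Using the classical Poisson relation (with $\hat{f}(k) = \int e^{-iky} f(y)\, dy$),

\[ \sum_{k \in \mathbb{Z}} \hat{f}(k) \exp(ikx) = (2\pi) \sum_{k \in \mathbb{Z}} f(x + 2k\pi) \]

one shows rather easily that the frequency set of $S_h$ is

\[ FS(S_h) = \{ (2k\pi, \tau) \mid \tau > 0,\ \tau \in \mathrm{supp}\,\chi,\ k \in \mathbb{Z} \} \cup (\mathbb{R} \times \{0\}) \]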

This admits the following generalization, initiated in 
this context by Chazarain. 


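Theorem 7 Let $P$ satisfy (H0)-(H4). Let $\chi$ be a
function with compact support in $I$ and let
$t \mapsto f_\chi(t;h)$ be the family of distributions defined by

\[ f_\chi(t;h) = \mathrm{tr}\Big( \exp\Big( \frac{itP}{h} \Big)\, \chi(P) \Big) \]

Then $FS(f_\chi)$ is contained in

\[ \{ (t,\tau) \mid \tau \in \mathrm{supp}(\chi) \text{ and } \exists (x,\xi) \text{ s.t. } p_0(x,\xi) = \tau,\ \phi^t(x,\xi) = (x,\xi) \} \]

According to the definition, we have to study

\[ \int \exp\Big( -\frac{it\tau}{h} \Big)\, \theta(t)\, f_\chi(t;h)\, dt \]

This takes the form

\[ \int e(t,x,\eta)\, \exp\Big( \frac{i}{h}\big( -t\tau + S(t,x,\eta) - x \cdot \eta \big) \Big)\, dt\, dx\, d\eta \]

and can be analyzed by a nonstationary-phase
theorem, in order to determine for which values of
$\tau$ the quantity is $O(h^\infty)$.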


Gutzwiller's Formula 


The Gutzwiller formula was established formally by
Gutzwiller (1971). It then appears in the context of
high-energy spectral asymptotics in contributions of
Colin de Verdière, Chazarain, and Duistermaat
and Guillemin (see Duistermaat and Guillemin (1975),
Hörmander (1984), Guillemin and Sternberg (1977);
see also Semi-Classical Spectra and Closed Orbits and
Quantum Ergodicity and Mixing of Eigenfunctions). In
the semiclassical context, the simplest statement (cf.
Chazarain, Helffer-Robert, Guillemin-Uribe, Meinrenken,
Paul-Uribe, Dozias, Combescure-Ralston-Robert;
see Robert (1987), Rauch and Simon (1997),
Dimassi and Sjöstrand (1999), and the recent article
by Combescure et al. (1999) for techniques involving
coherent states) can be presented in the following way.
For a noncritical E, we introduce the energy surface
$W_E = \{w \in T^*\mathbb{R}^n \mid p_0(w) = E\}$. Let P(h) be an
h-pseudodifferential operator satisfying (H0)-(H4), with I = {E}.
We also assume that

(C1) The restriction of the flow $\phi^t$ of $H_{p_0}$ to $W_E$ is clean.
(A flow $\phi^t$, associated with a $C^\infty$-vector field X
on a manifold W, is called clean if the two
following properties are satisfied:

• the set $\Gamma = \{(t,w) \in \mathbb{R} \times W \mid \phi^t(w) = w\}$ is a
submanifold of $\mathbb{R} \times W$;

• in each point $\gamma = (t,w)$ of $\Gamma$, the tangent
space to $\Gamma$ is given by
$T_\gamma\Gamma = \{(\tau, v) \in \mathbb{R} \times T_w W \mid \tau X(w) + (D\phi^t)(w) \cdot v = v\}$.)


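Then there exists a sequence of distributions
$\gamma_k \in \mathcal{D}'(\mathbb{R})$, such that, for all $\varphi \in \mathcal{S}(\mathbb{R})$ with
compactly supported Fourier transform, we have the
asymptotic expansion in powers of h:

$$\sum_{\lambda_j(h) \in [E - \epsilon_0/2,\, E + \epsilon_0/2]} \varphi\big(h^{-1}(\lambda_j(h) - E)\big) \sim h^{1-n} \sum_{k=0}^{\infty} \gamma_k(\hat\varphi)\, h^k \qquad [13]$$

Moreover, the supports of the distributions $\gamma_k$ are
contained in the set of the periods of the periodic
trajectories of the flow contained in $W_E$.

Actually, the proof gives more information on the
structure of the different distributions. Let us just
write the formula for $\gamma_0$:

$$\gamma_0 = (2\pi)^{-n}\, \frac{d}{d\lambda}\left(\int_{p_0(x,\xi) \leq \lambda} dx\, d\xi\right)\bigg|_{\lambda = E}\, \delta_0$$

where $\delta_0$ is the Dirac measure at 0.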


Clustering of Eigenvalues 


We shall mention one typical result due to Chazarain-Helffer-Robert
in this context, but inspired by
previous results obtained for the Laplacian on compact
manifolds (see Semi-Classical Spectra and Closed
Orbits, Quantum Ergodicity and Mixing of Eigenfunctions
and references therein, including Chazarain,
Duistermaat-Guillemin, and Colin de Verdière).

Clustering means that the spectrum is concentrated
around a specific sequence tending to $\infty$. This was
observed in the case of the Laplacian on the sphere
$S^{n-1}$ by explicit computations. Here we assume
that, with $I = [E_1, E_2]$, the conditions (H0)-(H4) are
satisfied and that

• (H5) $[E_1, E_2]$ does not meet the set of critical
values of $p_0$.

• (H6) $\forall E \in [E_1 - \epsilon, E_2 + \epsilon]$, $W_E$ is connected.

• (H7) $\forall E \in [E_1 - \epsilon, E_2 + \epsilon]$, the Hamiltonian flow
associated with $p_0$ is periodic, with period $T(E) > 0$,
on $W_E$ (with $T(E)$ bounded).

• (H8) $\forall E \in [E_1 - \epsilon, E_2 + \epsilon]$, the subprincipal symbol $p_1$
vanishes on $W_E$.


Then, under these conditions, one first observes that for
a suitable $C^\infty$-function f defined in a neighborhood of
$[E_1, E_2]$, the period of the Hamiltonian flow associated
with $f(p_0)$ can be chosen as constant and equal to $2\pi$.
Extending the function f suitably, one can then state the
following result of Chazarain-Helffer-Robert:

Theorem 8 There exist $h_0$ and C such that, for
$0 < h < h_0$,

$$\sigma(f(P(h))) \cap [E_1, E_2] \subset \bigcup_{k \in \mathbb{Z}} I_k(h)$$

where

$$I_k(h) = \left[-\frac{S}{2\pi} + \frac{1}{4}\mu h + kh - Ch^2,\ -\frac{S}{2\pi} + \frac{1}{4}\mu h + kh + Ch^2\right],$$

$$S = \int_\gamma \xi\, dx - 2\pi E$$

for some (hence for any) periodic trajectory $\gamma$ of period
$2\pi$, and $\mu$ is the Maslov index of this trajectory.

Moreover, one can compute the multiplicity in each
of the intervals $I_k(h)$. The property remains true
(e.g., Dozias proved this (see Rauch and Simon
(1997))) in the case when the assumption is made only
for one energy E, but in intervals $[E - \alpha h, E + \alpha h]$,
where $\alpha$ can be large but h is small enough.
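As a sanity check, the clustering statement can be tested on the harmonic oscillator, whose spectrum $(n + 1/2)h$ is known. The sketch below is an illustration under stated assumptions: it reads the intervals of Theorem 8 as centred at $-S/(2\pi) + \mu h/4 + kh$, takes $f$ to be the identity (the flow of $p_0 = (\xi^2 + x^2)/2$ already has period $2\pi$), and uses the standard Maslov index $\mu = 2$ of a simple closed orbit in the plane. The helper name `action_integral` is hypothetical.

```python
import math

def action_integral(E, steps=2000):
    """Riemann sum of the integral of xi dx along the closed orbit of
    p0 = (xi^2 + x^2) / 2 at energy E, parametrised by time t in [0, 2*pi)."""
    r = math.sqrt(2 * E)
    dt = 2 * math.pi / steps
    total = 0.0
    for m in range(steps):
        t = m * dt
        xi = -r * math.sin(t)       # xi(t) on the orbit x(t) = r cos t
        dx_dt = -r * math.sin(t)    # dx/dt = xi by Hamilton's equations
        total += xi * dx_dt * dt
    return total

E = 1.3
S = action_integral(E) - 2 * math.pi * E   # S = int xi dx - 2*pi*E
mu = 2                                     # assumed Maslov index of the orbit
h = 0.01
centers = [-S / (2 * math.pi) + mu * h / 4 + k * h for k in range(5)]
exact = [(n + 0.5) * h for n in range(5)]
print(S, centers, exact)
```

The action integral equals the enclosed area $2\pi E$, so $S$ vanishes and the cluster centres reduce to $(k + 1/2)h$, in agreement with the exact eigenvalues.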


Remark 1 These results appear first in the context of
high energy for Laplacians on compact manifolds. After
illuminating contributions by physicists like Balian-Bloch,
the main ideas (see the presentation in Semi-Classical
Spectra and Closed Orbits) appear in the
works of Colin de Verdière, Chazarain, Duistermaat-Guillemin
(1975), and Weinstein (see also Hörmander
(1984) and Quantum Ergodicity and Mixing of Eigenfunctions).
The proof given in the semiclassical context
is actually more general (it contains the case of the
Laplacian on a Riemannian manifold) and shows that
the results are true for general Hamiltonians.


Remark 2 (the case of dimension 1). In this
particular case, the flow is periodic and the above
theorems give the localization of the problem predicted
by the Bohr-Sommerfeld relations, and the computation
of the multiplicity gives the value 1 for h small enough.
This point of view was developed by Helffer-Robert
(1987) (see Semi-Classical Spectra and Closed Orbits).


Similar properties have been extended to the case
of integrable systems by Colin de Verdière in the
high-energy context and in the semiclassical context
by Charbonnel and Ivrii (see Ivrii (1998), Dimassi
and Sjöstrand (1999), and references therein).


Remark 3 Another interesting application of semiclassical
analysis concerns the Schnirelman theorem
treating the case when the flow is ergodic. We refer
the reader to Quantum Ergodicity and Mixing
of Eigenfunctions for references and to Helffer-Martinez-Robert
(see Rauch and Simon (1997) for
references) for the specific statement for general
Hamiltonians in semiclassical analysis.


Conclusions and Suggestions 
for Further Reading 


In this brief survey we have tried to present some of the
foundational techniques appearing in "mathematical"
semiclassical analysis. Of course, this is very
limited, and semiclassical methods go far beyond the
verification of the correspondence principle. One can
refer to semiclassical analysis for many other problems
where the same analysis (with a small parameter h) is
relevant but where h is no longer the Planck constant.
This could be a flux (Harper's equation) or the inverse
of a flux, the inverse of a mass (Born-Oppenheimer
approximation), of an energy, or of a number of
particles. We have not developed this point of view here.

The books given in the bibliography will allow the
reader to discover other fields. The books by Robert
(1987), Helffer (1988) and Dimassi and Sjöstrand
(1999) present the basic statements of the theory. The
book by Martinez (2002) is more "microlocal" in spirit.
The lectures published in Rauch and Simon (1997) give
a rather good idea of the state of the art in the middle of the
1990s, and we also refer the reader to other articles in
this encyclopedia for the presentation of resonances
(see Resonances), spectral problems connected with
ergodicity (see Quantum Ergodicity and Mixing of
Eigenfunctions), Kolmogorov-Arnol'd-Moser theory
(see Normal Forms and Semi-Classical Approximation),
and trace formulas (see Semi-Classical Spectra
and Closed Orbits). The book by Ivrii (1998) gives the
most sophisticated theorems on the counting functions
(including boundaries, singularities, ...) but is
written only for specialists.


See also: Normal Forms and Semiclassical
Approximation; Quantum Ergodicity and Mixing of
Eigenfunctions; Resonances; Schrödinger Operators;
Semiclassical Spectra and Closed Orbits; Stability of
Matter; Stationary Phase Approximation.


Definitions


The Hubbard model is a standard theoretical model 
for strongly interacting electrons in a solid. It is a 
minimum model which takes into account both 
quantum many-body effects and strong nonlinear 
interaction between electrons. Here we review rigor- 
ous results on the Hubbard model, placing main 
emphasis on magnetic properties of the ground states. 

Let the lattice $\Lambda$ be a finite set whose elements
$x, y, \ldots \in \Lambda$ are called sites. Physically speaking,
each site corresponds to an atomic site in a crystal.
The Hubbard model is based on the simplest tight-binding
description of electrons (Figure 1), where a
single state is associated with each site.

For each $x \in \Lambda$ and $\sigma \in \{\uparrow, \downarrow\}$, we define the
creation and the annihilation operators $c^\dagger_{x,\sigma}$ and
$c_{x,\sigma}$, respectively, for an electron at site x with
spin $\sigma$. ($A^\dagger$ is the adjoint or the Hermitian conjugate
of A.) These operators satisfy the canonical
anticommutation relations

$$\{c^\dagger_{x,\sigma}, c_{y,\tau}\} = \delta_{x,y}\, \delta_{\sigma,\tau}, \qquad \{c^\dagger_{x,\sigma}, c^\dagger_{y,\tau}\} = \{c_{x,\sigma}, c_{y,\tau}\} = 0 \qquad [1]$$

for any $x, y \in \Lambda$ and $\sigma, \tau = \uparrow, \downarrow$, where $\{A, B\} = AB + BA$.
The number operator is defined by

$$n_{x,\sigma} = c^\dagger_{x,\sigma} c_{x,\sigma} \qquad [2]$$

which has eigenvalues 0 and 1.

The Hilbert space of the model is constructed as
follows. Let $\Phi_{vac}$ be a normalized vector state which
satisfies $c_{x,\sigma}\Phi_{vac} = 0$ for any $x \in \Lambda$ and $\sigma = \uparrow, \downarrow$.
Physically, $\Phi_{vac}$ corresponds to a state where there
are no electrons in the system. For arbitrary subsets
$\Lambda_\uparrow, \Lambda_\downarrow \subset \Lambda$, we define

$$\Phi_{\Lambda_\uparrow, \Lambda_\downarrow} = \Big(\prod_{x \in \Lambda_\uparrow} c^\dagger_{x,\uparrow}\Big) \Big(\prod_{x \in \Lambda_\downarrow} c^\dagger_{x,\downarrow}\Big) \Phi_{vac} \qquad [3]$$

in which sites in $\Lambda_\uparrow$ are occupied by up-spin
electrons and sites in $\Lambda_\downarrow$ by down-spin electrons.
We fix the electron number $N_e$, which is an integer
satisfying $0 < N_e \leq 2|\Lambda|$. (We denote by |S| the
number of elements in a set S.) The Hilbert space
for the system with $N_e$ electrons is spanned by the
basis states [3] with all subsets $\Lambda_\uparrow$ and $\Lambda_\downarrow$ such that
$|\Lambda_\uparrow| + |\Lambda_\downarrow| = N_e$.

Figure 1 A highly schematic figure which explains the philosophy
of tight-binding description. (a) A single atom which has
multiple electrons in different orbits. (b) When atoms come
together to form a solid, electrons in the black orbits become
itinerant, while those in the light gray orbits are still localized at the
original atomic sites. Electrons in the gray orbits are mostly
localized around the atomic sites, but tunnel to nearby gray orbits
with nonnegligible probabilities. (c) We only consider electrons in
the gray orbits, which are expected to play essential roles in
determining various aspects of low-energy physics of the system.
(d) If the gray orbit is nondegenerate, we get a lattice model in
which electrons live on lattice sites and hop from one site to
another. In a simplified treatment of a metal, the black and the
gray orbits correspond to the 4s and the 3d bands, respectively.

We define total spin operators $S_{tot} = (S^{(1)}_{tot}, S^{(2)}_{tot}, S^{(3)}_{tot})$ by

$$S^{(\alpha)}_{tot} = \frac{1}{2} \sum_{x \in \Lambda} \sum_{\sigma, \tau = \uparrow, \downarrow} c^\dagger_{x,\sigma}\, (p^{(\alpha)})_{\sigma,\tau}\, c_{x,\tau} \qquad [4]$$

for $\alpha = 1, 2,$ and 3. Here $p^{(\alpha)}$ are the Pauli matrices

$$p^{(1)} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad p^{(2)} = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad p^{(3)} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \qquad [5]$$

The operators $S_{tot}$ are the generators of global SU(2)
rotations of the spin space. As usual, we denote the
eigenvalue of $(S_{tot})^2$ as $S_{tot}(S_{tot} + 1)$. The maximum
possible value of $S_{tot}$ is $S_{max} = N_e/2$ when $N_e \leq |\Lambda|$,
and $S_{max} = |\Lambda| - (N_e/2)$ when $N_e \geq |\Lambda|$.

The most general Hamiltonian of the Hubbard
model is

$$H = -\sum_{x, y \in \Lambda} \sum_{\sigma = \uparrow, \downarrow} t_{x,y}\, c^\dagger_{x,\sigma} c_{y,\sigma} + \sum_{x \in \Lambda} U_x\, n_{x,\uparrow} n_{x,\downarrow} \qquad [6]$$

Here the first term describes quantum-mechanical
motion of electrons which hop around the lattice
according to the amplitude $t_{x,y} = t_{y,x} \in \mathbb{R}$. Usually,
$t_{x,y}$ is nonnegligible only when the two sites x and y
are close to each other. The second term represents
nonlinear interaction between electrons. There is an
increase in energy by $U_x \in \mathbb{R}$ when the site x is
occupied by both an up-spin electron and a down-spin
electron. We usually set $U_x > 0$ to mimic (screened)
Coulomb interaction between electrons.

The Hamiltonian H commutes with the total spin
operators $S^{(\alpha)}_{tot}$ for $\alpha = 1, 2,$ and 3. One can thus
investigate simultaneous eigenstates of $(S_{tot})^2$ and H.
For $S_{tot}$ in the allowed range, we denote by $E_{min}(S_{tot})$
the lowest possible energy among the states $\Phi$ which
satisfy $(S_{tot})^2 \Phi = S_{tot}(S_{tot} + 1)\Phi$.
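The SU(2) invariance of the Hamiltonian can be verified numerically on a small lattice. The following is a minimal sketch, not a production implementation: it assumes a Jordan-Wigner matrix representation of the fermion operators, orders the modes as (site, spin) with index `2*x + s`, and uses illustrative values for the hopping matrix `t` and the interactions `U`. The function names `annihilators` and `hubbard_and_spin` are hypothetical.

```python
import numpy as np
from functools import reduce

def annihilators(n_modes):
    """Jordan-Wigner matrices c_0, ..., c_{n_modes-1} obeying the
    canonical anticommutation relations."""
    identity = np.eye(2)
    parity = np.diag([1.0, -1.0])                # (-1)^n on a single mode
    lower = np.array([[0.0, 1.0], [0.0, 0.0]])   # occupied -> empty
    return [reduce(np.kron, [parity] * k + [lower] + [identity] * (n_modes - k - 1))
            for k in range(n_modes)]

def hubbard_and_spin(t, U):
    """Return the Hubbard Hamiltonian and the three total spin operators."""
    n_sites = len(U)
    c = annihilators(2 * n_sites)
    cdag = [op.T for op in c]                    # real matrices: adjoint = transpose

    def mode(x, s):
        return 2 * x + s                         # s = 0 for up, 1 for down

    dim = 4 ** n_sites
    H = np.zeros((dim, dim), dtype=complex)
    for x in range(n_sites):
        for y in range(n_sites):
            for s in (0, 1):
                H -= t[x][y] * (cdag[mode(x, s)] @ c[mode(y, s)])
        n_up = cdag[mode(x, 0)] @ c[mode(x, 0)]
        n_dn = cdag[mode(x, 1)] @ c[mode(x, 1)]
        H += U[x] * (n_up @ n_dn)
    pauli = [np.array([[0, 1], [1, 0]], dtype=complex),
             np.array([[0, -1j], [1j, 0]]),
             np.array([[1, 0], [0, -1]], dtype=complex)]
    spins = []
    for p in pauli:
        S = np.zeros((dim, dim), dtype=complex)
        for x in range(n_sites):
            for s in (0, 1):
                for r in (0, 1):
                    S += 0.5 * p[s, r] * (cdag[mode(x, s)] @ c[mode(x, r)])
        spins.append(S)
    return H, spins

# Illustrative three-site example with nonuniform hopping and interaction.
t = [[0.0, 1.0, 0.3], [1.0, 0.0, 0.7], [0.3, 0.7, 0.0]]
U = [2.0, 0.5, 3.0]
H, spins = hubbard_and_spin(t, U)
worst = max(np.abs(H @ S - S @ H).max() for S in spins)
print(worst)  # largest commutator entry; expected to be at rounding level
```

Dense matrices of size $4^{|\Lambda|}$ are only practical for a handful of sites, which is enough for checking identities of this kind.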


Wave-Particle Dualism in the 
Hubbard Model 


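It is illuminating to examine the eigenstates of the
Hamiltonian [6] for the following two special cases.

First suppose that one has $U_x = 0$ for all $x \in \Lambda$,
that is, the model has no interactions. For
$j = 1, 2, \ldots, |\Lambda|$, let $\phi^{(j)} = (\phi^{(j)}_x)_{x \in \Lambda} \in \mathbb{C}^{|\Lambda|}$ be the
single-electron eigenstate, which is the solution of
the Schrödinger equation

$$-\sum_{y \in \Lambda} t_{x,y}\, \phi^{(j)}_y = \epsilon_j\, \phi^{(j)}_x \quad \text{for any } x \in \Lambda \qquad [7]$$

We order the energy eigenvalues as $\epsilon_j \leq \epsilon_{j+1}$. By
defining the corresponding creation operator by
$a^\dagger_{j,\sigma} = \sum_{x \in \Lambda} \phi^{(j)}_x c^\dagger_{x,\sigma}$, we see that, for any subsets
$I_\uparrow, I_\downarrow \subset \{1, 2, \ldots, |\Lambda|\}$ such that $|I_\uparrow| + |I_\downarrow| = N_e$, the
state

$$\Psi_{I_\uparrow, I_\downarrow} = \Big(\prod_{j \in I_\uparrow} a^\dagger_{j,\uparrow}\Big) \Big(\prod_{j \in I_\downarrow} a^\dagger_{j,\downarrow}\Big) \Phi_{vac} \qquad [8]$$

is an eigenstate of H (with $U_x = 0$) with the
eigenvalue $E = \sum_{j \in I_\uparrow} \epsilon_j + \sum_{j \in I_\downarrow} \epsilon_j$. The ground states
are obtained by choosing $I_\uparrow, I_\downarrow$ which minimize E. In
particular, when $N_e$ is even and the single-electron
eigenenergies $\epsilon_j$ are nondegenerate, the ground state
is unique and written as

$$\Phi_{GS} = \Big(\prod_{j=1}^{N_e/2} a^\dagger_{j,\uparrow} a^\dagger_{j,\downarrow}\Big) \Phi_{vac} \qquad [9]$$

The fact that this ground state has the minimum
possible spin $S_{tot} = 0$ is known as Pauli
paramagnetism.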
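The filling rule for the noninteracting model is easy to carry out explicitly. The sketch below assumes an open chain of four sites with nearest-neighbour hopping equal to 1 (an illustrative choice, not taken from the text) and fills the lowest $N_e/2$ single-electron levels with one up-spin and one down-spin electron each; the function name `free_ground_energy` is hypothetical.

```python
import numpy as np

def free_ground_energy(t, n_electrons):
    """Ground-state energy at U = 0 for an even electron number:
    each of the lowest n_electrons / 2 levels is occupied twice."""
    # Single-electron energies are the eigenvalues of the matrix (-t_{x,y}).
    levels = np.linalg.eigvalsh(-np.asarray(t, dtype=float))  # ascending order
    return 2.0 * levels[: n_electrons // 2].sum(), levels

N = 4
t = np.zeros((N, N))
for x in range(N - 1):
    t[x, x + 1] = t[x + 1, x] = 1.0      # open chain, unit hopping

energy, levels = free_ground_energy(t, 4)
print(levels, energy)
```

For this chain the levels are $-2\cos(k\pi/5)$, $k = 1, \ldots, 4$, so the half-filled ground-state energy is $-2\sqrt{5}$.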

We have seen that the Hamiltonian H with $U_x = 0$
can be diagonalized by using single-electron
eigenstates $\phi^{(j)}$. When $(t_{x,y})$ has a translation
invariance, each $\phi^{(j)}$ behaves as a "wave." We
can say that the noninteracting models can
be understood in terms of the wave picture of
electrons.

Next suppose that $t_{x,y} = 0$ for all $x, y \in \Lambda$, that is,
the electrons do not hop. Then the Hamiltonian [6]
is readily diagonalized in terms of the basis state [3],
where the corresponding eigenvalue is simply
$E = \sum_{x \in \Lambda_\uparrow \cap \Lambda_\downarrow} U_x$. In this case, the model is best
understood in the particle picture of electrons.

We thus see that the wave-particle dualism
manifests itself in the Hubbard model in an essential
manner. When both the first and the second terms in
the Hamiltonian [6] are present, there takes place a
"competition" between wave-like nature and particle-like
nature of electrons. The competition generates
rich nontrivial phenomena including antiferromagnetism,
ferromagnetism, metal-insulator transition, and
(probably) superconductivity. To investigate these
phenomena is a major motivation in the study of
the Hubbard model.


One-Dimensional Model 


The Hubbard model defined on a simple one-dimensional
lattice is easier to study. But it does
not exhibit truly nontrivial behavior as the following
classical theorem of Lieb and Mattis suggests.


Theorem 1 Consider a Hubbard model on a one-dimensional
lattice $\Lambda = \{1, 2, \ldots, N\}$ with open boundary
conditions. We assume that $t_{x,y} \neq 0$ if $|x - y| = 1$,
and $t_{x,y} = 0$ if $|x - y| > 1$. $t_{x,x} \in \mathbb{R}$ and $U_x \in \mathbb{R}$ are
arbitrary. Then one has $E_{min}(S_{tot}) < E_{min}(S_{tot} + 1)$ for
any $S_{tot} = 0, 1, \ldots, S_{max} - 1$ (or $S_{tot} = 1/2, 3/2, \ldots, S_{max} - 1$).


As a consequence, one finds that the ground states
always have $S_{tot} = 0$ (or $S_{tot} = 1/2$) as in the
noninteracting models.

The translation invariant model with $t_{x,y} = t$ if
$|x - y| = 1$, $t_{x,y} = 0$ if $|x - y| \neq 1$, and $U_x = U$ can be
solved by using the Bethe ansatz, as was first shown
by Lieb and Wu. It was found that the model is
insulating for all $U > 0$, and there is no metal-insulator
transition. (A metal-insulator transition is
expected to take place in higher dimensions.) Earlier
works on the Bethe ansatz were based on the
assumption that the Bethe ansatz equation gives
the true ground states. Recently, the existence and
the uniqueness of the Bethe ansatz solution for the
ground state of a finite system was proved by
Goldbaum.


Half-Filled Systems 


The system in which the electron number $N_e$ is
identical to the number of sites $|\Lambda|$ is said to be
half-filled. Many (but not all) physical systems can
be modeled as a half-filled Hubbard model.

Based on a heuristic perturbation theory, low-energy 
properties of half-filled models with large U are 
expected to be similar to those of Heisenberg anti- 
ferromagnetic spin systems. There is no electrical 
conduction, and the spin degrees of freedom may 
show antiferromagnetic long-range order in the ground 
states. 

This expectation is partly justified by the following
theorem due to Lieb. A Hubbard model is said
to be bipartite if the lattice $\Lambda$ can be decomposed
into a disjoint union of two sublattices as $\Lambda = A \cup B$
(with $A \cap B = \emptyset$), and it holds that $t_{x,y} = 0$ whenever
$x, y \in A$ or $x, y \in B$. In other words, only hopping
between different sublattices is allowed.


Theorem 2 Consider a bipartite Hubbard model. We
assume $|\Lambda|$ is even, and the whole $\Lambda$ is connected
through nonvanishing $t_{x,y}$. We also assume $U_x = U > 0$
for any $x \in \Lambda$. Then the ground states of the model
are nondegenerate apart from the trivial spin
degeneracy, and have total spin $S_{tot} = \big||A| - |B|\big|/2$.
It also holds that $E_{min}(S_{tot}) < E_{min}(S_{tot} + 1)$ for any
$S_{tot} \geq \big||A| - |B|\big|/2$.


The theorem implies that, as far as the total spin is 
concerned, the half-filled Hubbard model behaves 
exactly as the Heisenberg antiferromagnet. But the 
existence of antiferromagnetic ordering has not been 
proved in any version of the Hubbard model. 

To see another implication of Theorem 2, take the
so-called CuO lattice in Figure 2. Here the A and B
sublattices consist of black and white sites, respectively.
One has $|A| = |\Lambda|/3$ and $|B| = 2|\Lambda|/3$. Then
the theorem implies that the ground state of the
corresponding Hubbard model has total spin
$S_{tot} = \big||A| - |B|\big|/2 = |\Lambda|/6$. Since the total spin magnetic
moment of the system is proportional to the
number of sites $|\Lambda|$, we conclude that the model
exhibits ferrimagnetism, a weaker version of
ferromagnetism.

Figure 2 An example (the so-called CuO lattice) of a bipartite
lattice in which the numbers of sites in two sublattices are
different. Lieb's theorem implies that the half-filled Hubbard
model defined on this lattice exhibits ferrimagnetism.
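Lieb's theorem can be illustrated by exact diagonalization on a very small bipartite lattice. The sketch below makes several assumptions that are not in the text: it uses a four-site "star" (one central site forming sublattice B, three outer sites forming sublattice A) rather than the CuO lattice, a Jordan-Wigner representation of the fermion operators, and the illustrative values t = 1, U = 4. The function names are hypothetical. With $|A| = 3$ and $|B| = 1$, the theorem predicts $S_{tot} = 1$ at half filling.

```python
import numpy as np
from functools import reduce

def annihilators(n_modes):
    """Jordan-Wigner matrices for n_modes fermionic annihilation operators."""
    identity = np.eye(2)
    parity = np.diag([1.0, -1.0])
    lower = np.array([[0.0, 1.0], [0.0, 0.0]])
    return [reduce(np.kron, [parity] * k + [lower] + [identity] * (n_modes - k - 1))
            for k in range(n_modes)]

def ground_state_spin(bonds, n_sites, n_electrons, t=1.0, U=4.0):
    """Return (S_tot, degeneracy) of the ground states in the sector
    with n_electrons electrons, by dense exact diagonalization."""
    c = annihilators(2 * n_sites)
    up = [c[2 * x] for x in range(n_sites)]
    dn = [c[2 * x + 1] for x in range(n_sites)]
    dim = 4 ** n_sites
    H = np.zeros((dim, dim))
    for x, y in bonds:                       # each bond listed once
        for ops in (up, dn):
            hop = ops[x].T @ ops[y]
            H -= t * (hop + hop.T)           # hopping in both directions
    for x in range(n_sites):
        H += U * (up[x].T @ up[x]) @ (dn[x].T @ dn[x])
    # Total spin: S^2 = Sz^2 + (S+ S- + S- S+) / 2, all real matrices.
    Sz = 0.5 * sum(up[x].T @ up[x] - dn[x].T @ dn[x] for x in range(n_sites))
    Sp = sum(up[x].T @ dn[x] for x in range(n_sites))
    S2 = Sz @ Sz + 0.5 * (Sp @ Sp.T + Sp.T @ Sp)
    # Basis states are bit strings, so the electron number is a popcount.
    sector = [i for i in range(dim) if bin(i).count("1") == n_electrons]
    block = np.ix_(sector, sector)
    energies, vectors = np.linalg.eigh(H[block])
    ground = vectors[:, np.abs(energies - energies[0]) < 1e-8]
    s2 = np.mean(np.sum(ground * (S2[block] @ ground), axis=0))
    spin = 0.5 * (np.sqrt(1.0 + 4.0 * s2) - 1.0)   # solve S(S+1) = s2
    return spin, ground.shape[1]

star = [(0, 1), (0, 2), (0, 3)]              # site 0 is B, sites 1-3 are A
spin, degeneracy = ground_state_spin(star, n_sites=4, n_electrons=4)
print(spin, degeneracy)
```

The degeneracy returned is the trivial $2S_{tot} + 1$ spin degeneracy. Running the same function on the open chain `[(0, 1), (1, 2), (2, 3)]`, where $|A| = |B|$, should give a unique singlet ground state.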

Another interesting result for the half-filled
models is the following uniform density theorem
by Lieb, Loss, and McCann.


Theorem 3 Consider a bipartite Hubbard model.
$t_{x,y} \in \mathbb{R}$, $U_x \in \mathbb{R}$ are arbitrary. Suppose that
the ground states are n-fold degenerate, and let
$\Phi^{(j)}_{GS}$ ($j = 1, \ldots, n$) be mutually orthogonal normalized
ground states. Define the correlation function
by $\rho(x,y) = n^{-1} \sum_{j=1}^{n} \sum_{\sigma = \uparrow, \downarrow} (\Phi^{(j)}_{GS}, c^\dagger_{x,\sigma} c_{y,\sigma} \Phi^{(j)}_{GS})$.
($(\cdot,\cdot)$ is the inner product.) Then for any $x, y \in A$ or
$x, y \in B$, one has $\rho(x,y) = \delta_{x,y}$.


It is interesting that the density $\rho(x,x)$ in the
ground state is always unity though the hopping
matrix and interactions can be highly nonuniform.


Ferromagnetism 


Ferromagnetism is an interesting phenomenon in 
which the majority of the spins in the system align in 
the same direction. One of the original motivations 
to study the Hubbard model was to understand the 
origin of ferromagnetism in an idealized situation. 
Let us recall that neither the hopping term nor the 
interaction term in the Hamiltonian [6] favors 
ferromagnetism (or any other magnetic order). One 
must deal with the interplay between the two terms 
to have ferromagnetism. Here we review three
rigorous examples of saturated ferromagnetism in
the Hubbard model. Saturated ferromagnetism is the
strongest form of ferromagnetism where the ground
state has $S_{tot} = S_{max}$.

The first example is due to Nagaoka and 
Thouless. 


Theorem 4 Take an arbitrary finite lattice $\Lambda$, and
let $N_e = |\Lambda| - 1$. Assume that $t_{x,y} \leq 0$ for any $x \neq y$,
and let $U_x \to \infty$ for all $x \in \Lambda$. (Taking the limit
$U_x \to \infty$ is equivalent to inhibiting x from being
occupied by two electrons.) Then among the ground
states of the model, there exist states with total spin
$S_{tot} = S_{max}$ ($= N_e/2$). If the system further satisfies the
connectivity condition (see below), then the ground
states have $S_{tot} = S_{max}$ ($= N_e/2$) and are nondegenerate
apart from the trivial spin degeneracy.


The connectivity condition is a simple condition
which holds in most of the lattices in two or higher
dimensions, including the square lattice, the triangular
lattice, or the cubic lattice. To be precise the
condition requires that "by starting from any
electron configuration on the lattice and by moving
around the hole along nonvanishing $t_{x,y}$, one can get
any other electron configuration."

Figure 3 The Hubbard model on the kagomé lattice is a typical
example which exhibits flat-band ferromagnetism.

The requirements that $U_x \to \infty$ and $N_e = |\Lambda| - 1$
are indeed rather pathological. We still do not know
if the ferromagnetism extends to more realistic
situations. Heuristic studies indicate that the issue
is highly delicate.

A completely different class of rigorous examples
of ferromagnetism was found by Mielke. Take, for
example, the kagomé lattice of Figure 3, and define
a Hubbard model by setting $t_{x,y} = t < 0$ when x and
y are neighboring, $t_{x,y} = 0$ otherwise, and $U_x = U > 0$
for any $x \in \Lambda$. Then the corresponding single-electron
Schrödinger equation [7] has a peculiar feature that
its ground states are $\{(|\Lambda|/3) + 1\}$-fold degenerate.
This huge degeneracy corresponds to the fact that the
lowest-energy band of the model is completely
dispersionless (or flat).


Theorem 5 Consider the Hubbard model on the
kagomé lattice with $N_e = (|\Lambda|/3) + 1$. For any $U > 0$,
the ground states have $S_{tot} = S_{max}$ ($= N_e/2$) and are
nondegenerate apart from the trivial spin degeneracy.


There are similar examples in higher dimensions. 
Ferromagnetism observed in these models is called 
flat-band ferromagnetism. 

The above examples of ferromagnetism have
either singular interaction ($U_x \to \infty$) or singular
dispersion relation (highly degenerate single-electron
ground states). Tasaki found a class of Hubbard
models which are free from such singularities, and
exhibit ferromagnetism.

For simplicity, we concentrate on the simplest
model in one dimension. There are similar examples
in higher dimensions. Take the one-dimensional
lattice $\Lambda = \{1, 2, \ldots, N\}$ with N sites (where N is an
even integer), and impose a periodic boundary
condition by identifying the site N + 1 with the site 1.
The hopping matrix is defined by setting $t_{x,x+1} = t_{x+1,x} = -t'$
for any $x \in \Lambda$, $t_{x,x+2} = t_{x+2,x} = -t$ for
even x, $t_{x,x+2} = t_{x+2,x} = s$ for odd x, and $t_{x,y} = 0$
otherwise. Here $t > 0$ and $s > 0$ are independent
parameters, but the parameter $t'$ is determined as
$t' = \sqrt{2}(t + s)$.

Figure 4 An example of nonsingular Hubbard model which
exhibits saturated ferromagnetism.

As can be seen from Figure 4, electrons are
allowed to hop to next-nearest neighbors. Thus,
Theorem 1 does not apply. The single-electron
ground states are not degenerate unless s = 0. We
set $U_x = U > 0$ for any $x \in \Lambda$, and fix the electron
number as $N_e = N/2$. In terms of filling factor, this
corresponds to the quarter filling.


Theorem 6 Suppose that the two dimensionless
parameters t/s and U/s are sufficiently large. Then
the ground states have $S_{tot} = S_{max}$ ($= N/4$) and are
nondegenerate apart from the trivial spin degeneracy.


The theorem is valid, for example, when t/s > 4.5
if U/s = 50, and t/s > 2.6 if U/s = 100. It is crucial
that the statement of the theorem is valid only when
the interaction U is sufficiently large. In the same
model, it is also proved that the low-lying excitation
above the ground state has a normal dispersion
relation of a spin-wave excitation.

We would like to point out that one can learn
more details about the Hubbard model and further
rigorous results from the review articles (Lieb 1995,
Tasaki 1998a, Tasaki 1998b). One can also find
references for most of the results discussed here in
these review articles, especially in Lieb (1995).

As for the latest results which are not included in 
the above reviews, see recent publications, for 
example, Lieb and Wu (2003), Tasaki (2003), and 
Goldbaum (2005), and references therein.


Introduction


Billiards are a class of dynamical systems with an
appealingly simple description. A point particle
moves with constant velocity in a box of arbitrary
dimension (the billiard table) and reflects elastically
from the boundary (the component of velocity
perpendicular to the boundary is reversed and the
parallel component is preserved). Mathematically,
it is a class of Hamiltonian systems with collisions
defined by symplectic maps on the boundary of the
phase space. The billiard dynamics defines a one-parameter
group of maps $\Phi^t$ of the phase space
which preserve the Lebesgue measure, and are in
general only measurable due to discontinuities. The
boundaries of the box are made up of pieces,
concave, convex, and flat. Discontinuities occur at
the orbits tangent to concave pieces of the
boundary of the box. The orbits hitting two
adjacent pieces ("corners") cannot be naturally
continued, which is another source of discontinuities.
These singularities are not too severe so that
the flow has well-defined Lyapunov exponents and
Pesin structural theory is applicable (Katok and
Strelcyn 1986). A billiard system is called hyperbolic
if it has nonzero Lyapunov exponents on a
subset of positive Lebesgue measure, and completely
hyperbolic if all of its Lyapunov exponents are
nonzero almost everywhere, except for one zero
exponent in the direction of the flow.

Billiards in smooth strictly convex domains have 
no singularities, but no such examples are known to 
be hyperbolic. 

In general, billiards exhibit mixed behavior just 
like other Hamiltonian systems; there are invariant 
tori intertwined with “chaotic sea.” In hyperbolic 
billiards, stable behavior is excluded by the choice of 
the pieces in the boundary of the box, arbitrary 
concave pieces and special convex ones, and their 
particular placement. Thus, hyperbolicity is achieved 
by design, as in optical instruments. 

It was established by Turaev and Rom-Kedar 
(1998) that complete hyperbolicity may be lost 
under generic singular perturbation of the billiard 
system to a smooth Hamiltonian system. 

Hyperbolicity is the universal mechanism for 
random behavior in deterministic dynamical sys- 
tems. Under suitable additional assumptions, it leads 
to ergodicity, mixing, K-property, Bernoulli prop- 
erty, decay of correlations, central-limit theorem, 
and other stochastic properties. Hyperbolic billiards 
provide a natural class of examples for which these 
properties were studied. In this article we restrict 
ourselves to hyperbolicity itself. 

The most prominent example of a hyperbolic 
billiard is the gas of hard spheres. This way of 
looking at the system was developed in the 
groundbreaking papers of Sinai (see Chernov and 
Sinai (1987) for an exhaustive list of references). 
The collection of papers (Szász 2000) contains 
more up-to-date information. Another source on 
hyperbolic billiards is the book by Chernov and 
Markarian (2005). The books by Kozlov and 
Treschev (1990), and by Tabachnikov (1995) 
provide broad surveys of billiards from different 
perspectives. 


Jacobi Fields and Monotonicity 


The key to understanding hyperbolicity in billiards
lies in two essentially equivalent descriptions of
infinitesimal families of trajectories. The basic
notion is that of a Jacobi field along a billiard
trajectory. Let $\gamma(t,u)$ be a family of billiard
trajectories, where t is time and u is a parameter,
$|u| < \epsilon$. A Jacobi field along $\gamma(t,0)$ is defined by
$J(t) = \partial\gamma/\partial u\,|_{u=0}$.

Jacobi fields form a finite-dimensional vector
space which can be identified with the tangent to
the phase space at points along the trajectory. They
contain the same information as the derivatives of
the billiard flow $D\Phi^t$. In particular, the Lyapunov
exponents are the exponential rates of growth of
Jacobi fields.

Jacobi fields split naturally into parallel and
perpendicular components to the trajectory, each of
them a Jacobi field in its own right. The parallel
Jacobi field carries the zero Lyapunov exponent. In
the rest we discuss only the perpendicular Jacobi
fields. Between collisions the Jacobi fields satisfy the
differential equation $J'' = 0$, hence $J(t) = J(0) + tJ'(0)$.
At a collision a Jacobi field undergoes a change by
the map

$$J(t_c^+) = R\,J(t_c^-)$$

$$J'(t_c^+) = R\,J'(t_c^-) + P^*KP\,J(t_c^+) \qquad [1]$$

where $J(t_c^-)$ and $J(t_c^+)$ are Jacobi fields immediately
before and after collision, K is the shape operator of
the piece of the boundary ($K = \nabla n$, where n is the inside
unit normal to the boundary), and P is the
projection along the velocity vector from the hyperplane
perpendicular to the orbit to the hyperplane
tangent to the boundary. Finally, R is the orthogonal
reflection in the hyperplane tangent to the
boundary.

Perpendicular Jacobi fields at a point of a
trajectory can be identified with a subspace of the
tangent to the phase space, the subspace perpendicular
to the phase trajectory. To measure the
growth/decay of Jacobi fields, we introduce a
quadratic form on the tangent spaces, or equivalently
on Jacobi fields, $Q(J, J') = \langle J, J' \rangle$. Evaluation
of Q on a Jacobi field is a function of time Q(t).
Between collisions we have $Q(t_2) \geq Q(t_1)$ for $t_2 \geq t_1$
(monotonicity). By [1] the monotonicity at the
collisions, that is, $Q(t_c^+) \geq Q(t_c^-)$, is equivalent to
the positive semidefiniteness of the shape operator,
$K \geq 0$; it holds for concave pieces of the boundary.
If $K > 0$ at a point of collision with the boundary,
then for $(J, J') \neq (0,0)$, we have $Q(t_2) > Q(t_1)$ (strict
monotonicity), assuming that the collision occurred
between time $t_1$ and $t_2$.

In billiards with concave pieces of the boundary,
where $K \geq 0$, $K \not> 0$, strict monotonicity may still
occur after sufficiently many reflections (eventual
strict monotonicity, or ESM). Such billiards are
called semidispersing, and the gas of hard spheres is
an example.

The role of monotonicity is revealed in the
following:


Theorem 1 (Wojtkowski 1991). If a system is
eventually strictly monotone (ESM), except on a set
of orbits of zero measure, then it is completely
hyperbolic.


The theorem applies to billiard systems. It can be 
generalized and applied to other systems, not even 
Hamiltonian (see Wojtkowski (2001) for precise 
formulations, references and the history of this 
idea). 

The difficulty in applying the above theorem to 
the gas of hard spheres lies in the gap between 
monotonicity and strict monotonicity. There are 
many orbits on which strict monotonicity is never 
attained (parabolic orbits). Establishing that the 
family of parabolic orbits has measure zero (or 
better yet codimension 2) is a formidable task. It 
was brought to conclusion in the work of Simanyi 
(2002). 


Wave Fronts and Monotonicity 


There is a geometric formulation of monotonicity
(which historically preceded the one given above).
Let us consider a local wave front, that is, a local
hypersurface W(0) perpendicular to a trajectory $\gamma(t)$
at t = 0. Let us consider further all billiard trajectories
perpendicular to W(0). The points on these
trajectories at time t form a local hypersurface W(t)
perpendicular again to the trajectory (warning: at
exceptional moments of time the wave front W(t)
may be singular). Infinitesimally wave fronts are
described by the shape operator $U = \nabla n$, where n is
the unit normal field. U is a symmetric operator on
the hyperplane tangent to the wave front (and
perpendicular to the trajectory $\gamma(t)$). The evolution
of infinitesimal wave fronts is described by the
formulas

$$U(t) = \big(tI + U(0)^{-1}\big)^{-1} \quad \text{without collisions}$$

$$U(t_c^+) = R\,U(t_c^-)\,R + P^*KP \quad \text{at a collision} \qquad [2]$$


It follows that between collisions a wave front 
that is initially convex (i.e., diverging, or U > 0) will 
stay convex. Moreover, any wave front after a 
sufficiently long run without collisions will become 
convex (after which the normal curvatures of the 
wave front will be decaying). The second part of [2] 
shows that after a reflection in a strictly concave 
boundary a convex wave front becomes strictly 
convex (and its normal curvatures increase). These 
properties are equivalent to (strict) monotonicity as 
formulated above. Indeed, in the language of Jacobi 
fields an infinitesimal wave front represents a linear 
subspace in the space of perpendicular Jacobi fields, 
that is, the tangent space. (Furthermore, it is a 
Lagrangian subspace with respect to the standard 
symplectic form.) We can follow individual Jacobi 
fields or whole subspaces of them. This explains the 
parallel between [1] and [2]. The form Q allows the 
introduction of positive and negative Jacobi fields 
and positive and negative Lagrangian subspaces. An 
infinitesimal convex wave front represents a positive 
Lagrangian subspace. Monotonicity is equivalent to 
the property that a positive Lagrangian subspace 
stays positive under the dynamics. (It may appear 
that there is a loss of information in formulas [2] 
compared to [1], but actually they are equivalent 
due to the symplectic nature of the dynamics; 
see Wojtkowski 2001.) 
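
The collision-free part of the wave-front evolution can be sketched numerically. The following is a minimal illustration, assuming the planar case, in which the shape operator reduces to a single signed curvature and the free-flight rule U(t) = (tI + U(0)^{-1})^{-1} becomes kappa(t) = kappa(0) / (1 + t kappa(0)). The function name and the sample values are illustrative assumptions, not part of the cited literature.

```python
import math


def free_flight_curvature(kappa0, t):
    """Curvature of a planar wave front after a free flight of length t.

    Scalar version of U(t) = (t I + U(0)^{-1})^{-1}, rewritten as
    kappa0 / (1 + t * kappa0) so that a flat front (kappa0 = 0) is allowed.
    Returns math.inf at the focusing time t = -1 / kappa0 of a converging front.
    """
    denom = 1.0 + t * kappa0
    if denom == 0.0:
        return math.inf  # the front is singular (focused) at this moment
    return kappa0 / denom


if __name__ == "__main__":
    # A diverging front stays diverging and flattens out.
    print([free_flight_curvature(2.0, t) for t in (0.0, 1.0, 10.0)])
    # A converging front focuses at t = 1 and is diverging afterwards.
    print([free_flight_curvature(-1.0, t) for t in (0.5, 1.0, 2.0)])
```

In this sketch a positive curvature stays positive and decays, while a negative one passes through a singular moment and then becomes positive, which matches the two statements made above about runs without collisions.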


Design of Hyperbolic Billiards 


In view of [2] it seems that a convex piece in the 
boundary (K < 0) excludes monotonicity. There are 
two ways around this obstacle. First, we could 
change the quadratic form Q at the convex 
boundary. Second, we can treat convex pieces as 
“black boxes” and look only at incoming and 
outgoing trajectories. Although the second strategy 
seems more restrictive, all the examples constructed 
to date fit the black box scenario, and we will 
present it in more detail. 

To understand this approach, let us consider a 
billiard table with flat pieces of the boundary and 
exactly one convex piece. A trajectory in such a 
billiard experiences visits to the convex piece 
separated by arbitrarily long sequences of reflections 
in flat pieces, which do not affect the geometry of a 
wave front at all. Hence, whatever the geometry 
of a wave front emerging from the curved piece, it 
will become convex and very flat by the time it 
comes back to the curved piece of the boundary 
again. Hence, it follows, at least heuristically, that 
we must study the complete passage through the 
convex piece of the boundary, regarding its effect on 
convex, and especially flat, wave fronts. 

An important difference between convex and concave 
pieces is that a trajectory usually has several 
consecutive reflections in the same convex piece; 
moreover, the number of such reflections is 
unbounded. A finite billiard trajectory is called 
“complete” if it contains reflections only in one and the 
same piece of the boundary, and it is preceded and 
followed by reflections in other pieces. 


Definition A complete trajectory is (strictly) 
z-monotone if for every nonzero Jacobi field the 
value of the form Q does not decrease (increases) 
between the point at the distance z before the first 
reflection and the point at the distance z after the 
last reflection. 

A complete trajectory is parabolic if there is a 
nonzero Jacobi field J such that J' vanishes before 
the first and after the last reflection. 


In the language of wave fronts, a complete 
trajectory is z-monotone if every diverging wave 
front at a distance at least z from the first reflection 
becomes diverging after the last reflection at the 
distance z, or earlier. 

It turns out that the only obstruction to mono- 
tonicity of complete trajectories is parabolicity. 
More precisely, if a complete trajectory is not 
parabolic then it is z-monotone for some z > 0. 

It follows from Theorem 1 that we get a 
completely hyperbolic billiard if we put together 
curved pieces with no complete parabolic trajec- 
tories and some flat pieces, in such a way that for 
every two consecutive complete trajectories, being 
$z_1$- and $z_2$-monotone, respectively, the distance from 
the last reflection in the first trajectory to the first 
reflection in the second one is bigger than $z_1 + z_2$. 
Indeed, we can put together the midpoints of 
trajectories leaving one curved piece and hitting 
another one into the Poincaré section of the billiard 
flow and we obtain immediately ESM for the return 
map. 

We can formulate somewhat informally two 
principles for the design of hyperbolic billiards. 


1. No parabolic trajectories Convex pieces of the 
boundary cannot have complete parabolic 
trajectories. 

2. Separation There must be enough separation (in 
space or in time through reflections in flat pieces) 
between strictly z-monotone trajectories accord- 
ing to the values of z. 

All of the examples of hyperbolic billiards 
constructed up to now are designed according to 
these principles. 


Hyperbolic Billiards in Dimension 2 


Checking the absence of parabolic trajectories is 
nontrivial due to the unbounded number of reflec- 
tions in complete trajectories close to tangency. It 
has been accomplished so far only in integrable, or near 
integrable examples, with the exception of convex 
scattering pieces described in the following. 
Billiards in dimension 2 are understood best. First 
of all, there is yet another way of describing 
infinitesimal families of nearby trajectories. Every 
infinitesimal family of rays in the plane has a point 
of focusing (in linear approximation), possibly at 
infinity. This point of focusing contains the same 
information as the curvature of a wave front (it is 
the center of curvature, rather than curvature itself) 
and it has the advantage that it does not change 
between collisions. The focusing points before and 
after a reflection are related by the familiar mirror 
equation of the geometric optics: 


$-\frac{1}{f_0} + \frac{1}{f_1} = \frac{2}{d}$ 

where $f_0$, $f_1$ are the signed distances of the points of 
focusing to the reflection point, $d = r\cos\theta$, r being 
the radius of curvature of the boundary piece (r > 0 
for a strictly convex piece), and $\theta$ the angle of 
incidence. The mirror equation is just the 
two-dimensional version of [2]. 

It is instructive to consider an arc of a circle. A 
billiard in a disk is integrable due to its rotational 
symmetry. Let J be a Jacobi field obtained by 
rotation of a trajectory. This family of trajectories 
(“the rotational family”) is focused exactly in the 
middle between two consecutive reflections (that is 
where J vanishes). It follows further from the mirror 
equation that a parallel family of orbits is focused at 
a distance d/2 after the reflection, and any family 
focusing somewhere between the parallel family and 
the rotational family will focus at a distance some- 
where between d/2 and d, not only after the first 
reflection, but also after an arbitrarily long sequence of 
reflections. 
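
The focusing behaviour in a circular arc can be checked with a short computation. This is a sketch under a stated assumption: the mirror equation is taken in the form -1/f0 + 1/f1 = 2/d, with f0 negative when the incoming focusing point lies before the reflection point, which is the sign convention that reproduces the distances d/2 and d quoted above. The function names are illustrative.

```python
import math


def focus_after_reflection(f0, d):
    """Signed focusing distance f1 after a reflection, from -1/f0 + 1/f1 = 2/d.

    f0 is the signed distance of the incoming focusing point to the reflection
    point (negative if it lies before the reflection, math.inf for a parallel
    family). f0 must be nonzero.
    """
    inv_f1 = 2.0 / d + 1.0 / f0
    return math.inf if inv_f1 == 0.0 else 1.0 / inv_f1


def focusing_distances_in_circle(f0, d, reflections):
    """Focusing distances after successive reflections in a circular arc.

    In a circle every segment has length 2 * d, so a family focused at
    distance f1 after one reflection is focused at signed distance
    f1 - 2 * d relative to the next reflection point.
    """
    distances = []
    for _ in range(reflections):
        f1 = focus_after_reflection(f0, d)
        distances.append(f1)
        f0 = f1 - 2.0 * d
    return distances


if __name__ == "__main__":
    # Parallel incoming family, d = 1: distances stay between d/2 and d.
    print(focusing_distances_in_circle(math.inf, 1.0, 5))
```

Starting from the rotational family (f0 = -d) the distance stays at d, and starting from the parallel family it starts at d/2 and remains below d, in line with the claim about arbitrarily long sequences of reflections.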

Hence, any complete trajectory in an arc of a 
circle is z-monotone, where 2z is the length of a 
single segment of the trajectory and strictly 
z'-monotone for any z' >z. Two arcs of a circle 
separated by parallel segments form the stadium of 
Bunimovich (1979). 

Lazutkin (1973) showed that billiards in smooth 
strictly convex domains are near integrable near the 
boundary. Donnay (1991) applied Lazutkin’s 
coordinates to establish that for an arbitrary strictly 
convex arc the situation near the boundary is similar 
to that in a circle, that is, complete trajectories near 
tangency are z-monotone, where z is of the order of 
the length of a single segment. In particular, no near 
tangent complete trajectory can be parabolic. Hence, 
this crucial calculation shows that if a strictly 
convex arc has no parabolic trajectories then any 
sufficiently small perturbation also has no parabolic 
trajectories. It follows further that any sufficiently 
small piece of a given strictly convex arc has no 
parabolic trajectories. 

It turns out that in dimension 2, complete 
parabolic trajectories are also z-monotone for some 
z > 0 (but clearly not strictly monotone) 
(Wojtkowski 2005). However, they are still an 
obstacle to complete hyperbolicity because in general 
nearby complete trajectories are z-monotone without 
a bound for the values of z, so that no separation of 
convex pieces is sufficient. 

Integrability of the elliptic billiard allows one to 
establish strict monotonicity of trajectories in the 
semi-ellipse with endpoints on the longer axis 
(Wojtkowski 1986). Donnay (1991) showed that 
the semi-ellipse with endpoints on the shorter 
axis also has no parabolic trajectories provided that the 
eccentricity is less than $\sqrt{2}/2$. As the eccentricity 
goes to $\sqrt{2}/2$ the separation required to produce a 
hyperbolic billiard goes to infinity. Markarian et al. 
(1996) obtained explicitly the separation of the 
elliptic pieces needed for hyperbolicity, when the 
eccentricity is smaller than $\sqrt{2 - \sqrt{2}}/2$. 

It follows from the mirror equation that a 
trajectory with one reflection in a convex piece is 
always strictly z-monotone for z > d. Hence, if for 
any two consecutive reflections in convex pieces with 
respective values of d equal to $d_1$ and $d_2$, the distance 
between reflections exceeds $d_1 + d_2$, then the billiard 
is completely hyperbolic. For one convex piece this 
condition, called convex scattering, turns out to be 
equivalent to $d^2 r/ds^2 < 0$, where s is the arc length 
(Wojtkowski 1986). This leads to examples of 
hyperbolic billiards with one convex piece of the 
boundary, like the domain bounded by the cardioid. 
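
The convex scattering condition can be tested numerically on the cardioid. The sketch below is an illustration under assumptions: the cardioid is taken in the polar form rho = a(1 + cos phi), the radius of curvature comes from the standard polar-coordinate formula, and the arc length uses the closed form 4a sin(phi/2) measured from phi = 0. The function names are hypothetical, and a sampled check of this kind is evidence, not a proof.

```python
import math


def cardioid_radius_of_curvature(phi, a=1.0):
    """Radius of curvature of the cardioid rho = a (1 + cos phi)."""
    rho = a * (1.0 + math.cos(phi))
    d_rho = -a * math.sin(phi)
    dd_rho = -a * math.cos(phi)
    return (rho**2 + d_rho**2) ** 1.5 / (rho**2 + 2.0 * d_rho**2 - rho * dd_rho)


def cardioid_arc_length(phi, a=1.0):
    """Signed arc length measured from phi = 0, valid for -pi < phi < pi."""
    return 4.0 * a * math.sin(phi / 2.0)


def second_divided_difference(p0, p1, p2):
    """Discrete second derivative of r with respect to s from three (s, r) samples."""
    (s0, r0), (s1, r1), (s2, r2) = p0, p1, p2
    return 2.0 * ((r2 - r1) / (s2 - s1) - (r1 - r0) / (s1 - s0)) / (s2 - s0)


def is_convex_scattering_sample(phis, a=1.0):
    """True if the discrete d^2 r / ds^2 is negative at every interior sample."""
    pts = [(cardioid_arc_length(p, a), cardioid_radius_of_curvature(p, a)) for p in phis]
    return all(
        second_divided_difference(pts[i], pts[i + 1], pts[i + 2]) < 0.0
        for i in range(len(pts) - 2)
    )


if __name__ == "__main__":
    samples = [-3.0 + 0.1 * k for k in range(61)]  # stays away from the cusp at phi = pi
    print(is_convex_scattering_sample(samples))
```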

Also, any complete trajectory in a convex scatter- 
ing piece is strictly z-monotone for z bigger than the 
maximum of the values of d for the first and the last 
segment of the trajectory. This makes it easy to find 
the explicit separation of convex scattering pieces 
guaranteeing hyperbolicity. 


Hyperbolic Billiards in Higher Dimensions 


In higher dimensions, only two constructions of 
hyperbolic billiards with convex pieces in the 
boundary are known. The first construction, by 
Bunimovich (1988), involves a piece of a sphere 
whose angular size, as seen from the center, does not 
exceed $\pi/2$ (Wojtkowski 1990, 2005, Bunimovich 
and Rehacek 1998). The second construction, by 
Papenbrock (2000), uses two cylinders at 90° with 
respect to each other to destroy integrability 
(Wojtkowski 2005). In both cases, the successful 
treatment is based on integrability of the billiard 
systems bounded by a sphere or a cylinder. 

In both of these constructions, trajectories need to 
be cut into strictly monotone pieces of unbounded 
lengths. In the case of spherical caps, complete 
trajectories are z-monotone with unbounded value 
of z and the geometry of the billiard table is used to 
separate them in time by sufficiently many reflec- 
tions in flat pieces of the boundary (Wojtkowski 
2005). In the case of cylinders, trajectories are cut 
by consecutive returns to a Poincaré section in the 
middle of the billiard table. 


Soft Billiards 


The same ideas of monotonicity and strict mono- 
tonicity are applicable to soft billiards, where 
specular reflections are replaced by scatterers in 
which the point particle is subjected to the action of 
a spherically symmetric potential. As in ordinary 
billiards, we compare the wave fronts along trajec- 
tories before entering and after leaving scatterers. 
Again, in the absence of parabolic trajectories 
sufficient separation of the scatterers produces a 
completely hyperbolic system. 

The conditions on the potential that guarantee the 
absence of parabolic trajectories were obtained by 
Donnay and Liverani (1991) in the two-dimensional 
case and by Bálint and Tóth (2006) in higher 
dimensions. The complete integrability of the 
motion of a point particle in a spherically symmetric 
potential is crucial in the derivation of these 
conditions (Wojtkowski 2005). 


See also: Billiards in Bounded Convex Domains; Ergodic 
Theory; Hamiltonian Systems: Stability and Instability 
Theory; Hyperbolic Dynamical Systems; Polygonal 
Billiards; Random Matrix Theory in Physics. 
Introduction 
Division of Smooth Dynamical Systems 


Linear maps can be elliptic (complex diagonalizable 
with all eigenvalues on the unit circle), parabolic (all 
eigenvalues on the unit circle but some Jordan blocks 
of size at least 2), or hyperbolic (no eigenvalues on the 
unit circle), and for differentiable dynamical systems, 
that is, smooth maps or flows, one can roughly make 
an analogous subdivision (see Hasselblatt and Katok 
2002, p. 100f). The linear maps not covered by these 
alternatives are those with some eigenvalues on the 
unit circle and others off it; the corresponding class of 
“partially hyperbolic” dynamical systems is usually 
considered in the context of hyperbolic dynamical 
systems with a view to studying phenomena wherein 
the hyperbolic behavior dominates. Thus, elliptic 
dynamical systems are more or less similar to 
isometries, with orbit separation constant or at most 
oscillatory but without persistent growth. KAM 
theory deals with elliptic systems, establishing that 
much of the ellipticity in an integrable Hamiltonian 
system persists under perturbation. Parabolic systems 
may have polynomial orbit separation produced by a 
local “shear” phenomenon; billiards in polygonal 
domains are an example of this. Hyperbolic dynamical 
systems are characterized by exponential divergence of 
orbits. They are of interest because of the complexity 
of their orbit structure with respect to both topological 
and statistical behavior. 
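
For 2 x 2 matrices of determinant 1 the linear trichotomy can be read off from the trace. The following is a minimal sketch, assuming exact (for example integer) entries so that the equality tests are meaningful; the function name is illustrative.

```python
def classify_sl2(matrix):
    """Classify a 2x2 matrix with determinant 1 as elliptic, parabolic or hyperbolic.

    With determinant 1 the eigenvalues are t and 1/t, so:
      |trace| > 2  -> real eigenvalues off the unit circle (hyperbolic),
      |trace| < 2  -> complex conjugate pair on the unit circle (elliptic),
      |trace| = 2  -> double eigenvalue +1 or -1; a Jordan block (parabolic)
                      unless the matrix is +I or -I (elliptic).
    """
    (a, b), (c, d) = matrix
    if a * d - b * c != 1:
        raise ValueError("expected a matrix of determinant 1")
    trace = a + d
    if abs(trace) > 2:
        return "hyperbolic"
    if abs(trace) < 2:
        return "elliptic"
    sign = 1 if trace > 0 else -1
    is_scalar = a == sign and d == sign and b == 0 and c == 0
    return "elliptic" if is_scalar else "parabolic"


if __name__ == "__main__":
    print(classify_sl2(((2, 1), (1, 1))))   # stretching map
    print(classify_sl2(((1, 1), (0, 1))))   # shear
    print(classify_sl2(((0, -1), (1, 0))))  # rotation by a quarter turn
```

The mixed case with some eigenvalues on the unit circle and others off it cannot occur in this two-dimensional determinant-1 setting, which is why only three outcomes appear.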

Specifically, the stretching (corresponding to 
eigenvalues outside the unit circle in the case of 
linear maps) combined with the folding necessitated 
by compactness of the phase space produces not 
only highly sensitive dependence of orbit asympto- 
tics on initial conditions, but also a close intertwin- 
ing of different behaviors. On the one hand, there is 
a dense set of periodic points, on the other hand, an 
abundance of dense orbits. While there are only 
finitely many periodic points of a given period, their 
number grows exponentially as a function of the 
period. The entropy of these systems is positive, 
which indicates that the overall complexity of the 
orbit structure grows exponentially as a function of 
the length of time for which it is being tracked. In 
effect, the behavior of orbits is so intricate as to be 
quasirandom, which makes it natural to use statis- 
tical methods to describe these systems. 


History of Hyperbolic Dynamical Systems 


One strand of the history of hyperbolic dynamical 
systems leads back to the question of the stability of 
the solar system and to Poincaré, in whose prize 
memoir on the three-body problem the possibility of 
“homoclinic tangles” first presented itself. For 
Poincaré, this was important because the resulting 
complexity demonstrates that this system is not 
integrable. We describe below how hyperbolic 
dynamics arises in this situation (see Figure 3). 
Another strand emerged about a decade later with 
Hadamard’s study of geodesic flows (free particle 
motion) on negatively curved surfaces. Hadamard 
noted that these exhibit the kind of sensitive 
dependence on initial conditions as well as the 
pseudorandom behavior that are central features of 
hyperbolic dynamics. This subject was developed 
much further after the advent of ergodic theory, 
with the Boltzmann ergodic hypothesis as an 
important motivation: work by numerous mathe- 
maticians, principally Hedlund and Hopf, showed 
that free particle motion on a negatively curved 
surface provides examples of ergodic mechanical 
systems. More than two decades later, in the 1960s, 
Anosov and Sinai overcame a fundamental technical 
hurdle and established that this is indeed the case in 
arbitrary dimension. This was done in the more 
general context of a class of dynamical systems 
known now as Anosov systems, which were axio- 
matically defined and systematically studied for the 
first time during this period of research in Moscow. 

A greater class of dynamical systems exhibiting 
chaotic behavior was introduced by Smale in his 
seminal 1967 paper under the name of Axiom-A 
systems. This class includes the hyperbolic dynamics 
arising from homoclinic tangles, see Figure 3 
(see Homoclinic Phenomena). Smale’s motivation 
was his program of classifying dynamical systems 
under topological conjugacy, and the consequent 
search for structurally stable systems. Today, Axiom- 
A (and Anosov) systems are valued as idealized models 
of chaos: while the conditions defining Axiom A are 
too stringent to include many real-life examples, it is 
recognized that they have features shared in various 
forms by most chaotic systems. Here, we concentrate 
on the discrete-time context to keep notations lighter. 

Partial hyperbolicity was introduced in the 1970s 
and has shown that a limited amount of hyperbo- 
licity in a dynamical system can produce much of 
the global complexity (such as ergodicity or the 
presence of dense orbits) exhibited by hyperbolic 
systems, and can do so in a robust way. Here one 
imposes uniform conditions, but expansion and 
contraction are not assumed to occur in all direc- 
tions. Stable ergodicity has been an important 
subject of research in the last decade. 

Nonuniform hyperbolicity weakens hyperbolicity 
by allowing the contraction and expansion rates to 
be nonuniform. This was motivated by examples of 
systems with hyperbolicity where expansion or 
contraction can be arbitrarily weak or absent in 
places, such as the Hénon attractor, and by 
situations where hyperbolicity coexists with singula- 
rities, such as for (semi)dispersing billiards (see 
Hyperbolic Billiards). 


With respect to both uniformly and nonuniformly 
hyperbolic systems, dimension theory has been a 
subject of much interest (computations and esti- 
mates of the fractal dimension of attractors and 
hyperbolic sets, which is deeply connected to 
dynamical properties of the system). 

A different weakening of hyperbolicity, the pre- 
sence of a dominated splitting, has been of interest 
from the viewpoint of stability and classification 
of diffeomorphisms. 

The study of hyperbolic dynamics has always had 
interactions with other sciences and other areas of 
mathematics. In the natural and social sciences, this 
is the study of chaotic motions of just about any 
kind. Examples of applications in related areas of 
mathematics are geometric rigidity (an interaction 
with differential geometry) and rigidity of group 
actions. 


Uniformly Hyperbolic Dynamical Systems 
Definitions 


Let f be a smooth invertible map. A compact 
invariant set of f is said to be “hyperbolic” if at 
every point in this set, the tangent space splits into a 
direct sum of two subspaces $E^u$ and $E^s$ with the 
property that these subspaces are invariant under the 
differential df, that is, $df(x)E^u(x) = E^u(f(x))$, 
$df(x)E^s(x) = E^s(f(x))$, and that df expands vectors 
in $E^u$ and contracts vectors in $E^s$, that is, there are 
constants $0 < \lambda < 1 < \mu$, $c > 0$ such that if $v \in E^s(x)$ 
for some x, then $\|df^n v\| \le c\lambda^n\|v\|$ for n = 1, 2, ..., 
and if $v \in E^u(x)$ for some x, then 
$\|df^{-n} v\| \le c\mu^{-n}\|v\|$ for n = 1, 2, .... 

If $E^u = \{0\}$ in the definition above, then the 
invariant set is made up of attracting fixed points 
or periodic orbits. Similarly, if $E^s = \{0\}$, then the 
orbits are repelling. If neither subspace is trivial, 
then the behavior is locally “saddle-like,” that is to 
say, relative to the orbit of a point x, most nearby 
orbits diverge exponentially fast in both forward 
and backward time. This is why hyperbolicity is a 
mathematical notion of chaos. 

An Anosov diffeomorphism is a smooth invertible 
map of a compact manifold with the property that 
the entire space is a hyperbolic set. 

Axiom A, which is a larger class, focuses on the 
part of the system that is not transient. More 
precisely, a point x in the phase space is said to be 
“nonwandering” if every neighborhood U of x 
contains an orbit that returns to U. A map is said 
to satisfy Axiom A if its nonwandering set is 
hyperbolic and contains a dense set of periodic 
points. 


Definitions in the continuous-time case are analo- 
gous: f above is replaced by the time-t-maps of the 
flow, and the tangent spaces now decompose into 
$E^u \oplus E^0 \oplus E^s$ where $E^0$, which is one dimensional, 
represents the direction of the flow lines. 

A geometric way of detecting (indeed, defining) 
hyperbolicity is via the cone criterion: at every point 
there is a cone that is mapped by the differential into 
the interior of the corresponding cone at the image 
point, and a “complementary” cone family behaves 
similarly for the inverse. 
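
A concrete instance of the cone criterion can be checked by hand or by machine. The sketch below uses the integer matrix with rows (2, 1) and (1, 1) as an assumed example, with the cone xy >= 0 for the map and the complementary cone xy <= 0 for its inverse; the choice of cones and all function names are illustrative assumptions.

```python
CAT = ((2, 1), (1, 1))        # the linear map
CAT_INV = ((1, -1), (-1, 2))  # its inverse (determinant is 1)


def apply(matrix, v):
    """Apply a 2x2 matrix to a vector."""
    (a, b), (c, d) = matrix
    x, y = v
    return (a * x + b * y, c * x + d * y)


def in_unstable_cone(v):
    """Closed cone of vectors whose coordinates have the same sign."""
    return v[0] * v[1] >= 0


def in_stable_cone(v):
    """Complementary closed cone of vectors with coordinates of opposite sign."""
    return v[0] * v[1] <= 0


def cone_criterion_holds(samples):
    """Check on sample vectors that each cone is mapped into the interior of itself."""
    for v in samples:
        if v == (0, 0):
            continue
        if in_unstable_cone(v):
            w = apply(CAT, v)
            if not w[0] * w[1] > 0:      # image must be strictly inside xy > 0
                return False
        if in_stable_cone(v):
            w = apply(CAT_INV, v)
            if not w[0] * w[1] < 0:      # image must be strictly inside xy < 0
                return False
    return True


if __name__ == "__main__":
    grid = [(x, y) for x in range(-5, 6) for y in range(-5, 6)]
    print(cone_criterion_holds(grid))
```

Because the map is linear, the same cone family works at every point; for a nonlinear map the cones would have to be chosen point by point and the check applied to the differential.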

Many continuous structures associated with a 
hyperbolic dynamical system are, in fact, Hölder 
continuous. (For a function g on a metric space this 
is defined as the existence of $C, \alpha > 0$ such that 
$d(g(x), g(y)) \le C\,d(x, y)^\alpha$ whenever x, y are suffi- 
ciently close to each other.) In the present article, 
almost every assertion of continuity could be 
replaced by one of Hölder continuity. This notion 
is natural in this context because $x_n \to y$ exponen- 
tially fast implies that $g(x_n) \to g(y)$ exponentially fast 
if g is Hölder continuous. 


Structure and Properties 


Stable and Unstable Manifolds, Local 
Product Structure 


Anosov and Axiom-A systems are defined by the 
behavior of the differential. Corresponding to the 
linear structures left invariant by df are nonlinear 
structures, namely “stable manifolds” tangent to $E^s$ 
and “unstable manifolds” tangent to $E^u$. 

Thus, associated with an Anosov map are two 
families of invariant manifolds, each one of which 
fills up the entire phase space; they are sometimes 
called the stable and unstable “foliations.” The 
leaves of these foliations are transverse at each 
point, that is, they intersect at positive angles, 
forming a kind of (topological) coordinate system. 
The map f expands distances along the leaves of one 
of these foliations and contracts distances along the 
leaves of the other. For Axiom-A systems, one has a 
similar local product structure or “coordinate 
system” at each point in the nonwandering set, but 
the picture is local, and there are gaps: the stable 
and unstable leaves do not necessarily fill out open 
sets in the phase space. 

There is much interest in determining the fractal 
dimension (box-counting or Hausdorff, say) of 
hyperbolic sets. So far the best dimension estimates 
have been made for stable slices, that is, for the 
intersection of a stable leaf with the hyperbolic set, 
and for unstable slices. Because the local coordinate 
systems describing the local product structure are 
only known to be continuous, it is not known in 
general whether the sum of these stable and unstable 
dimensions gives the dimension of the hyperbolic set 
(we don't even know whether all stable slices have 
the same fractal dimension). The problem is that an 
α-Hölder-continuous map can change dimensions by 
a factor of α or 1/α. But there is evidence to suggest 
that something like this “dimension product struc- 
ture” may often be true: this has been established 
for a class of solenoids. 


Transitivity and Spectral Decomposition 


In addition to these local structures, Axiom-A 
systems have a global structure theorem known 
as “spectral decomposition.” It says that the 
nonwandering set of every Axiom-A map can be 
written as $X_1 \cup \dots \cup X_r$ where the $X_i$ are disjoint 
closed invariant sets on which f is topologically 
transitive, that is, has a dense orbit. The $X_i$ are 
called “basic sets.” Each $X_i$ can be decomposed 
further into a finite union $\bigcup_j X_{i,j}$, where each $X_{i,j}$ is 
invariant and topologically mixing under some 
iterate of f. (Topological transitivity and mixing 
are irreducibility conditions; transitivity means that 
there is no proper open invariant subset, and 
topological mixing says that given two open sets, 
from some time onward the images of one will 
always intersect the other.) This decomposition is 
reminiscent of the corresponding result for finite- 
state Markov chains. 


Stability 


One of the reasons why hyperbolic sets are 
important is their “robustness”: they cannot be 
perturbed away. More precisely, let f be a map 
with a hyperbolic set A which is locally maximal, 
that is, it is the largest invariant set in some 
neighborhood U. Then for every map g that is 
$C^1$-near f, the largest invariant set A’ of g in U 
is again hyperbolic; moreover, f restricted to A is 
“topologically conjugate” to g restricted to A’. This 
is mathematical shorthand for saying that not only 
are the two sets A and A’ topologically indistin- 
guishable, but the orbit structure of f on A is 
indistinguishable from that of g on A’. 

The phenomenon above brings us to the idea of 
“structural stability.” A map f is said to be 
structurally stable if every map g $C^1$-near f is 
topologically conjugate to f (on the entire phase 
space). It turns out that a map is structurally stable 
if and only if it satisfies Axiom A and an additional 
condition called strong transversality. 
Chains and Shadowing 


We discuss next the idea of pseudo-orbits versus real 
orbits. Letting d(·,·) be the metric, a sequence of 
points $x_0, x_1, x_2, \dots$ in the phase space is called an 
“ε-pseudo-orbit” or a “chain” of f if $d(f(x_i), x_{i+1}) < \varepsilon$ 
for every i. Computer-generated orbits, for example, 
are pseudo-orbits due to round-off errors. A fact of 
consequence to people performing numerical experi- 
ments is that in hyperbolic systems, small errors at 
each step get magnified exponentially fast. For 
example, if the expansion rate is 3 or more, then 
an ε-error made at one step is at least tripled at each 
subsequent step, that is, after only $O(|\log \varepsilon|)$ 
iterates, the error is O(1), and the pseudo-orbit 
bears no relation to the real one. There is, however, 
a theorem that says that every pseudo-orbit is 
“shadowed” by a real one. More precisely, given a 
hyperbolic set, there is a constant C such that if 
$x_0, x_1, x_2, \dots$ is an ε-pseudo-orbit, then there is a 
phase point z such that $d(x_i, f^i(z)) < C\varepsilon$ for all i. 
Thus, paradoxical as it may first seem, this result 
asserts that on hyperbolic sets, each pseudo-orbit 
approximates a real orbit, even though it may 
deviate considerably from the one with the same 
initial condition. 
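
Both halves of this picture, error growth and shadowing, can be seen in a toy computation. The sketch below is an illustration under a loud assumption: it uses the expanding circle map x -> 3x mod 1, a non-invertible stand-in chosen for simplicity and not a hyperbolic set in the sense defined above, and it builds the shadowing orbit by pulling the last point back through inverse branches. Exact rational arithmetic is used so that the "real" orbit is not itself a pseudo-orbit; all names are hypothetical.

```python
from fractions import Fraction


def circle_dist(a, b):
    """Distance between two points of the circle R/Z."""
    d = (a - b) % 1
    return min(d, 1 - d)


def f(x):
    """The expanding map x -> 3x mod 1 (expansion rate 3)."""
    return (3 * x) % 1


def make_pseudo_orbit(x0, eps, n):
    """Chain x_0..x_n with a jump of size eps, alternating in sign, at every step."""
    xs = [x0]
    for i in range(n):
        kick = eps if i % 2 == 0 else -eps
        xs.append((f(xs[-1]) + kick) % 1)
    return xs


def true_orbit(x0, n):
    """Exact orbit x0, f(x0), ..., f^n(x0)."""
    ys = [x0]
    for _ in range(n):
        ys.append(f(ys[-1]))
    return ys


def shadow(xs):
    """Exact orbit staying close to the chain xs, built backwards from its last point."""
    zs = [xs[-1]]
    for x in reversed(xs[:-1]):
        w = zs[-1]
        preimages = [(w + k) / 3 for k in range(3)]  # the three points mapped to w
        zs.append(min(preimages, key=lambda p: circle_dist(p, x)))
    zs.reverse()
    return zs


if __name__ == "__main__":
    eps = Fraction(1, 1000)
    xs = make_pseudo_orbit(Fraction(1, 7), eps, 25)
    zs = shadow(xs)
    ys = true_orbit(Fraction(1, 7), 25)
    print(float(max(circle_dist(x, z) for x, z in zip(xs, zs))))  # stays below eps / 2
    print(float(max(circle_dist(x, y) for x, y in zip(xs, ys))))  # of order one
```

In this toy setting each backward step divides the accumulated error by 3, which gives the bound eps/2 on the shadowing distance, while the orbit with the same initial condition loses track of the chain after a handful of iterates.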

The shadowing orbit corresponding to a bi- 
infinite pseudo-orbit is, in fact, unique. From this, 
one deduces easily the following Closing Lemma: 
For any hyperbolic set, there is a constant C such 
that the following holds: Every finite orbit segment 
$x, f(x), \dots, f^{n-1}(x)$ that nearly closes up, that is, 
$d(x, f^{n-1}(x)) < \varepsilon$ for some small ε, lies within $C\varepsilon$ of 
a genuine periodic orbit of period n. Thus, hyper- 
bolic sets contain many periodic points. 


Examples 
Anosov Diffeomorphisms 


A large class of Anosov diffeomorphisms comes 
from “linear toral automorphisms,” that is, maps of 
the n-dimensional torus induced by n × n matrices 
with integer entries, det = ±1, and no eigenvalues of 
modulus one. The most popular example is the map 
obtained from 

$\begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}$, 


sometimes called the Arnol'd cat map because of an 
illustration used by Arnol'd. The unstable manifolds 
are lines parallel to the expanding direction shown 
in Figure 1 and wrapped around the torus, and the 
stable manifolds are obtained from the orthogonal 
lines. 
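
A few properties of this example are easy to verify by direct computation. The following is a sketch, assuming the matrix with rows (2, 1) and (1, 1); it computes the two eigenvalues and counts the points of the torus fixed by the n-th iterate, using the standard fact that this number equals |det(A^n - I)| and cross-checking it by enumeration. The function names are illustrative.

```python
import math

A = ((2, 1), (1, 1))


def mat_mul(p, q):
    """Product of two 2x2 integer matrices."""
    (a, b), (c, d) = p
    (e, f), (g, h) = q
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def mat_pow(m, n):
    """n-th power of a 2x2 matrix for n >= 1."""
    result = m
    for _ in range(n - 1):
        result = mat_mul(result, m)
    return result


def eigenvalues():
    """Roots of the characteristic polynomial t^2 - 3t + 1 (contracting, expanding)."""
    root = math.sqrt(5.0)
    return (3.0 - root) / 2.0, (3.0 + root) / 2.0


def count_periodic_points(n):
    """Number of torus points fixed by the n-th iterate: |det(A^n - I)|."""
    (a, b), (c, d) = mat_pow(A, n)
    return abs((a - 1) * (d - 1) - b * c)


def count_periodic_points_brute_force(n):
    """Same count by enumerating the rational points (i/q, j/q), q = |det(A^n - I)|."""
    (a, b), (c, d) = mat_pow(A, n)
    q = count_periodic_points(n)
    return sum(
        1
        for i in range(q)
        for j in range(q)
        if ((a - 1) * i + b * j) % q == 0 and (c * i + (d - 1) * j) % q == 0
    )


if __name__ == "__main__":
    print(eigenvalues())
    print([count_periodic_points(n) for n in range(1, 8)])
```

The counts grow roughly like powers of the expanding eigenvalue, which illustrates the exponential growth of the number of periodic points in hyperbolic systems.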


Figure 1 A hyperbolic toral automorphism. Reproduced from 
Katok A and Hasselblatt B (2003) Dynamics: A First Course. 
Cambridge: Cambridge University Press, with permission from 
Cambridge University Press. 


We remark that due to their structural stability, 
(nonlinear) perturbations of linear toral automorph- 
isms continue to have the Anosov property. This 
remark applies also to all of the examples below. In 
fact, all known Anosov diffeomorphisms are topo- 
logically identical to a linear toral automorphism (or 
a slight generalization of these, infranil-manifold 
automorphisms). 


Geodesic Flows 


Geodesic flows describe free motions of points on 
manifolds. Let M be a manifold. Given $x \in M$ and a 
unit vector v at x, there is a unique geodesic starting 
from x in the direction v. The geodesic flow $\varphi^t$ is 
given by $\varphi^t(x, v) = (x', v')$ where x′ is the point t units 
down the geodesic and v′ is the direction at x′. 
Geodesic flows on manifolds of strictly negative 
curvature are the main examples of Anosov flows. 
They were studied by Hadamard (ca. 1900) and by 
Hedlund and Hopf (1930s), considerably before 
Anosov theory was developed. 


Horseshoes 


Smale’s horseshoe is the prototypical example of a 
hyperbolic invariant set. This map, so called because 
it bends a rectangle B into the shape of a horseshoe 
and puts it back on top of B, is shown in Figure 2. 
The set $\{x : f^n(x) \in B \text{ for all } n = 0, \pm 1, \pm 2, \dots\}$ is 
hyperbolic. It is a two-dimensional Cantor set in B. 
The emergence of this example can be traced back 
directly to real-world systems. 

Figure 2 The horseshoe. 

During World War II, Cartwright and Littlewood 
worked on relaxation oscillations in radar circuits, 
consciously building on Poincaré’s work. Further 
study of the underlying van der Pol equation by 
Levinson contained the first example of a structu- 
rally stable diffeomorphism with infinitely many 
periodic points. (Structural stability originated in 
1937 but began to flourish only 20 years later.) This 
was brought to the attention of Smale. Inspired by 
the work of Peixoto, who had carried out such a program 
in dimension 2, Smale pursued a program of 
studying diffeomorphisms with a view to classifica- 
tion (Smale 1967). Until alerted by Levinson, Smale 
conjectured that only Morse-Smale systems (which 
have only finitely many periodic points with stable 
and unstable sets in general position) could be 
structurally stable. He eventually extracted the 
horseshoe from Levinson’s work. Smale in turn 
was in contact with the Russian school, where 
Anosov systems (then C- or U-systems) had been 
shown to be structurally stable, and their ergodic 
properties were studied by way of further develop- 
ment of the study of geodesic flows in negative 
curvature. 

The appearance of horseshoes in mathematical 
models of real-world phenomena is quite wide- 
spread. Indeed, in a sense this is the mechanism for 
the production of chaotic behavior, at least in 
dimension 2. In disguise, one of the earliest 
appearances of this phenomenon occurred in the 
prize memoir of Poincaré, where homoclinic tangles 
gave a first glimpse at the serious dynamical 
complexity that can arise in the three-body problem 
in celestial mechanics. If the stable and unstable 
curves of a hyperbolic fixed point intersect trans- 
versely (as in Figure 3a), this engenders further such 
intersections and produces a complicated web of 
accumulations of loops or lobes of stable and 
unstable curves, as shown in Figure 3b. Homoclinic 
tangles always produce horseshoes by the Smale- 
Birkhoff theorem, illustrated by Figure 3c, so in 
trying to solve the three-body problem, Poincaré 
essentially discovered the possibility of nontrivial 
hyperbolic behavior (see Homoclinic Phenomena). 

A related appearance of horseshoes in this context 
is in the work of Alekseev, who used their presence 
to show that capture of celestial bodies can indeed 
occur. 


Solenoids 


Finally we mention the solenoid, which is an 
example of an Axiom-A attractor (see Figure 4). 
Here the map f is defined on a solid torus M = S! x 
D2, where D; is a two-dimensional disk. It is easiest 
to describe it in two steps: first it maps M into a 
long thin solid torus, which is then put inside M 
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(a) (b) 


y ) 
fr | F i 
(c) 


Figure 3 Homoclinic tangles produce horseshoes. Repro- 
duced from Katok A and Hasselblatt B (2003) Dynamics: A 
First Course. Cambridge: Cambridge University Press, with 
permission from Cambridge University Press. 


Figure 4 The solenoid. Reproduced from Katok A and 
Hasselblatt B (2003) Dynamics: A First Course. Cambridge: 
Cambridge University Press, with permission from Cambridge 
University Press. 


winding around the $S^1$ direction twice. The attractor
is given by $\Lambda = \bigcap_{n \ge 0} f^n(M)$.
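The two-step description of the solenoid can be made concrete with an explicit formula. The text does not fix one, so the sketch below assumes a common model: $f(\theta, x, y) = (2\theta \bmod 2\pi,\ x/5 + \cos(\theta)/2,\ y/5 + \sin(\theta)/2)$ on $S^1 \times D^2$ with $D^2$ the closed unit disk. The contraction factor 1/5 and the names `solenoid_map` and `orbit` are illustrative choices, not taken from the source.

```python
import math

# Assumed contraction factor of the disk fibers. Any value below 1/2 keeps the
# image inside the solid torus and keeps the two strands of the image disjoint.
CONTRACTION = 0.2


def solenoid_map(theta, x, y):
    """One step of the assumed model solenoid map on S^1 x D^2."""
    new_theta = (2.0 * theta) % (2.0 * math.pi)       # wind twice around S^1
    new_x = CONTRACTION * x + 0.5 * math.cos(theta)   # shrink the disk and
    new_y = CONTRACTION * y + 0.5 * math.sin(theta)   # move its center
    return new_theta, new_x, new_y


def orbit(point, steps):
    """Return the list [point, f(point), ..., f^steps(point)]."""
    points = [point]
    for _ in range(steps):
        point = solenoid_map(*point)
        points.append(point)
    return points


if __name__ == "__main__":
    for theta, x, y in orbit((1.0, 0.9, -0.3), 5):
        print(f"theta={theta:.4f}  radius={math.hypot(x, y):.4f}")
```

Under these assumptions every image point lies within radius 0.7 of the center of its disk fiber, so f maps M strictly inside itself, which is why the nested intersection defining the attractor is nonempty.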


Symbolic Coding of Orbits and 
Ergodic Theory 


An important tool for studying the orbit structure of 
Axiom-A systems is the “Markov partition,” con- 
structed for Anosov systems by Sinai and extended to 
Axiom-A basic sets by Bowen. Given a partition 
$\{R_1, \dots, R_k\}$ of the phase space, there is a natural
way to attach to each point x in the phase space a
sequence of symbols, namely $(\dots, a_{-1}, a_0, a_1, a_2, \dots)$,
where $a_i \in \{1, 2, \dots, k\}$ is the name of the partition
element containing $f^i(x)$, that is, $f^i(x) \in R_{a_i}$ for
each i. In general, not all sequences are realized by 
orbits of f. Markov partitions are designed so that 
the set of symbol sequences that correspond to real 
orbits has Markovian properties; it is called a shift of 
finite type. 
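The coding idea can be tried on a toy example. The sketch below makes two assumptions that are not in the text: symbols are numbered from 0 rather than 1, and the illustrations are the doubling map on [0, 1) with the partition into two half-intervals (a noninvertible map, so only the forward itinerary $a_0, a_1, \dots$ is produced) and the "golden mean" shift of finite type, in which symbol 1 may not follow symbol 1. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Transition matrix of the golden mean shift: entry [a][b] == 1 means that
# symbol b is allowed to follow symbol a.
GOLDEN_MEAN = [[1, 1],
               [1, 0]]


def is_admissible(word, transitions):
    """True if every consecutive pair of symbols in word is an allowed transition."""
    return all(transitions[a][b] == 1 for a, b in zip(word, word[1:]))


def admissible_words(length, transitions):
    """List all admissible words of the given length."""
    symbols = range(len(transitions))
    return [w for w in product(symbols, repeat=length)
            if is_admissible(w, transitions)]


def itinerary(x, step, label, length):
    """Forward symbol sequence a_0, a_1, ... with a_i = label(step^i(x))."""
    symbols = []
    for _ in range(length):
        symbols.append(label(x))
        x = step(x)
    return symbols


def doubling(x):
    """The doubling map x -> 2x mod 1 (exact when x is a Fraction)."""
    return (2 * x) % 1


def half_interval_label(x):
    """Name of the partition element: 0 for [0, 1/2), 1 for [1/2, 1)."""
    return 0 if x < Fraction(1, 2) else 1


if __name__ == "__main__":
    print(itinerary(Fraction(1, 3), doubling, half_interval_label, 6))
    print(len(admissible_words(4, GOLDEN_MEAN)))
```

For the doubling map with this partition every symbol sequence occurs, whereas the golden mean matrix shows how a finite list of forbidden transitions cuts the set of sequences down to a shift of finite type.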

The ergodic theory of Axiom-A systems has its 
origins in statistical mechanics. In a 1D lattice model in 
statistical mechanics, one has an infinite array of sites 
indexed by the integers; at each site, the system can be 
in any one of a finite number of states. Thus, the 
configuration space for a 1D lattice model is the set of 
bi-infinite sequences on a finite alphabet. Identifying 
this symbol space with the one coming from Markov 
partitions, Sinai and Ruelle were able to transport 
some of the basic ideas from statistical mechanics, 
including the notions of Gibbs states and equilibrium 
states, to the ergodic theory of Axiom-A systems. 

The notion of equilibrium states, which is 
equivalent to Gibbs states for Axiom-A systems, 
has the following meaning in dynamical systems in 
general: given a potential function $\varphi$, an invariant
measure is said to be an equilibrium state if it
maximizes the quantity

$$h_\mu(f) - \int \varphi \, d\mu,$$

where $h_\mu(f)$ denotes the Kolmogorov-Sinai entropy of
f and the supremum is taken over all f-invariant
probability measures $\mu$. In particular, when $\varphi = 0$, this
measure is the measure that maximizes entropy; and
when $\varphi = \log|\det(df|_{E^u})|$ it is the Sinai-Ruelle-Bowen
(SRB) measure. From the physical or observa-
tional point of view, SRB measures are the most
important invariant measures for dissipative dynami-
cal systems because if f is a diffeomorphism of a
compact manifold M and $\Lambda$ a transitive Axiom-A
attractor with basin U, for example, $\Lambda = U = M$, then
for Lebesgue-a.e. $x \in U$ and for every $\varphi \in C^0(M)$

$$\frac{1}{n} \sum_{i=0}^{n-1} \varphi(f^i(x)) \to \int \varphi \, d\mu,$$

that is, Lebesgue-a.e. point is $\mu$-typical. Thus, while
Axiom-A attractors will have chaotic motions, they are 
statistically coherent in that the asymptotic distribution 
of any typical orbit is given by the SRB measure. 


Periodic Points and Their 
Growth Properties 


We discuss briefly some further results related to the 
abundance of periodic points in Axiom-A systems. 


For an Axiom-A diffeomorphism f, if P(n) is the
number of periodic points of period $\le n$, then $P(n) \sim
e^{hn}$, where h is the topological entropy of f. That is
to say, the dynamical complexity of f is reflected in
its periodic behavior. An analogous result holds for
Axiom-A flows. This asymptotic behavior is known
to remarkably fine accuracy (Margulis 2004), and
these developments used the dynamical zeta func-
tion, which sums up the periodic information of a
system. In the discrete-time case, $\zeta(z) := \exp \sum_{n=1}^{\infty}
P(n) z^n / n$ has been shown to be a rational function
analytic on $|z| < e^{-h}$. In the continuous-time case,
the zeta function is given by $\zeta(z) := \prod_\gamma
(1 - \exp(-z\, l(\gamma)))^{-1}$, where the product is taken
over all (nonstationary) periodic orbits $\gamma$ and $l(\gamma)$
is the smallest positive period of $\gamma$. This function is
known to be meromorphic on a certain domain, 
but the location of its poles, which are intimately 
related to correlation decay properties of the 
system, remains one of the yet unresolved issues in 
Axiom-A theory. 
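The growth of periodic points and the rationality of the zeta function can be checked by hand on a shift of finite type. The sketch below rests on assumptions not stated in the text: it uses the golden mean shift with transition matrix A = [[1, 1], [1, 0]] as a stand-in for an Axiom-A system, and it counts the points fixed by the n-th iterate of the shift, which equals the trace of $A^n$ (the convention under which the zeta function of such a shift equals $1/\det(I - zA)$). The function names are hypothetical.

```python
from fractions import Fraction
import math

# Assumed example: transition matrix of the golden mean shift.
GOLDEN_MEAN = [[1, 1],
               [1, 0]]


def mat_mul(a, b):
    """Product of two square integer matrices given as nested lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def fixed_point_counts(transitions, n_max):
    """Return [tr(A^1), ..., tr(A^n_max)], the numbers of points fixed by
    the n-th iterate of the shift for n = 1, ..., n_max."""
    counts, power = [], transitions
    for _ in range(n_max):
        counts.append(sum(power[i][i] for i in range(len(power))))
        power = mat_mul(power, transitions)
    return counts


def zeta_series(counts):
    """Coefficients e_0, ..., e_N of exp(sum_n counts[n-1] z^n / n), using the
    recurrence n * e_n = sum_{k=1..n} counts[k-1] * e_{n-k}."""
    coeffs = [Fraction(1)]
    for n in range(1, len(counts) + 1):
        total = sum(counts[k - 1] * coeffs[n - k] for k in range(1, n + 1))
        coeffs.append(total / n)
    return coeffs


if __name__ == "__main__":
    counts = fixed_point_counts(GOLDEN_MEAN, 40)
    print(counts[:5])
    print([int(c) for c in zeta_series(counts[:5])])
    print(math.log(counts[-1]) / 40, math.log((1 + math.sqrt(5)) / 2))
```

In this example the series coefficients are the Fibonacci numbers, the expansion of $1/(1 - z - z^2)$, so the zeta function is rational with its smallest pole at $z = e^{-h}$, where $h = \log((1 + \sqrt{5})/2)$ is the exponential growth rate of the counts.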


Partial Hyperbolicity and 
Dominated Splitting 


There are various ways in which the notion of 
hyperbolicity described above, which we will henceforth
refer to as “uniform hyperbolicity,” can be
extended beyond the one presented so far. This can 
be done with a view to weakening the conditions 
under which some of the salient properties of 
hyperbolic dynamical systems appear. The study of 
partially hyperbolic dynamical systems and that 
of dynamical systems possessing a dominated split- 
ting is of this type. Further below, we describe a 
different extension motivated more by a desire to 
bring the results and methods of hyperbolic 
dynamics to bear on systems that are closer to 
some physical situations. This led to the study of 
nonuniformly hyperbolic dynamical systems. 

If one views hyperbolicity as requiring that the 
spectrum of expansion and contraction rates is 
separated into two components by the unit circle, 
then one can consider systems where this separation 
is provided by a circle centered at 0 whose radius 
may not be 1 (partial hyperbolicity in the broad 
sense), or by two circles centered at 0 of which one 
has radius less than 1 and the other has radius 
greater than 1, with possibly a third component of 
the spectrum in the annulus between these (absolute 
partial hyperbolicity). Further weakenings are 
obtained by controlling not the whole spectrum in 
this absolute way, but rather ratios of expansion and 
contraction rates along orbits (dominated splitting
and relative partial hyperbolicity, respectively).
Among the motivations for these weakenings are 
the desire to understand which systems are topolo- 
gically transitive and robustly so (stable transitivity), 
and to understand which ergodic volume-preserving 
systems remain ergodic if perturbed within the space 
of volume-preserving systems (stable ergodicity). 


Pseudohyperbolicity 


Let f be a smooth invertible map. A compact 
invariant set of f is said to be partially hyperbolic 
in the broad sense if at every point in this set, the 
tangent space splits into a direct sum of two 
subspaces $E^u$ and $E^s$ with the property that these
subspaces are invariant under the differential df,
that is, $df(x)E^u(x) = E^u(f(x))$, $df(x)E^s(x) = E^s(f(x))$,
and that there are constants $0 < \lambda < \mu$, $c > 0$ such
that if $v \in E^s(x)$ for some x then $\|df^n v\| \le c\lambda^n\|v\|$
for n = 1, 2, ... and if $v \in E^u(x)$ for some x
then $\|df^{-n} v\| \le c\mu^{-n}\|v\|$ for n = 1, 2, .... This is
sometimes also referred to as the existence of a
$(\lambda, \mu)$-splitting or pseudohyperbolicity.


Dominated Splitting 


A further weakening of this condition replaces these 
absolute estimates by relative ones. Let f be a 
smooth invertible map. A compact invariant set of 
f is said to admit a dominated splitting if at every 
point in this set, the tangent space splits into a direct 
sum of two subspaces $E^u$ and $E^s$ with the property
that these subspaces are invariant under the differ-
ential and there are constants $\lambda \in (0,1)$, $c > 0$ such
that if $u \in E^u(x)$ and $v \in E^s(x)$ are unit vectors for some x then
$\|df^n v\| / \|df^n u\| \le c\lambda^n$ for n = 1, 2, ....
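The difference between absolute and relative estimates shows up already for a linear map. The sketch below uses an assumed toy example that is not in the text: the map A = diag(3, 2) on the plane, with the fixed point at the origin as the compact invariant set. Both coordinate directions expand, so neither is contracting, yet the weaker direction is dominated by the stronger one with c = 1 and λ = 2/3. The function names are hypothetical.

```python
import math

# Assumed toy example: A = diag(RATE_U, RATE_S), invariant set = the origin.
RATE_U = 3.0  # expansion rate along E^u (the x-axis)
RATE_S = 2.0  # expansion rate along E^s (the y-axis), weaker than RATE_U


def apply_power(vec, n):
    """Apply the n-th iterate of A = diag(RATE_U, RATE_S) to vec."""
    x, y = vec
    return (RATE_U ** n * x, RATE_S ** n * y)


def domination_ratio(u, v, n):
    """Return ||A^n v|| / ||A^n u|| for unit vectors u in E^u and v in E^s."""
    return math.hypot(*apply_power(v, n)) / math.hypot(*apply_power(u, n))


def is_dominated(u, v, c, lam, n_max):
    """Check ||A^n v|| / ||A^n u|| <= c * lam^n for n = 1, ..., n_max,
    allowing a tiny relative tolerance for floating-point rounding."""
    return all(domination_ratio(u, v, n) <= c * lam ** n * (1 + 1e-12)
               for n in range(1, n_max + 1))


if __name__ == "__main__":
    u, v = (1.0, 0.0), (0.0, 1.0)
    print(is_dominated(u, v, c=1.0, lam=2.0 / 3.0, n_max=30))
    print(is_dominated(u, v, c=1.0, lam=0.5, n_max=30))
```

Here the first check succeeds and the second fails, since the ratio equals $(2/3)^n$ exactly; the estimate constrains only the relative rates along the two subspaces, not whether either one contracts.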


The presence of a dominated splitting has been
found to yield substantial information pertinent to
stability of such systems, and it plays a significant
role in a program of research aiming at a classifica-
tion of generic diffeomorphisms up to topological
conjugacy and specifically motivated by the “Palis
conjecture,” which aims to describe that classifica-
tion. With respect to inferring topological and 
ergodic (i.e., statistical) properties of the orbit 
structure, the stricter notion of partial hyperbolicity 
(in the narrow sense below) is more commonly used, 
but in this respect the presence of a dominated 
splitting is also of interest because there is evidence 
in support of the conjecture that stable ergodicity 
implies the presence of a dominated splitting. 


Partial Hyperbolicity 


Let f be a smooth invertible map. A compact
invariant set of f is said to be (absolutely) partially
hyperbolic if at every point in this set, the tangent
space splits into a direct sum of unstable, central,
and stable directions $E^u$, $E^c$, and $E^s$ with the
property that these subspaces are invariant under
the differential df and that there exist numbers
$C > 0$,

$$0 < \lambda_1 \le \mu_1 < \lambda_2 \le \mu_2 < \lambda_3 \le \mu_3 \quad \text{with } \mu_1 < 1 < \lambda_3 \qquad [1]$$

such that if $v \in E^s(x)$, $w \in E^c(x)$, $u \in E^u(x)$, and
n = 1, 2, ..., then

$$C^{-1}\lambda_1^n \|v\| \le \|df^n(v)\| \le C\mu_1^n \|v\|,$$
$$C^{-1}\lambda_2^n \|w\| \le \|df^n(w)\| \le C\mu_2^n \|w\|,$$
$$C^{-1}\lambda_3^n \|u\| \le \|df^n(u)\| \le C\mu_3^n \|u\|.$$

In this case, we set $E^{cs} := E^c \oplus E^s$ and $E^{cu} := E^c \oplus E^u$.
Following Burns-Wilkinson, we say that f is “center-
bunched” if $\max(\mu_1, \lambda_3^{-1}) < \lambda_2/\mu_2$.

As in the case of (uniformly) hyperbolic dynami-
cal systems, the sub-bundles $E^s$ and $E^u$ are integrable
to stable and unstable foliations $W^s$ and $W^u$. It is
not automatic that the center-stable sub-bundle $E^{cs}$
and the center-unstable sub-bundle $E^{cu}$ are tangent
to foliations $W^{cs}$ and $W^{cu}$; if this happens to be the
case, the partially hyperbolic system is said to be
“dynamically coherent.”

Partial hyperbolicity can also be defined by a cone
criterion, with suitable adaptations.


Stable Ergodicity and Transitivity 


Partial hyperbolicity was introduced as a means of 
providing just enough hyperbolicity to render a 
dynamical system ergodic or topologically transitive. 
These are both irreducibility conditions, and to 
obtain these, one rules out a Cartesian product 
situation by assuming something like essential 
accessibility: almost every two points (in the sense 
of volume viewed as a measure) can be connected by 
a curve consisting of a finite concatenation of arcs, 
each of which lies entirely in one stable or unstable 
leaf. A celebrated result in this field is in its original 
form (with a much stronger center-bunching 
assumption) due to Pugh and Shub: suppose a 
volume-preserving diffeomorphism is partially 
hyperbolic on the entire manifold. If it is dynami- 
cally coherent and center bunched and has essential 
accessibility, then it is ergodic (Hasselblatt and Pesin 
2006). 

One of the motivating aims of this theory was to 
obtain nonhyperbolic volume-preserving systems 
that are stably ergodic, that is, for which all
volume-preserving $C^1$-small perturbations are also
ergodic. If, in addition to the above, one assumes
that essential accessibility also persists under such
perturbations and that the center bundle $E^c$ is
integrable to a center foliation $W^c$ that is smooth
(or “plaque-expansive”), then ergodicity is indeed
stable (Hasselblatt and Pesin). There are quite a few
natural examples where these assumptions hold. 

While essential accessibility does not always hold, 
it is fairly common. The stronger property of 
accessibility (that any two points can be connected, 
not only almost every two points) is conjectured to 
be stable under $C^1$-perturbations and has been
shown to hold for an open dense set of partially
hyperbolic systems with respect to the $C^1$-topology.

Ergodicity is a measure-theoretic irreducibility 
notion, and topological transitivity is the topological 
counterpart. It can also be obtained from accessi- 
bility: a partially hyperbolic volume-preserving 
diffeomorphism with the accessibility property is 
topologically transitive (in fact, almost every orbit is 
dense). 

There are interesting converse results as well. Any 
stably transitive diffeomorphism exhibits a domi- 
nated splitting. Moreover, in dimension 2 it is 
hyperbolic and in dimension 3 it is partially 
hyperbolic in the broad sense. 


Nonuniform Hyperbolicity 


Applications have motivated weakening assump- 
tions of uniform hyperbolicity to require only that 
“many” individual orbits exhibit hyperbolic beha- 
vior, without assuming that there are any uniform 
estimates on the degree of hyperbolicity. 

To measure the asymptotic contraction or expan- 
sion of a vector on an exponential scale, one defines 
the Lyapunov exponent of a (nonzero) tangent 
vector v at x for the map f to be 


$$\lambda(x, v) := \lim_{n \to \infty} \frac{1}{n} \log \|Df^n(v)\| \qquad [2]$$


whenever this limit exists. Note that a positive value
indicates asymptotic expansion of the vector,
whereas negative exponents correspond to contract-
ing vectors. This defines a measurable but, save for
exceptional circumstances, discontinuous function
of x and v. It is relatively easy to see that for a given
point x the function $\lambda(x, \cdot)$ can only take finitely
many values, so it is natural to define nonuniform
hyperbolicity as the property of having all of these
finitely many values nonzero for “most” points.
Given that $\lambda$ is measurable, it is natural to define
“most” by using a measure that is invariant under 
the map f. Therefore, the theory of nonuniformly 
hyperbolic dynamical systems, much of which is due 
to Pesin, is based on measure theory throughout. 
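The defining limit can be approximated numerically by iterating a tangent vector and renormalizing at each step. The sketch below assumes a simple example that the text does not single out: the hyperbolic toral automorphism induced by the matrix [[2, 1], [1, 1]], whose differential is the same matrix at every point, so the exponent does not depend on x. The function name and the step counts are illustrative choices.

```python
import math

# Assumed example: the derivative of the toral automorphism (x, y) -> (2x + y, x + y).
CAT_MATRIX = ((2.0, 1.0),
              (1.0, 1.0))


def finite_time_exponent(matrix, vector, steps):
    """Approximate (1/n) log ||M^n v|| for the unit vector in the direction of
    `vector`, renormalizing after each step to avoid overflow and underflow."""
    x, y = vector
    norm = math.hypot(x, y)
    x, y = x / norm, y / norm
    log_growth = 0.0
    for _ in range(steps):
        x, y = (matrix[0][0] * x + matrix[0][1] * y,
                matrix[1][0] * x + matrix[1][1] * y)
        norm = math.hypot(x, y)
        log_growth += math.log(norm)   # accumulate the one-step growth factor
        x, y = x / norm, y / norm
    return log_growth / steps


if __name__ == "__main__":
    expected = math.log((3.0 + math.sqrt(5.0)) / 2.0)
    # A typical vector has a component along the expanding direction.
    print(finite_time_exponent(CAT_MATRIX, (1.0, 0.0), 200), expected)
    # A vector in the contracting eigendirection; only a few steps are used,
    # because rounding errors are amplified along the expanding direction.
    stable = (1.0, (-1.0 - math.sqrt(5.0)) / 2.0)
    print(finite_time_exponent(CAT_MATRIX, stable, 10), -expected)
```

For this matrix the two possible values are $\pm\log((3 + \sqrt{5})/2)$, both nonzero, which illustrates the statement that the exponent takes only finitely many values at a given point.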


The fundamental fact on which this theory is 
based is the “Oseledets multiplicative ergodic theorem,”
which says that for a $C^1$-diffeomorphism of a
compact Riemannian manifold the set of Lyapunov- 
regular points has full measure with respect to any 
f-invariant Borel probability measure. 

For a Lyapunov-regular point the limit [2] exists 
for all v, so this theorem tells us that no matter 
which invariant measure we consider, the limit [2] 
makes sense for all tangent vectors at points x 
outside a null set. (One should add that this small 
“bad” set can be somewhat substantial; for example, 
its Hausdorff dimension is usually that of the whole 
space.) 

Accordingly, one then defines a measure to be 
hyperbolic if at almost every point the limit [2] is 
nonzero for all vectors. In this case, one says that 
“f has nonzero Lyapunov exponents.” This property 
can also be obtained from a cone criterion, but here 
the family of cones may only be invariant and 
eventually strictly invariant, that is, there is a cone 
field such that cones are mapped to cones (but not 
necessarily into the interior of cones), and for almost 
every point there is an iterate that maps a cone 
strictly inside the cone at the image point (i.e., into 
the interior). Which iterate is needed is allowed to 
depend on the point (see Hyperbolic Billiards). 

It is good to keep in mind that a hyperbolic 
measure may be concentrated on a single point, say, 
in which case there is not much gained by this 
approach. The theory is of great interest, however, if 
the measure is equivalent to volume or is the 
“physical measure” on an attractor. 

Examples of this sort are fairly common. Indeed,
any smooth compact Riemannian manifold other
than the unit circle admits a volume-preserving
Bernoulli diffeomorphism with nonzero Lyapunov
exponents (Dolgopyat and Pesin 2002), and every
compact smooth Riemannian manifold of dimension
at least 3 carries a volume-preserving Bernoulli flow
for which at almost every point the only zero
Lyapunov exponent is the one in the flow direction
(Hu et al. 2004).

Structurally, these systems exhibit many of the 
features seen in uniformly hyperbolic ones (e.g., 
stable manifolds), but instead of being continuous 
these are now measurable. There are, however, 
(noninvariant) sets of arbitrarily large measure on 
which these structures are continuous. This provides 
a handle for pushing some of the uniform theory to 
this context. 

There are some topological results in this area, of 
which one of the more remarkable ones is that any 
surface diffeomorphism with positive entropy con- 
tains a horseshoe. Much of the current research is
directed at the ergodic theory of these systems. A
central result from the initial development of the 
theory is that while these systems may not be 
ergodic, the ergodic components are (a.e. equal to) 
open sets, so in particular there are at most 
countably many of them. 

One natural question is whether nonuniformly 
hyperbolic systems have SRB measures, and it is 
answered on a case-by-case basis. There are even 
benign examples where this fails to be the case, but 
for some realistic systems, such as the Lorenz and 
Hénon attractors, this has been established. 

Because they preserve volume, this is not an issue
for billiard systems (see Hyperbolic Billiards), that
is, the free motion of a point mass in a cavity with 
elastic boundary collisions. This describes not just a 
toy model, but also the phase space and dynamics of 
a gas of convex rigid bodies. Such a gas of hard 
spheres in a rectangular box is semidispersing and 
has been studied intensely. It is now known to be 
hyperbolic and hoped to be ergodic. (The latter 
would provide a solid foundation for statistical 
mechanics, at least for the case of spherical 
molecules.) A gas of nonspherical convex rigid 
bodies is also a point billiard, but it is not 
semidispersing, which puts it beyond the range of 
readily available techniques for establishing 
ergodicity. 


Further Remarks 


The historical remarks made here are significantly 
expanded in Hasselblatt (2002), which contains 
some references to yet more detailed sources as 
well as more detail about uniformly hyperbolic 
dynamical systems in a concise form. A concise but 
reasonably comprehensive and current account of 
partially hyperbolic dynamics is in Hasselblatt and 
Pesin, and an authoritative full presentation is in 
Pesin (2004). A survey of nonuniformly hyperbolic 
dynamics is given in Barreira and Pesin (2006), and 
the definitive treatment is given by Barreira et al. A
textbook presentation of (not only) hyperbolic 
dynamics is in Katok and Hasselblatt (1995) as 
well as Hasselblatt and Katok (2003), and much 
current research, including on all subjects discussed 
here, is surveyed in Handbook. 
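The world-famous mathematician, philosopher, and logician Frege once gave a famous equation: half a mathematician + half a philosopher = a good philosopher + a good mathematician. He explained: "A good mathematician is at least half a philosopher; a good philosopher is at least half a mathematician."

The purpose of this book is to replace the philosopher in the equation above with a physicist.

Two examples I have just read show that physicists also contribute to mathematics. While studying a problem in statistical mechanics, the physicists Tsung-Dao Lee and Chen-Ning Yang encountered a set $\mathcal{P}$ of special polynomials

$$P(z) = \sum_{j=0}^{m} a_j z^j .$$

They were able to work out that all roots of any polynomial P in $\mathcal{P}$ lie on the unit circle $\{z : |z| = 1\}$ of the complex plane for small m. They therefore conjectured that this holds for all P in $\mathcal{P}$. If they could find a unitary matrix U such that P(z) is the characteristic polynomial of U, that is, $P(z) = \det(zI - U)$, the conjecture would be proved. This is what anyone who has studied higher mathematics would think of, but the method does not work here. Yang and Lee had a good mathematical background and so found a proof, but the proof was not simple. Easier proofs are now available, thanks in particular to Taro Asano. To prove the Lee-Yang circle theorem (stated below), we need to replace the polynomial P of degree m in the single variable z by a polynomial $Q(z_1, \dots, z_m)$ in m variables that is of degree one in each variable $z_i$. We are interested in the set $\mathcal{Q}$ of polynomials $Q(z_1, \dots, z_m)$ such that $|z_1| < 1, \dots, |z_m| < 1$ implies $Q(z_1, \dots, z_m) \ne 0$. Hence, if $P(z) = Q(z, \dots, z)$ with Q in $\mathcal{Q}$, the roots $\xi$ of P satisfy $|\xi| \ge 1$. (In the cases of interest there is a symmetry $z \to z^{-1}$, so that also $|\xi| \le 1$ and therefore $|\xi| = 1$.) Clearly, if $Q(z_1, \dots, z_m)$ and $\tilde{Q}(z_{m+1}, \dots, z_{m+n})$ are in $\mathcal{Q}$, then the product

$$Q(z_1, \dots, z_m)\,\tilde{Q}(z_{m+1}, \dots, z_{m+n})$$

is also in $\mathcal{Q}$. We now describe a less obvious operation, called Asano contraction, which turns polynomials in $\mathcal{Q}$ into polynomials in $\mathcal{Q}$. Write

$$Q(z_1, \dots, z_m) = A z_j z_k + B z_j + C z_k + D,$$

where A, B, C, D are polynomials in the remaining m - 2 variables among $z_1, \dots, z_m$ other than $z_j$, $z_k$. Asano contraction replaces the two variables $z_j$, $z_k$ by a single variable $z_{jk}$, so that

$$A z_j z_k + B z_j + C z_k + D \;\to\; A z_{jk} + D .$$

Starting from a polynomial Q in m variables, one Asano contraction yields a polynomial in m - 1 variables, and if the original polynomial is in $\mathcal{Q}$, so is the new one. (This is a simple exercise: the root of $A z_{jk} + D$ is minus the product of the two roots of $A z^2 + (B + C) z + D$.) One can check that if $-1 \le a_{jk} \le 1$, then the polynomial in the two variables $z_j$, $z_k$ of the form

$$z_j z_k + a_{jk}(z_j + z_k) + 1$$

is also in $\mathcal{Q}$. (Setting the polynomial equal to zero gives a map $z_j \to z_k$; this map is an involution and sends the interior of the unit circle to its exterior.) Multiplying such polynomials one after another, performing an Asano contraction whenever the same variable appears twice, and finally setting all variables equal to z, we obtain the Lee-Yang circle theorem: for real numbers $a_{jk} = a_{kj}$ with $-1 \le a_{jk} \le 1$, all roots of

$$P(z) = \sum_{X \subset \{1, \dots, m\}} z^{|X|} \prod_{j \in X} \prod_{k \notin X} a_{jk}$$

lie on the unit circle.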

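Another example is the physicist Zhang Zongsui. Zhang entered quantum field theory mainly under the influence of N. Bohr. The correspondence between the two shows Zhang's preference for theoretical research, and within theoretical research he had a marked mathematical bent. His work was characterized by strong mathematical technique and by skill in applying mathematics to analyze problems of theoretical physics. In physics he advocated doing more work on group theory and symmetry. The mathematical computations and presentation in his results were quite "clear, crisp, and reliable," and his conclusions were concise and accurate. A congratulatory letter in Shuxue Yilin for the hundredth birthday of Mr. Tian Fangzeng notes that functional analysis at the Institute of Mathematics of the Chinese Academy of Sciences developed almost from the beginning with equal emphasis on basic theory and applications. In the spirit of the national science plan, from 1958 on the functional analysis group at the Institute stressed connections with differential equations, physics, advanced technology, and national economic construction. To this end, Tian Fangzeng and Guan Zhaozhi were often in contact with Wu Xinmou and others, so that functional analysis at the Institute always kept in view its links with differential equations and modern mathematical physics, and they successively organized academic activities on mathematical problems in quantum field theory, particle transport theory, and electromagnetic wave theory. The papers he wrote made important contributions to the development of mathematical research in this area in China. Together with Guan Zhaozhi, Tian Fangzeng successfully opened up in China an important area of applied functional analysis: the mathematical foundations and problems of particle transport theory.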

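Mathematics and physics are thus highly interchangeable. Some mathematicians later became physicists (for example Freeman Dyson), while others went the opposite way (for example Harish-Chandra and Raoul Bott), turning from physicists into mathematicians. The most striking case is Edward Witten (born 1951), the theoretical physicist who won the Fields Medal in 1990. Witten received his doctorate in physics at Princeton University in 1976 under David Gross (Nobel laureate, 2004), but he never obtained a doctorate in mathematics.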

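So how much mathematics should one master in order to study physics?

A student devoted to theoretical physics once asked Academician Hao Bailin how to go about his studies. Mr. Hao said: "To do theoretical physics, your mathematics must first of all be good. In the first two years, work through Smirnov's five volumes as well as the calculus of variations, differential geometry, the methods of mathematical physics, topology, and integration; then move on to modern mathematics and study manifolds, groups, continuous groups, Lie groups, modern differential geometry, and so on."

Of course, this is only entry-level mathematics.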

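This series looks like physics but is in fact filled with modern mathematics, just as Wu Yueliang, research professor at the Institute of Theoretical Physics of the Chinese Academy of Sciences, has commented:

The physics and the mathematics in this book are hard to separate. In fact, many physical problems in classical mechanics, electromagnetism, statistical mechanics, quantum mechanics, fluid mechanics, integrable systems, and dynamical systems reduce to solving equations of mathematical physics: ordinary differential equations, partial differential equations, integral equations, and integro-differential equations. The solutions of these physical problems involve functions of a complex variable, special functions, and many other kinds of functions, and finding them calls on variational techniques, harmonic analysis, functional analysis, and other analytic methods. Likewise, Einstein's special and general relativity not only changed our view of space and time but also made the geometry of Minkowski space-time and of Riemannian spaces the mathematical foundation of physical theory, and made vector analysis, tensor analysis, and differential geometry indispensable analytical tools. In quantum mechanics, physical quantities become operators, physical states are described by wave functions, and the spectra of operators are what is actually measured; in quantum field theory, the wave functions are in turn second-quantized into operators that describe the creation and annihilation of elementary particles in the course of interactions. This makes operator algebras, quantization methods, path integrals, and related theories and methods the mathematical foundation of quantum physics. Particle physicists have found that three fundamental forces of nature, the electromagnetic, weak, and strong interactions, can be described by gauge theories and are governed entirely by gauge symmetries, which are described mathematically by Lie groups and Lie algebras. Crystal structure, too, is described by rotation groups in Euclidean space, so that applications of group theory in physics, and in particle physics especially, have become more and more important. In gauge theory the gauge potential is taken as the basic quantum field, and it turned out to be exactly the connection on a fiber bundle studied by mathematicians in modern differential geometry. As a result, topological invariants of fiber bundles have become important in particle physics and quantum field theory, for instance in the magnetic monopole and instanton solutions of gauge fields and in chiral quantum anomalies. Research on quantum gravity and superstring theory not only uses existing mathematical theories and methods, modern mathematics in particular, but has also advanced the development of mathematics itself. Similarly, in condensed matter and optics, topological phases and topological defects of matter, topological quantum computation, and related topics apply many modern mathematical methods, so that algebraic topology, algebraic methods, quantum groups, complex geometry, symplectic geometry and topology, low-dimensional geometry, noncommutative geometry, and other mathematical theories and methods increasingly permeate research in theoretical physics. In addition, in the study of the randomness of microscopic physical objects and the statistical laws of various stochastic processes, of disordered systems, and of dynamical systems, stochastic methods and discrete mathematics are finding ever wider application.

① See T. D. Lee and C. N. Yang, "Statistical Theory of Equations of State and Phase Transitions. II. Lattice Gas and Ising Model", Phys. Rev. (2) 87 (1952), 410-419; and T. Asano, "Theorems on the partition functions of the Heisenberg ferromagnets", J. Phys. Soc. Japan 29 (1970), 350-359. I have long been fascinated by the Lee-Yang circle theorem (see D. Ruelle, "Extension of the Lee-Yang circle theorem", Phys. Rev. Lett. 26 (1971), 303-304), and I believe that there are still mysteries in this area waiting to be uncovered. (In 2010 Ruelle published another paper on the Lee-Yang circle theorem; see "Characterization of Lee-Yang polynomials", Annals of Mathematics 171 (2010), 589-603. Translator's note.)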

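How great is the influence of mathematics on physics?

As the foreword of this book puts it:

Mathematics did exist, of course. Indeed, there was one aspect of our physical world that was recognised to be controlled by precise, mathematical logic: the geometric structure of space, elaborated to become a genuine form of art by the ancient Greeks. From my perspective, the Greeks were the first practitioners of 'mathematical physics', when they discovered that all geometric features of space could be reduced to a small number of axioms. Today, these would be called 'fundamental laws of physics'. The fact that the flow of time could be addressed with similar exactitude, and that it could be handled geometrically together with space, was only recognised much later. And, yes, there were a few crazy people who were interested in the magic of numbers, but the real world around us seemed to contain so much more that was way beyond our capacities of analysis.

Gradually, all this changed. The Moon and the planets appeared to follow geometrical laws. Galilei and Newton managed to identify their logical rules of motion, and by noting that the concept of mass could be applied to things in the sky just like apples and cannon balls on Earth, they made the sky a little bit more accessible to us. Electricity, magnetism, light and sound were also found to behave in complete accordance with mathematical equations.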

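Scientists hold that deeper study of "mathematical physics" helps to reveal the intrinsic connections between physics and mathematics. In fact, in the development from natural philosophy to physics, mathematics played an irreplaceable role alongside experimental methods and new ways of thinking. When people analyze large amounts of experimental data, absorb the essence of various phenomenological theories, and describe in rigorous mathematical language and concise formulas the physical laws that govern the basic structure of matter and the evolution of the universe, the beauty of simplicity, of unity, and of symmetry and asymmetry in physics is reflected through a profound mathematical beauty. One may say that ever since physics became an independent natural science, its relationship with mathematics has been inseparable. Many scientists of antiquity were both mathematicians and physicists; in modern and contemporary times especially, many theoretical physicists have played an even more active part in applying and developing mathematics, and collaboration between mathematicians and theoretical physicists has become ever more frequent and deep. They have become practitioners of "mathematical physics." The best-known example is Archimedes of ancient Greece, both a famous mathematician and a famous physicist, who very early used mathematics as a tool to prove the law of the lever and the principle of buoyancy, and who carried out a great many experiments. Newton developed a new mathematical method, the calculus, while studying the laws of motion of bodies and of celestial objects. Einstein used mathematical methods that were entirely new to the physicists of his time, differential geometry and Riemannian geometry, to create general relativity. He once recalled: "In 1912 I suddenly realized that Gauss's theory of surfaces was the key to unlocking this secret. His surface coordinates were of great significance, but at that time I did not yet know that Riemann had studied the foundations of geometry even more deeply. I suddenly remembered that the geometry course Geiser gave us at university had included Gauss's theory ... I realized that the foundations of geometry have physical significance. When I returned from Prague to Zurich, my dear friend the mathematician Grossmann was also in Zurich. He told me about Gauss, and then ..."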

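A further case is Hermann Grassmann's Lineale Ausdehnungslehre (Linear Theory of Extension), published in 1844. Only after a series of similar ideas had appeared in other books and articles was it recognized that these ideas came from Grassmann's book, but by then it was too late. If you want a taste of its abstract style, you need only look at the titles of a few chapters of the book, such as: "Derivation of the Concept of Pure Mathematics", "Deduction of the Theory of Extension", "Exposition of the Theory of Extension", "Form of Presentation", "Survey of the General Theory of Forms". Only after laboriously working through this material does one reach the purely abstract presentation of the actual subject, which is still very hard to read. Not until the later revised edition of 1862 ② did Grassmann adopt a more accessible presentation, namely one in coordinates. In addition, Grassmann chose the word Ausdehnungslehre (theory of extension) to indicate that his investigations apply to spaces of arbitrary dimension, geometry being for him merely the application of this entirely abstract new discipline to ordinary three-dimensional space. But the new word he coined did not take root; today people simply speak of "n-dimensional geometry."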

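We ordinary readers may easily confuse mathematical physics with the equations of mathematical physics. These are in fact two concepts that differ in both intension and extension: the latter can only be regarded as a proper subset of the former, and the former far exceeds the latter in both content and scope. What they have in common is that their problems originate in physics while the solutions come from mathematicians. Take the resolution of the Dirichlet conjecture. From the day it was put forward, the mathematical conjecture known as the "Dirichlet principle" went through more than thirty years of fierce debate and reversals before it was finally established. It is a conjecture that Dirichlet proposed while studying the potential theory of differential equations. Put simply, its content is roughly this: the function that minimizes the Dirichlet integral satisfies the potential equation.

Later, in studying the three-dimensional potential equation (also called the Laplace equation or the harmonic equation)

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = 0,$$

it was further argued that the physical state described by the potential equation always has a definite physical solution, so that a mathematical solution must also exist. For a long time, however, this mathematical existence could not be proved. Only in 1851 did Riemann, in his doctoral dissertation "Foundations for a General Theory of Functions of a Complex Variable", give an existence proof for the solution of the boundary value problem for the potential equation. Because Riemann used in it the conjecture proposed by his teacher Dirichlet, he called it the "Dirichlet principle." Not long after the dissertation appeared, however, the principle provoked heated discussion. In particular, Riemann's proof was sharply criticized by the famous German mathematician K. W. Weierstrass (1815-1897), who pointed out that Riemann had assumed a priori, without proof, that there must exist a function at which the integral attains its minimum, which is not permissible in mathematics. Despite this criticism from a master, Riemann's confidence in the Dirichlet principle was not shaken, and he went on to use the principle to make a series of important discoveries. Riemann died young in 1866, but the dispute over the validity of the Dirichlet principle did not end. In 1870 Weierstrass gave a counterexample to the Dirichlet principle, in which, for the given boundary conditions, no function exists that minimizes the Dirichlet integral, and he used it to reject the principle. Since the Dirichlet principle had been rejected by Weierstrass, the mathematical authority of the day, mathematicians had to find other ways to prove the existence of solutions to the boundary value problem for the potential equation. Three proofs are particularly well known: in 1870 Neumann gave a proof by the "method of arithmetic means"; in 1890 Schwarz gave another by the "alternating method"; and in the same year Poincaré gave one by the "method of sweeping out." These proofs are no doubt all logically correct, but none of them is as simple and clear as an argument that uses the Dirichlet principle as its tool. This made mathematicians nostalgic for the "Dirichlet principle": they regretted that it had been rejected, conceived the idea of reviving it, and made some efforts to that end, but unfortunately none succeeded. A pessimistic mood spread through the mathematical community. The mathematician Neumann remarked that the Dirichlet principle, so beautiful and with such broad prospects of application, had "forever disappeared" from our sight.

① Hermann Grassmann, Die Ausdehnungslehre, published in Leipzig in 1844. See also his Gesammelte mathematische und physikalische Werke, vol. 1, Leipzig, 1894; the second edition was published in Leipzig in 1898.
② Berlin, 1862. See his collected works, vol. 1, part 2, Leipzig, 1896.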

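As the saying goes, "thirty years east of the river, thirty years west." In 1899, thirty years after the Dirichlet principle had been rejected, Hilbert, the leading German mathematician, launched a new "rescue campaign" on its behalf. He broke completely with the traditional notion that rigor and simplicity are opposed, criticized Weierstrass for rejecting the Dirichlet principle wholesale in the name of rigor, and, starting from the simplicity and beauty of the principle and the effectiveness of its applications, actively sought out what was true and reasonable in it. In the end he found a way to prove the Dirichlet principle. He reported this result to the German Mathematical Society and stated explicitly that, provided suitable restrictions are imposed on the domain, the boundary values, and the admissible functions of the problem, the validity of the Dirichlet principle can be fully restored. In response to the view among mathematicians that the Dirichlet principle had long since sunk, he pointedly called this work "the resurrection of the Dirichlet principle." Hilbert later gave a more general proof, further confirming the soundness of the principle.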

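In more recent times, still more mathematical theories originating in physics have been abstracted, and further study of these theories has in turn greatly advanced physics. For example, direct connections have been established between the global properties of Yang-Mills gauge fields and chiral quantum anomalies on the one hand and the topological invariants of fiber bundles, Chern-Simons characteristic classes, and the index theorem on the other, and there is a correspondence between the extra dimensions of superstring theory and Calabi-Yau spaces. The theoretical physicist Witten, while developing superstring theory, won the Fields Medal for his outstanding contributions to mathematics. These are all classic examples of the combination of physics and mathematics in "mathematical physics."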

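Mathematicians in our country recognized this clearly long ago. In the 1980s Li Daqian wrote that for students of mathematics to pursue an ever purer ideal is not necessarily a broad road, even from the standpoint of the development of pure mathematics. Without attention to practical needs and to developments in other fields, and without a broad outlook, it is very hard to produce first-rate talent in basic theory.

Foundations and applications are closely related and reinforce each other. When those who work on basic theory value education and training in applications, research in both basic theory and applications benefits greatly. The gauge fields of physics and the fiber bundles of mathematics are closely connected. According to Professor Chen-Ning Yang himself, he consulted many mathematicians working on fiber bundles in the United States, but he could not understand the way they talked, and the two sides never managed to connect. Only at Fudan University, where Professor Gu Chaohao explained the relationship between the two very clearly in language that physicists could accept, was Professor Yang delighted. He went on to collaborate with Professor Gu, and together they achieved a great deal in the mathematical theory of gauge fields and developed the theory further. Why was this possible? As an undergraduate, Professor Gu had taken the physics department's courses on the four branches of theoretical mechanics. As a mathematician he is not only highly accomplished in mathematics but also well trained in physics.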

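From the table of contents one can see that the book covers a very wide range of mathematics. The sections are: Introductory Articles on Mathematical Physics, Classical Mechanics, Fluid Dynamics, Integrable Systems, Classical Field Theory, Conformal and Topological Field Theory, Quantum Field Theory, General Relativity, Quantum Gravity, String Theory and M-Theory, Condensed Matter and Optics, Quantum Information and Computation, Quantum Mechanics, Disordered Systems, Dynamical Systems, Equilibrium Statistical Mechanics and Nonequilibrium Statistical Mechanics, Algebraic Techniques, Lie Groups and Lie Algebras, Discrete Mathematics, Quantum Groups, Stochastic Methods, Complex Geometry, Differential Geometry, Low-Dimensional Geometry, Noncommutative Geometry, Algebraic Topology, Symplectic Geometry and Topology, Ordinary and Partial Differential Equations, Functional Analysis and Operator Algebras, Quantization Methods and Path Integration, and Variational Techniques.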

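The three editors of the book write in their preface: "Mathematical physics brings together the strengths of the two great disciplines of mathematics and physics, and their relationship is one of joint development. On the one hand, it uses mathematics as a tool to organize physical concepts of ever-increasing precision and complexity; on the other hand, physicists provide mathematicians with a source of inspiration." At the same time, as Professor Gerard 't Hooft of Utrecht University in the Netherlands, a Nobel laureate in physics, points out in the foreword, there are clear and important differences between the physical world and the mathematical world. The physical world insists on the "truth" of the facts, whatever that "truth" may be, whereas mathematics is the world of pure logic and pure reasoning. In physics, whether a theory is accepted is decided in the end by experiment. The methodology of physics also differs from that of mathematics.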

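One example that many readers follow closely is the question of whether the astrophysicist Hawking has fully resolved the black hole firewall paradox. At least for now there is no verdict; his work can only be counted as a third possible explanation. Although the specific properties of black holes are not yet fully understood, their existence as compact objects has long been beyond dispute, and the heart of the firewall paradox still lies in the conflict between quantum mechanics and general relativity. Quantum mechanics describes the horizon of a black hole as a mysterious firewall of enormous energy, whereas general relativity refuses to admit that any such firewall exists in the universe and regards the horizon as no more than a mathematical entity. To truly resolve the firewall paradox, therefore, humanity needs a deeper understanding of nature. Hawking himself acknowledged that a real understanding of how matter and information eventually escape from a black hole ultimately requires unifying gravity with the other forces of nature, a problem that has troubled physicists for nearly a century and remains unsolved. Of the two cornerstones of modern civilization, general relativity describes the universe in an elegant mathematical form and is at present thought to be understood deeply enough, while quantum mechanics describes the microscopic world in probabilistic form, and its meaning and basic laws remain unknown; even Niels Bohr, a founder of quantum mechanics, said that "nobody understands quantum mechanics." The firewall paradox is where these two theories clash in the depths of the universe, and the outcome of that clash still cannot be predicted.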

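When this book was first introduced into China, it appeared as a 12-volume hardcover edition. Dividing it up by content was an innovation, though such things are common in publishing.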

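The Chronicle of Mao Zedong (1949-1976), compiled by the Party Literature Research Office (Central Party Literature Press, 2014), runs to six imposing volumes and is a major work that readers had long awaited. As the saying goes, the less worthy notice the small things, so here I excerpt only a little about bookbinding. On August 14, 1965, Mao Zedong gave Zhou Yang the following instruction on printing a batch of large-print editions of Marxist-Leninist classics: "I agree with the method of photographic enlargement and offset printing. But please note that the covers should not use hard board; large books (for example Materialism and Empirio-criticism and Anti-Dühring), which used to be bound as one or two volumes, should now be split into 4 or 8 volumes so that each volume weighs less." Large print was wanted because older comrades had poor eyesight. No hard board for the covers meant no hardcover binding, which is awkward to hold in one hand or to read lying down. Thicker books were to be split into more volumes (in fact, both of the books Mao named run to fewer than 500 pages). Taken together, Mao's requirements for the large-print editions were all centered on the reader and aimed at ease of reading. Some say that in binding, today's publishing world favors large formats, thick volumes, and perfect binding, prizing the clumsy, the big, the dark, and the coarse; this spirit of deliberately making things hard for the reader is truly baffling.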

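Even so, I believe this book unquestionably counts as a classic of mathematical physics. There are many ways to pay tribute to a classic, and the most traditional and most effective is to preserve it in its original form. We had originally planned to reproduce even the cover of the original edition, and only changed it to its present appearance after negotiating with the rights agent. Anything truly fine is such that adding a little makes it too much and taking a little away makes it too little; it was just right to begin with, so why spoil it? Are we really so confident that we could make it better? Smearing dung on a Buddha's head or tacking a dog's tail onto sable would only draw readers' ridicule.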

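There is another reason we had to keep the original form: the enormous labor of translation. Our press, Harbin Institute of Technology Press, is located in the north, far from the centers of economy and culture, and we simply do not have the capacity to assemble a large translation team and spend heavily over many years polishing this set. We will wait until we are stronger and then purchase the Chinese-language rights to fulfill this long-held wish. When buying the rights we also expressed interest in the digital rights, but this was politely declined, because the foreign publisher's digital edition in English is already very mature, whereas ours is just starting out, and because once content is broken into fragments it also faces problems of copyright protection. In dictionary publishing this is a chronic ailment. Here is an example: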

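Do you recognize the English word esquivalience? No? Then look it up in the new edition of the New Oxford American Dictionary, which will tell you that it means the willful avoidance of one's official responsibilities, that it first appeared in the 19th century, and that it perhaps derives from the French esquiver, to dodge or slink away.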

But if you pick up any other dictionary from your desk at home, or type the word into any electronic dictionary, I guarantee that you will not find it however hard you look. And if you do find it [...]

Why is that? Because the word was simply invented by the editorial staff of the New Oxford American Dictionary; it does not exist. What? A dictionary contains made-up words? How can dictionary makers do such a thing?

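Made-up words are not unique to the New Oxford American Dictionary. Practically every dictionary hides words of this kind, created out of thin air. They are put there not out of editorial mischief or malice but for a concrete purpose.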

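They are an important device for protecting copyright. After the hard work of compiling a hefty dictionary, how do you stop others from taking a shortcut, cutting and pasting from yours, and dressing the result up as their own? Words are common property, and definitions do not differ much from one dictionary to the next, so how can you prove that someone else's dictionary has plagiarized or stolen your content?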

If the word esquivalience turns up in any dictionary other than the New Oxford American Dictionary, plagiarism or theft must be involved. The word is there precisely to catch copying [...]

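Publishing is in a slump worldwide at present, especially print publishing, and the chill is felt most keenly in China. Experts of every stripe have offered different analyses of the causes, but only one answer has convinced the industry: the lack of high-quality content. In the end, publishing is an industry in which content is king; without good content, everything else is a tree without roots.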

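One writer has said: mediocrity is the danger of this age, which can no longer absorb traditional knowledge. Modern life is chaotic and drowns people in obscurity. Everything falls into shallow water and nothing sinks into the deep well; everything is gossip, everything is rumor.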

We should have the courage to admit a basic fact: in this mediocre age, the worst have all survived and the best have died, and those of us who managed to escape cannot realize our true worth. What, then, can we still do in this mediocre age?

The Encyclopedia of Mathematical Physics, published by Elsevier in June 2006, is an extraordinary encyclopedia that presents the knowledge of mathematical physics comprehensively.

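The three editors (Professor Jean-Pierre Françoise of Université Pierre et Marie Curie in Paris, France; Professor Gregory L. Naber of Drexel University in Philadelphia, USA; and Dr. Tsou Sheung Tsun of the University of Oxford, UK) are all well-known scholars with long careers in mathematical physics. They invited 34 distinguished physicists and mathematicians, including the Nobel laureate in physics Professor Chen-Ning Yang and Professor Roger Penrose of the University of Oxford, to serve on the editorial advisory board, and brought together 439 theoretical physicists and mathematicians from 30 countries, all of whom have made outstanding contributions in physics, mathematics, and related fields, to write more than 400 richly illustrated survey articles.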

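The Encyclopedia of Mathematical Physics took four years to complete. It is comprehensive and systematic in content and broad in coverage, with a distinctive character: it reflects the foundational, independent, and self-contained nature of the discipline while also attending to its frontiers, its interdisciplinary reach, and its applications. It is the most up-to-date and most complete encyclopedia in mathematical physics today.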

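The book touches almost every important research area of physics and mathematics. Its core areas and directions run from classical mechanics to quantum mechanics, classical field theory to quantum field theory, conformal field theory to topological field theory, fluid dynamics to dynamical systems, integrable systems to disordered systems, particle physics to astrophysics and cosmology, relativity to quantum gravity, gauge theory to unified theories, equilibrium to nonequilibrium statistical mechanics, condensed matter to quantum information, variational techniques to algebraic methods, functional analysis to operator algebras, path integrals to stochastic methods, Lie groups to quantum groups, differential geometry to algebraic topology, low-dimensional geometry to noncommutative geometry, and complex geometry to symplectic geometry. The book also pays particular attention to the latest research results in mathematical physics and their newest applications in each field, and it supplies a large number of necessary and important references.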

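Compared with ordinary encyclopedias, one obvious highlight of this book is its survey articles, which can tell you everything you want to know about a given topic. Hao Bailin, a member of the Chinese Academy of Sciences, once studied at Kharkov University. He recalls that the examination there was conducted by the mathematical physics professor A. Ya. Povzner, who set the question: "Tell me everything you have learned about Bessel functions since the day you were born." According to his students, Hao filled a thick stack of paper with dense writing and then told Povzner: "This is an outline of what I know about Bessel functions. If required, I can expand on the details of each item." And so he passed the examination.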

As Gerard 't Hooft points out:

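The interdisciplinary field of mathematical physics is very hard to grasp. Some of the topics in this encyclopedia are purely physical: high-Tc superconductivity, breaking water waves, and magnetohydrodynamics are entirely topics of physics, where experimental data are more decisive than any lofty theory. Cohomology theory, Donaldson-Witten theory, and the AdS/CFT correspondence, however, are examples of pure mathematics.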

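In the course of editing, a large number of short articles by different authors were inevitably adjusted as appropriate. In this encyclopedia, theoretical physicists and mathematicians give simple and clear accounts of many important topics in advanced mathematical physics. Every article includes references for further reading. We hope that these efforts will be well rewarded.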

The editors of the book write:

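Compared with the ancient histories of mathematics and physics in the narrow sense, mathematical physics is a relatively new independent discipline. The International Association of Mathematical Physics was founded in 1976. Of course, mathematics and physics have influenced each other since ancient times; but in recent decades, perhaps because we are living through them, they have made tremendous progress, with new results and viewpoints appearing at a dizzying pace, so that an encyclopedia is needed to collect and organize this knowledge.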

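Mathematical physics brings together the strengths of the two great disciplines of mathematics and physics, and the relationship between them is one of joint development. On the one hand, it uses mathematics as a tool to organize physical ideas of ever-increasing precision and complexity; on the other, physicists provide mathematicians with a source of inspiration. The classic example of this relationship is Einstein's theory of relativity, in which differential geometry played an essential role in formulating the physical theory, while the questions that physics went on to raise drove the development of differential geometry. By coincidence, we were writing the preface to the Encyclopedia of Mathematical Physics in the centenary of Einstein's miracle year.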

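After much consideration, we recognized that producing the Encyclopedia of Mathematical Physics would be a formidable project. Had we not been convinced that it was a worthwhile project of benefit to society, and that we would receive wide support, we would never have taken on the task. We did indeed receive a great deal of support, including advice, encouragement, and practical help, from the members of the editorial advisory board, from our authors, and from others who generously gave their time to help us improve the encyclopedia.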

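Mathematical physics is a fairly new discipline; it has not yet been clearly delineated, and different people understand it differently. Some of the topics we chose follow the programs of recent International Congresses on Mathematical Physics, but most follow the suggestions of the editorial advisory board and the authors. Constrained by time and space, and by our own limitations, we altered some overly long topics, but we tried to include the subjects we regard as central and to cover as many of the most active areas as possible.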

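In recent years there has been some negative news in China about the book's original publisher. It stems from a website in the United States called "The Cost of Knowledge", on which 12,196 scientists worldwide have signed a boycott of this publisher, the largest in the world. Some have called the movement the "Academic Spring".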

The one who sounded the call was the renowned British mathematician William Timothy Gowers. This Fields Medalist published a blog post urging his colleagues to act and boycott the Elsevier group, the world's largest publisher.

Tyler Neylon, a mathematics PhD who now runs a company in Silicon Valley, read the post and immediately left a comment for Professor Gowers. The next day he set up a website and named it "The Cost of Knowledge".

Neylon later recalled that on reading the post he realized something could be done. In his view, Gowers was a "superstar" with the power to rally people.

To date, more than ten thousand scientists have signed on Neylon's website. They have pledged not to publish papers in Elsevier journals, not to referee for them, or not to serve [...]

Even so, we chose to work with Elsevier, because a good large encyclopedia is so hard to come by.

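The France-based master pianist Kun-Woo Paik is extremely exacting about pianos. In a conversation with the Taiwanese publisher Hao Mingyi, he said that in a professional performing career of more than half a century, he had come across no more than five pianos that satisfied him. The answer astonished even the worldly Mr. Hao.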

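In mathematics and physics, the good large encyclopedias brought into China in recent years likewise number no more than five. The five-volume Encyclopedia of Mathematics from the former Soviet Union is one, and the Iwanami Encyclopedia of Mathematics from Japan is another; in short, they can be counted on one's fingers.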

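In fact, this project was not initiated by Elsevier. We are told that it began at Academic Press and was later taken over by Elsevier, whose enthusiastic staff made the transition seamless. It is also touching that a considerable number of the authors generously donated their fees to the European Mathematical Society's Committee for Developing Countries. We should thank them for all they have done for developing countries.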

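As for the question that concerns us most, namely who will buy such a large set, we are full of optimism. The world is full of wonders, and every manner of purchase is possible. A while ago there were many reports about Hawking losing his bet on the existence of the "God particle".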

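Experiment showed that Hawking had lost the bet. He admitted that he had been fairly beaten and wished Higgs the Nobel Prize. Higgs revealed that after the discovery of the new particle was announced, Hawking contacted him and said the check was in the mail. Higgs said: "I am not the only one he is paying. I think he will also send 100 dollars to Kane at the University of Michigan."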

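Kane, the other winner of the bet, gladly accepted Hawking's dollars. "I was convinced that the Higgs boson would be found. Discovering the Higgs boson is truly [...]. It confirms a long-standing conjecture and further strengthens the factual basis of the Standard Model of particle physics. Winning the bet is the icing on the cake." Kane said he would spend the winnings where they count most, with all of the money going to research.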

Hawking may well have grown used to advancing the popularization of science by losing bets.

In 1975, Hawking made a bet on whether Cygnus X-1 contains a black hole. He later conceded and bought the winner a one-year subscription to Penthouse magazine.

In 1991, Hawking made another bet, this time on whether naked singularities exist, and he lost again.

The third bet came in 1997, when Hawking bet the American physicist John Preskill over whether black holes destroy all the information they swallow. On July 21, 2004, Hawking publicly conceded the bet and gave Preskill a baseball encyclopedia.

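The bet on the Higgs boson was his fourth. Over these thirty-odd years, with magazines, books, and a few dollars, Hawking has brought these frontier questions of science to a wider public. Behind the 100-dollar bet, both Higgs's foresight and Hawking's spirit deserve praise.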

We look forward to the next bet being settled with a set of encyclopedias like this one.

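When the renowned mechanics scholar Zhou Peiyuan turned 90, the faculty and students of Peking University honored him with the words "Devoted to science, educating the talented; meritorious to the nation, a blessing to the future; long-lived as Mounts Song and Tai, his virtue [...]". We venture to borrow these words to celebrate the publication of this set in China; that would not be going too far.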